Hi All,
VOCR v3.0.0-beta.4 is out with an exciting new feature called Computer Use.
You can now ask AI to control apps using mouse and keyboard commands.
Check out the demo where I put VOCR through a series of different UI tasks.
https://github.com/user-attachments/assets/c465d6e8-236c-4a93-980b-ef237f5c87ef
If you already have VOCR installed, you can update it by simply running Check for Updates.
If you are a new user, you can download the latest VOCR here: https://chigkim.github.io/VOCR/
You can read the release notes to learn more about all the changes: https://github.com/chigkim/VOCR/releases/
You will need an API key from OpenAI or Anthropic (for Claude) to use this feature. In my testing, Gemini and local models are not quite there yet.
It's not perfect, but I was able to perform simple tasks that were not accessible with VoiceOver.
I would love to hear what kinds of things you are able to do with it.
Hope you enjoy!
One more thing: I created a Computer Use add-on for NVDA users as well. To download the NVDA add-on, go to the releases page and search for Assets: https://github.com/chigkim/NVDAComputerUse/releases
Comments
Actually, cool!
This is super awesome! I just watched the demo, and I am thoroughly impressed.
How about Mistral AI?
Hi,
Thanks for sharing the demo. I wonder, can you use an API key from Mistral AI?
Dave