I noticed that VOCR 2 has an alpha out:
https://github.com/chigkim/VOCR/releases/tag/v2.0.0-alpha.10
It has a lot of new features. For starters, you can now change the keyboard shortcuts, which should be handy. I've always struggled with Cmd+Option+Shift+W.
And, somewhat inevitably, it now has some AI built into it. To use ChatGPT you must supply your own API Key, or if you have the stomach for it you can set up a local chatbot.
I think the idea is that you can then ask questions about the captured screen. Also, if you are in Finder, you can open an image with VOCR to have it escribe it.
If you've not used VOCR for a little while don't forget to turn screen curtain off! Which I always manage to do, and then I think it's broken until the penny finally drops.
The ChatGPT integration isn't working for me, but as it uses GPT 4 maybe it's because I'm only on the free plan. I'd be curious to know if anyone else has tried this and how it works?
Comments
openAI API
Hey, billing for the API is completely different from chatGPT. Go into your account and setup billing for the API if you want to use it.
Re: API pricing
Ah, thank you - I had no idea about that. I guess I had just always presumed that they were the same thing. That certainly explains why it's not working then as I've got no tokens left.
I'm not desperate to add my payment details to OpenAI right now, but I guess I could be persuaded if it's worth it.
Would be interested if anyone does give this a go.
link does not work
Whenever I try the github link, it says page not found.
Is anyone else experiencing this?
disregard my last comment
Apologies for my previous comment. I found the link for vocr 2.
Here is an updated download link.
https://github.com/chigkim/VOCR/releases/download/v2.0.0-beta.2/VOCR_v2.0.0-beta.2.zip
working here
Here it is working ok
How does the image recognition compare?
How does it compare to, say, Be My AI? Is it the same level of detail? I need to give it a try. The only problem I have with something like this is the privacy. Usually if I'm on my Mac, I would want to get descriptions for images sent to me at work (screenshots or whatever). But I'm worried about sending that sort of thing up to OpenAPI in case I am effectively releasing it into the wild to go roam free on its own.
Be My AI
I didn't think Be My AI was a straight OpenAI thing - I heard they had made some modifications or something. Not sure if that means they have their own model, or if it's simply that they tweaked the prompt that goes with the image. But I think the idea was that it was tailored to be useful to a blind person rather than the normal API which I guess is just generic.
I have an API key, thanks - just need to get my card details in so I can start putting 50p coins into the meter. (Apologies that line probably only makes sense in the UK)
When I get a few mins I'll try comparing this with Be My AI for the same image.
I was looking at writing a script to go through a whole batch of images and add captions from OpenAI. It looks easy to do but naturally Apple use a proprietary way to store captions rather than the standard way of embedding meta files in an image which put me off a little. But maybe the photos app isn't the ideal way for us to be trying to reason about images anyway. (The purpose of this was to make it possible to skim images without having to pass each one individually off, wait a bit, go back etc etc)
Thanks for the updated download link
I haven't really been paying attention since my original post and noticed it is now at beta 2 but couldn't find the download link anywhere, so thank you for posting it above @Quinton Williams.
Glad to see a check for updates option in the menu now, so hopefully can avoid having to go to github in future.
Open AI key - How to copy into vocr
I have installed this but after obtaining my key how and where do I paste it into vocr?
Thanks Ron.
Re: OpenAPI Key
Open up the VOCR menu (under menu extras), then go down to settings and press enter, then go to Engine and press enter again. In here there is the option "OpenAI API Key …".
Open AI key
Thank you mr grieves,.
I couldn't originally find it ehere, but now have following your instructions.
Thanks Ron..
hey guys. So i've found the…
hey guys. So i've found the github page for VOCR but the readme is still for v1. Is there updated docs somewhere for v2 Beta2 or whatever we're up too?
You can download at https:/…
You can download at https://github.com/chigkim/VOCR/releases/.
It's still in Beta, so the main page doesn't have it.
Will Have to Try This Out Again...
I've had trouble aiming my phone in the exact direction of the information I need, such as nutrition facts. So I might give this a try again. Is there a proper menu system in place, etc?
Rather than a mobile OCR app…
Rather than a mobile OCR app intended to identify and convey printed text, VOCR is a Mac app that attempts to extract and interpret onscreen text and elements that are not natively accessible with VoiceOver, in an effort to make completely inaccessible apps somewhat accessible.