VOCR 2 alpha

By mr grieves, 13 January, 2024

Forum
macOS and Mac Apps

I noticed that VOCR 2 has an alpha out:
https://github.com/chigkim/VOCR/releases/tag/v2.0.0-alpha.10

It has a lot of new features. For starters, you can now change the keyboard shortcuts, which should be handy. I've always struggled with Cmd+Option+Shift+W.

And, somewhat inevitably, it now has some AI built into it. To use ChatGPT you must supply your own API Key, or if you have the stomach for it you can set up a local chatbot.

I think the idea is that you can then ask questions about the captured screen. Also, if you are in Finder, you can open an image with VOCR to have it escribe it.

If you've not used VOCR for a little while don't forget to turn screen curtain off! Which I always manage to do, and then I think it's broken until the penny finally drops.

The ChatGPT integration isn't working for me, but as it uses GPT 4 maybe it's because I'm only on the free plan. I'd be curious to know if anyone else has tried this and how it works?

Options

Comments

By a king in the north on Thursday, January 11, 2024 - 06:50

Hey, billing for the API is completely different from chatGPT. Go into your account and setup billing for the API if you want to use it.

By mr grieves on Thursday, January 11, 2024 - 06:50

Ah, thank you - I had no idea about that. I guess I had just always presumed that they were the same thing. That certainly explains why it's not working then as I've got no tokens left.

I'm not desperate to add my payment details to OpenAI right now, but I guess I could be persuaded if it's worth it.

Would be interested if anyone does give this a go.

By Quinton Williams on Thursday, January 25, 2024 - 06:50

Whenever I try the github link, it says page not found.
Is anyone else experiencing this?

By Ramy on Sunday, January 28, 2024 - 06:50

Here it is working ok

By mr grieves on Sunday, January 28, 2024 - 06:50

How does it compare to, say, Be My AI? Is it the same level of detail? I need to give it a try. The only problem I have with something like this is the privacy. Usually if I'm on my Mac, I would want to get descriptions for images sent to me at work (screenshots or whatever). But I'm worried about sending that sort of thing up to OpenAPI in case I am effectively releasing it into the wild to go roam free on its own.

By mr grieves on Sunday, January 28, 2024 - 06:50

I didn't think Be My AI was a straight OpenAI thing - I heard they had made some modifications or something. Not sure if that means they have their own model, or if it's simply that they tweaked the prompt that goes with the image. But I think the idea was that it was tailored to be useful to a blind person rather than the normal API which I guess is just generic.

I have an API key, thanks - just need to get my card details in so I can start putting 50p coins into the meter. (Apologies that line probably only makes sense in the UK)

When I get a few mins I'll try comparing this with Be My AI for the same image.

I was looking at writing a script to go through a whole batch of images and add captions from OpenAI. It looks easy to do but naturally Apple use a proprietary way to store captions rather than the standard way of embedding meta files in an image which put me off a little. But maybe the photos app isn't the ideal way for us to be trying to reason about images anyway. (The purpose of this was to make it possible to skim images without having to pass each one individually off, wait a bit, go back etc etc)

By mr grieves on Sunday, February 4, 2024 - 06:50

I haven't really been paying attention since my original post and noticed it is now at beta 2 but couldn't find the download link anywhere, so thank you for posting it above @Quinton Williams.

Glad to see a check for updates option in the menu now, so hopefully can avoid having to go to github in future.

By Ron s on Sunday, February 18, 2024 - 06:50

I have installed this but after obtaining my key how and where do I paste it into vocr?
Thanks Ron.

By mr grieves on Sunday, February 18, 2024 - 06:50

Open up the VOCR menu (under menu extras), then go down to settings and press enter, then go to Engine and press enter again. In here there is the option "OpenAI API Key …".

By Ron s on Sunday, February 18, 2024 - 06:50

Thank you mr grieves,.
I couldn't originally find it ehere, but now have following your instructions.
Thanks Ron..

By matt on Thursday, April 25, 2024 - 06:50

hey guys. So i've found the github page for VOCR but the readme is still for v1. Is there updated docs somewhere for v2 Beta2 or whatever we're up too?

By Ekaj on Saturday, May 4, 2024 - 06:50

I've had trouble aiming my phone in the exact direction of the information I need, such as nutrition facts. So I might give this a try again. Is there a proper menu system in place, etc?

By Tyler on Saturday, May 4, 2024 - 06:50

Member of the AppleVis Editorial Team

Rather than a mobile OCR app intended to identify and convey printed text, VOCR is a Mac app that attempts to extract and interpret onscreen text and elements that are not natively accessible with VoiceOver, in an effort to make completely inaccessible apps somewhat accessible.

By Yorkie on Wednesday, December 11, 2024 - 06:50

Hi I have just downloaded and installed the latest build of vocr but after scanning and reding with voiceover I just get the word object read back to me. Does anyone have any suggestions why this may be the case?