Free OCR App Voice – MASSIVE Update after 2 years

By Shalin Shah, 14 June, 2019


Hey guys! MASSIVE Update to the Free OCR App Voice.

For those of you guys who don't know me or have never heard of Voice, basically here's the summary. Four years ago, when I was 15, I built an OCR app called Voice. I made it completely free, posted lots of updates about it, and tried keeping it up to date with the latest iOS software updates. I even tried building advanced features like a field of view report system, book mode, and image autocorrection, etc. to offer the best OCR reader experience to you guys for free (compared to very expensive alternatives like KNFB Reader). During those two years, I was able to become very close to the blind community and those were definitely some of the best experiences of my life. Thanks to you guys, I was able to learn a lot about the community as well as about app development.

Unfortunately, since I was still learning programming, the app itself was still very buggy and crashed a lot. I tried my best to give you guys the best features for free but my programming skills were not good enough at the time. As years went by and it was time for me to apply to colleges, I slowly stopped being able to support Voice and it slowly stopped working. Over the last year, I have received thousands of emails from you guys saying the app is completely broken.

Well, I have some awesome news. Over the past few months, Voice was completely remade from the ground up, with the best and latest technology. The OCR quality is spectacular, even on completely distorted, badly focused, and incorrectly angled photos with poor lighting. You can also talk to Voice now! So after opening the app, try saying the words "Take Picture." Voice will take a picture at your voice command. Then, instead of pressing the Read button, just say the word "read" and it will automatically start reading to you. There are only 2 voice commands right now, but I am including more in the next update. Voice can also give extremely accurate field of view reports when it detects a document in front of you. Lastly, it works in over 30 major languages and gives you the most powerful OCR reading tools right in your pocket.

After debating this with myself for a long time, I have decided to make the app free for only the next 24 hours. After that, it will be on the App Store for $4.99. I have always wanted to keep this app free for you guys, but the problem is that maintaining the app becomes very difficult when it is completely free. I would rather give you guys an extremely low-cost app that is the absolute best and regularly supported than a completely free app that is mediocre and unsupported. Those of you who install it while it is free will also get future updates for free. Let your friends know so they can also get it while it's free!

Here is the specific list of features:
1. New and improved OCR quality! Be amazed at how accurately the app will read your documents. You no longer need to worry about low lighting and bad focus; Voice corrects them automatically and gives you very high-quality readings.
2. Voice control! Now you can take a picture by simply saying the word "Capture" or "Take picture". Voice will literally snap a photo at your command. Then simply say the word "Read" and Voice will start reading it to you. Right now there are only 2 Voice commands but I will be adding more in future updates.
3. Smart field of view report! Just point your phone at a document. Voice will automatically tell you when all 4 corners of the document are visible. Works even when a page is folded into small squares–it finds the largest rectangle, so you will always be able to get an accurate sense of what's in front of the camera.
4. Book mode! Instead of taking one photo, just simply keep on taking photos. Then when you say "Read," Voice will be able to read them all to you one by one.
5. Built from the ground up to be Voice Over compatible to provide the best experience.
6. Powered with superpowered technology, Voice has better OCR quality than most other expensive paid services.
7. Automatic vertical and horizontal column detection for reading different columns in a newspaper.
8. Voice can process and read the text in over 30 major languages. You can also change the speaking rate. Just customize the language and speaking rate to your liking in the app Settings.
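
The field of view report in item 3 can be illustrated with a small sketch. This is not Voice's actual code; it only assumes that some upstream detector (hypothetical here) returns candidate four-corner shapes with estimated pixel coordinates, some of which may fall outside the camera frame. The function names and the messages other than "4 corners detected" are made up for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Quad = List[Point]  # four corners, listed in order around the shape


def quad_area(quad: Quad) -> float:
    """Area of a polygon using the shoelace formula."""
    total = 0.0
    for i in range(len(quad)):
        x1, y1 = quad[i]
        x2, y2 = quad[(i + 1) % len(quad)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def corner_visible(point: Point, width: float, height: float) -> bool:
    """True when a corner lies inside the camera frame."""
    x, y = point
    return 0 <= x <= width and 0 <= y <= height


def field_of_view_report(candidates: List[Quad], width: float, height: float) -> str:
    """Pick the largest candidate rectangle and report how much of it is in view."""
    if not candidates:
        return "No document detected"  # hypothetical message
    # A folded page produces many small rectangles; keep the largest one.
    largest = max(candidates, key=quad_area)
    visible = sum(1 for p in largest if corner_visible(p, width, height))
    if visible == 4:
        return "4 corners detected"
    return f"{visible} of 4 corners visible"  # hypothetical message
```

For example, a full page plus a smaller fold inside it reports on the full page, because the larger shape wins.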

Now that I have given you the pros, here are some cons:
1. Although I have done my best to squash bugs, this is only the first version, so there might be small bugs.
2. You can copy and paste the processed text from Voice to other apps, but there is no way to directly export as PDF or PNG from within Voice. This will be added in a future update very soon.
3. The app requires an internet connection. This is how Voice achieves a vastly better OCR quality that only gets better over time.
4. Depending on your internet connection, processing many images may take up to 15 seconds. I'm very sorry about this and I'm working on making it a lot better.
5. Right now the Voice control feature only supports two actions, taking a picture and starting the reading process. I plan to make the entire app Voice controlled within the next few updates.
6. The app will cost $4.99 after 24 hours. I really want all of you to be able to download it for free, though, so tell all your friends to install it before 8 AM tomorrow, June 15th!

Here is the link to the app on the iTunes App Store: https://itunes.apple.com/us/app/voice-take-picture-have-it/id903772588?mt=8.

My email is shalinvs@gmail.com.

I hope you find this application helpful. I am open to any suggestions and feedback, feel free to be as brutal as you like. I will respond to every email. I plan to bring even more exciting stuff in the very near future. Thank you for your time.

P.S. If you like the app, it would be so helpful if you could write a quick review on the app store.

Comments

By Remy on Wednesday, June 26, 2019 - 22:41

First, 4.99 is a small charge for an app that actually works well. KNFB did not and was not worth the ridiculous cost. If I had the previous Voice app, will this new one just be an update, or do I need to install it from this link? Thank you for your support. The OCR app market is much improved since 4 years ago, so it will be interesting to see how this one compares to others. Either way you're attempting to do something great here, and it is appreciated.

Hey Remy,

Thank you so much for taking the time to comment and check out the app. I deeply appreciate it. It's the same app, so if you have the previous Voice app, this is just an update. The link should still take you to the correct page where you can update it. After you have some time to check it out and play around, I hope to hear your feedback about how the app compares to other apps in the market. It will be nice to hear any flaws you discover, so I can immediately get started on making improvements and shipping them out over the next few days. If I'm going to charge any money at all, I want to deliver as much value as possible. Again, thank you for your time.

By gregg on Wednesday, June 26, 2019 - 22:41

If you kept the app on your phone as I did, the version will show as an update.

By Shalin Shah on Wednesday, June 26, 2019 - 22:41

In reply to by gregg

Hey gregg,

Thank you, I really appreciate you taking the time to check out the app!

By Kevin Shaw on Wednesday, June 26, 2019 - 22:41

Hi, looks very promising in this iteration. Good job. I got a crash the first time, but got it to read a bill. The one thing I wish it could do is distinguish when to read in columns and when not to. E.g. when I'm reading a magazine, it should read by column, but when it's reading line items on a utility bill which is formatted in columns, it should read items straight across.
When VoiceOver is on, tapping the Take Picture button still takes the picture even though double tap is expected behaviour.
Also, is there a way to display the scanned text on screen for easier interaction?

Again, great work.

By Shalin Shah on Wednesday, June 26, 2019 - 22:41

First of all, thank you so much, I really appreciate you taking the time to check out the app!

Very sorry that it crashed–I am furiously working to find any and all bugs so I can release an update to you guys as soon as possible. I will look into updating the column view so that it works a lot better than it does currently. Also, the reason tapping the Take Picture button still takes the picture is actually because when the Voice Over says the words "take a picture," the new Voice command feature is accidentally activated and so it takes a picture. This is a bug and I am fixing it as we speak. Lastly, the scanned text actually does display on the screen! When Voice starts reading the document, if you just tap pause, then take Voice Over to the center of the page, you will notice the scanned text is right there so you can interact with it more easily.

Let me know if you find anything else. Again, thanks for checking it out, hope this is useful!

By Dawn 👩🏻‍🦯 on Wednesday, June 26, 2019 - 22:41

Hi!
I'm downloading it now. I'm always looking for something new in OCR apps. Never can have enough of them. Well, I have a question. My main use of OCR is to read memes on Facebook or via email. And I import them from my photos. Will this app gain that capability? I look forward to seeing what happens with this app!

By Shalin Shah on Wednesday, June 26, 2019 - 22:41

In reply to by Dawn 👩🏻‍🦯

Hey Dawn!

Thanks so much for giving Voice a download. Actually, you can import screenshots of Facebook and email memes and read them out loud with Voice right now! Let me know how it works for you and if you need anything else.

By Patrick Bouchard on Wednesday, June 26, 2019 - 22:41

Just downloaded this and found it was a great excuse to open some mail that I've been procrastinating instead of opening for a couple of days.

I think my favourite feature is being able to say "Take a picture" instead of having to tap something, thus causing my phone to shake slightly which can ruin what was once a perfect field of view. Though I fear Apple may have made your efforts obsolete with the upcoming addition of voice control in iOS 13! Not that that's the only great feature though, the OCR was very accurate.

I did run into a couple of crashes where the app quit instead of displaying the screen with the processed text of my image. It may have been because it didn't see any text, though for obvious reasons I can't confirm it.

I've got one feature request. Can you add a setting to toggle on and off automatic reading of text once your image(s) have been scanned? Sometimes I don't want to listen to the whole thing, and trying to find the pause or go back buttons while listening to 2 voices talk over each other is an annoying experience. I'd rather find the text on the screen and listen to it that way, and can just flick away from it if I'm no longer interested.

Great job updating this app! I'll admit I've never heard of it before, but it's definitely worth $4.99. Just gotta fix those crashes :D

By Shalin Shah on Wednesday, June 26, 2019 - 22:41

In reply to by Patrick Bouchard

Hey Patrick!

Thank you for the kind words and for checking it out. I'm glad it was helpful. Even though iOS 13 will come with Voice control features, you guys can still get the benefits a little earlier with the current Voice version until Apple officially releases it. That makes the efforts worth it.

Also I'm very sorry about the crashes and I'm working to find and fix them as quickly as I can. Do you mind letting me know what device you're using and what iOS version you are on? That would definitely help me squash the bugs faster.

Lastly, I've actually already finished implementing exactly the feature you just recommended! I am shipping out an update tonight that will allow you to toggle the automatic readings on or off (by default it will be on but you can change it in settings). That way, you can just have Voice Over read it to you instead of the app's Text to speech.

By Patrick Bouchard on Wednesday, June 26, 2019 - 22:41

In reply to by Shalin Shah

I've got an iPhone 7 running iOS 12.3.1

Great to hear that feature is already implemented! I'll keep an eye out for that update.

By Shalin Shah on Wednesday, June 26, 2019 - 22:41

In reply to by Patrick Bouchard

Awesome, working on fixing the bugs as we speak. Apple takes a couple of days to review the app, but it should be out by the end of the weekend. Thanks again!

By HarmonicaPlayer on Wednesday, June 26, 2019 - 22:41

Since I have an iPod 6th gen and it can't run iOS 13, having an app that's voice controlled is a grand thing to me :)

By Dawn 👩🏻‍🦯 on Wednesday, June 26, 2019 - 22:41

I download the photos from Facebook or save them from my emails. Can you still import them? That's generally how I do it.

By ming on Wednesday, June 26, 2019 - 22:41

Hi all!

When I take a picture and it has text in it,
it said:
connection is bad.
can not detect...
Something like that!

By cool cat on Wednesday, June 26, 2019 - 22:41

HI! I did try this app out for a quick second and the OCR worked well. I was wondering if the data got processed over the internet or not? If the data does get processed over the internet I guess that means someone has the potential to look at the data right?

By peter on Wednesday, June 26, 2019 - 22:41

Great to see that you've gotten back into developing this app. Must be very rewarding and fun to make such a contribution!

Just wondering if it is possible to port the image recognition over to the phone itself so that one doesn't need an internet connection with the accompanying privacy issues.

BTW, what is the OCR engine you are using? I wonder if any open source alternatives exist that are comparable and might be able to run on the phone itself.

Anyway, nice job!

--Pete

By gregg on Wednesday, June 26, 2019 - 22:41

VoiceOver shows the Pick from Photo Library button right after Settings. After looking at the share options, I have a suggestion: either an open or import/add to Voice option there would give another way to read scanned documents.

Hey Ming,

Thanks for giving the app a download and commenting your feedback. Really sorry that it wasn't able to work for you. Do you mind letting me know what device and iOS version you are using? Also, are you connected to the internet when you are using it? That way I can fix any bugs that I find as quickly as I can and thus deliver the best experience to you. I am looking into the problem as we speak.

Hey cool cat,

Thank you for giving the app a download–really appreciate you checking it out!

That's a great question! And I'm sure it's one that many people are probably curious about. So first of all, since you never sign in or create an account on the app, this means that Voice does not collect any personally identifiable data from you. However, in order to deliver the greatest OCR results, it is necessary to temporarily send the image over to a server, where it is then processed and converted into text. As soon as the server performs this conversion, the image is immediately deleted from both the server AND the phone. Once you leave the app, the text is also deleted. In short, the answer is yes: if someone were to theoretically look on the server at the exact moment you asked Voice to process an image, they would be able to look at the data for about 5 seconds before the data self-destructs. But they wouldn't even know whose data they were looking at, because users do not have accounts on Voice. So although it is theoretically possible, it is highly unlikely that anyone could ever specifically target your data and do anything meaningful with it. Additionally, it would cost me a massive fortune to pay the server costs if I were saving any sort of user data. I opted for making the data self-destruct instead. I hope this addresses your concerns, and of course let me know if you have any more questions!
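
To make the "process, then delete" idea concrete, here is a minimal sketch of how a server could handle an uploaded image. It is an illustration under assumptions, not the actual Voice server code: `process_and_discard` and the `run_ocr` callable are hypothetical names, and the real OCR engine is not shown.

```python
import os
import tempfile
from typing import Callable


def process_and_discard(image_bytes: bytes, run_ocr: Callable[[str], str]) -> str:
    """Write the upload to a temporary file, run OCR on it, then delete the file.

    No account or user identifier is stored alongside the image.
    `run_ocr` is a stand-in for whatever OCR engine the server uses.
    """
    fd, path = tempfile.mkstemp(suffix=".img")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(image_bytes)
        return run_ocr(path)
    finally:
        # Runs whether OCR succeeded or raised, so the image never lingers.
        if os.path.exists(path):
            os.remove(path)
```

Putting the deletion in `finally` means the image is removed even when the OCR step fails.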

Hey Pete! Thanks so much for checking out the new version of Voice! Yes, it is beyond rewarding and fun to create great technology that makes any sort of impact. It's even better because AppleVis literally gives me direct access to the community where I can very quickly get feedback and improve things for you guys. It would be a lot more difficult to create great products for low vision people if I couldn't talk to you guys directly. So honestly, the actual credit for Voice goes to AppleVis for providing such a phenomenal platform for sharing this stuff and to you guys for posting your questions, comments, and concerns.

To answer your questions, there are definitely privacy issues that do exist, as with anything that requires any sort of server. However, as I explained in my reply to cool cat, I think the chances of anything going wrong in terms of privacy are very low with the way Voice was constructed. Additionally, there are a lot of added benefits to using a server that are quite significant. For one, there is basically only one industry-wide open source OCR engine, called Tesseract, and almost all apps that do offline OCR use it. Tesseract is not very good, and you will never get really good quality readings from it. This is why people often complain in the comments section and write reviews that say the app is just saying a bunch of gibberish. I know the OCR quality of Voice is not perfect yet. But since it is using a server, it is constantly being retrained with new data, which means that if a lot of people use Voice over a long period of time, it will learn and improve by itself. Offline OCR engines cannot do this. To answer the final question, Voice uses a custom OCR engine.

Again, thank you so much for supporting Voice and for asking great questions that allow me to clarify things for everyone!

By Matt on Wednesday, June 26, 2019 - 22:41

Hello. I tried the app when it was first announced and found the OCR to be rather good. Not great, but very good. But then, no OCR is going to be 100 percent perfect. I do, however, have some suggestions that might make the app even better. First, I believe it would be beneficial to have some guidance announced when trying to line up a document. I was eventually able to get all four corners of a document in focus, but it took me a while without any guidance given. Second, I noticed that whenever I opened up the app it automatically switched to the phone's speaker, even when I was wearing headphones. I had to unplug the headphones and plug them back in to get VoiceOver to speak through them again. Third, I couldn't get the voice commands for taking a picture to work, even after enabling speech recognition as asked by the program at initial startup. If these suggestions and possible bugs could be taken care of, I believe the app could be comparable to other OCR apps out there. Thanks for your excellent work and I look forward to seeing what this app will be able to do in future updates.

By DMNagel on Wednesday, June 26, 2019 - 22:41

I'd prefer Samantha to shut up and let my Daniel do the reading. It's not good to have 2 voices talking over each other.

By Shalin Shah on Wednesday, June 26, 2019 - 22:41

In reply to by DMNagel

Hey DMNagel,

Thanks for giving Voice a download and for commenting your feedback here! I agree, hearing Samantha and Daniel talk over each other can get quite annoying. That's why I spent a few hours today building a cool new option in settings that can turn off the default Text-to-speech! This way, you can just let VoiceOver do the reading if that's what you prefer. It should be out very very soon, along with a lot of other bug fixes. I will post about it on AppleVis to let you guys know. Thanks again for the download and let me know if you need anything else!

By Shalin Shah on Wednesday, June 26, 2019 - 22:41

In reply to by Matt

Hey Matt,

Thanks for giving Voice a download and for commenting your feedback here! Actually, Voice does give guidance when trying to line up a document. If you have the entire document in front of the camera, then Voice should automatically see it and tell you "4 corners detected". If Voice does not say that, then most likely you do not have the entire document in the camera's view (or your camera is too far away from the document). Maybe try pointing the camera at another document and give that feature another go! Also, the speaker bug has been completely fixed and I'm just waiting on Apple to approve the update so you guys can install the new version. Lastly, I'm not sure why the Voice commands were not working; that seems really strange. Were you saying the right commands? If so, give me some time to look into the issue and figure out what is wrong with it. Again, thanks so much and let me know if there is anything else I can do for you!

By gailisaiah on Wednesday, June 26, 2019 - 22:41

I really do like this reader! The only thing I've encountered is when VO says, "Double tap to take picture", it will take the pic even though I did not double tap the button. Thanks, though, for this app. I appreciate all your hard work and it has already been so helpful!

By Lee on Wednesday, June 26, 2019 - 22:41

Hi,

I also have this issue. It's odd. When flicking right you get to that button and without doing anything it takes a picture. No reason why. Seems to do it every time as well.

By chris R on Wednesday, June 26, 2019 - 22:41

In reply to by Lee

See comment 6. I'm guessing that's the bug mentioned.

By Shalin Shah on Wednesday, June 26, 2019 - 22:41

In reply to by Lee

Hey guys,

Thanks so much for giving Voice a download and for the feedback! So since so many of you have mentioned this bug, I just wanted to write a quick note on why this is happening. When Voice Over gets to that button and says "Take a Picture. button." out loud, the Voice command feature thinks that you are telling it to take a picture. So it takes a picture. I have fixed it already and that version should be out as soon as Apple approves it.

Thanks again!
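
For anyone curious how this kind of bug can happen, here is a small sketch. It is not the actual fix shipped in Voice, just one plausible guard: ignore recognized speech while the screen reader itself is talking. The command table and the `screen_reader_speaking` flag are hypothetical names for illustration.

```python
from typing import Optional

# Spoken phrases mapped to app actions (illustrative list).
COMMANDS = {
    "take a picture": "capture",
    "take picture": "capture",
    "capture": "capture",
    "read": "read",
}


def match_command(transcript: str, screen_reader_speaking: bool) -> Optional[str]:
    """Return the action for a recognized phrase, or None.

    Without the guard below, the microphone hears the screen reader say
    "Take a Picture. button." and the phrase matches like a real command.
    """
    if screen_reader_speaking:
        return None  # the audio is the device talking, not the user
    # Lowercase, drop periods, and collapse whitespace before matching.
    normalized = " ".join(transcript.lower().replace(".", " ").split())
    padded = f" {normalized} "
    for phrase, action in COMMANDS.items():
        # Pad with spaces so "read" does not match inside "already".
        if f" {phrase} " in padded:
            return action
    return None
```

With the flag off, the phrase "Take a Picture. button." is treated as a capture command, which is exactly the behaviour reported above; with the flag on, it is ignored.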

By Dennis Hoffmann on Wednesday, June 26, 2019 - 22:41

Hi, Shalin. Thank you so much for creating this app. I find that, at least in my case, VoiceOver is still causing the program to take a picture before I am ready for it to do so. When I open the program, VoiceOver gives a brief description of the two voice commands, and the picture is taken, and it also mentions the read command while the picture is being processed on the server. Also, I would like to be able to set the flash to either automatic or on. I have discovered that each time I begin the program, the flash is set to off. In the case of the picture being automatically taken, maybe turning hints off will stop the picture from being taken before I am ready to take it. Thanks very much for the work you are doing to make this app great for us.

Dennis Hoffmann