As an app developer, I’m curious to know how much the AppleVis community is utilizing VoiceOver Recognition. It can be enabled in Settings > Accessibility > VoiceOver > VoiceOver Recognition, where you can turn on specific recognition features.
This feature enhances VoiceOver by detecting text in images or providing custom accessibility labels for elements that lack proper accessibility properties.
However, turning on Recognition sometimes introduces new issues, such as a label being announced twice. Have you faced similar issues? If so, kindly share your experience.
Comments
Saying it twice
It's not too much of an issue for me if it describes a properly labeled control because it comes after VO has spoken the label, and I can ignore it, or I don't even hear it because I've already swiped past it. However, I don't like the thought that recognition would be used by developers as an excuse to not label things for accessibility.
Thanks OldBear!
Yeah, I totally agree with OldBear. As developers, it’s our responsibility to provide meaningful descriptions for elements. However, we can’t prevent VoiceOver Recognition from announcing the label again, since it detects and reads content on the screen independently of its context.
I’m curious—how much of an issue is this for you? Does hearing the same label twice feel disruptive, or have you gotten used to it?
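For anyone following along on the developer side, here is a minimal sketch of what "providing meaningful descriptions" looks like in UIKit. The view and asset names are purely illustrative, not from any particular app:

```swift
import UIKit

// Hypothetical example: an image conveying real content.
let bannerImageView = UIImageView(image: UIImage(named: "promo-banner"))

// Expose it to VoiceOver with a meaningful label, so users are not
// left depending on Recognition to guess what the image shows.
bannerImageView.isAccessibilityElement = true
bannerImageView.accessibilityLabel = "New Year sale banner"
bannerImageView.accessibilityTraits = .image
```

Even with a proper label like this, VoiceOver Recognition can still independently detect text drawn in the image, which is the double-announcement behavior being discussed.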
No.
Never use it.
Thanks Holger Fiallo!
By default, the Text Recognition feature is enabled, allowing VoiceOver to detect and read text within images. Are you currently using this feature?
I love it, it’s great for…
I love it. It’s great for images on Reddit and similar things like screenshots, and it’s OK for inaccessible apps like the one I use for my meal plan. It’s overall really useful; I’d say I use it at least once a day, mostly for images, sometimes for apps.
Every day
I just keep this feature on. If I want to know what a photo is without putting it through a third-party app, it will just do it.
Praveen
It does not do a good job with pictures. I don't know why; JAWS does a great job of describing pictures.
I use Screen Recognition if the app or website is poorly coded
It is never a fix for improper coding. The app should always be coded following Apple's standards.
A Question for Praveen
Hi Praveen,
What is it you are looking to accomplish? I feel like if we have a better idea of what your question is, we would be able to provide you a more informed answer. Especially if you are developing apps, it would be great to be able to answer your underlying question directly. These recognition features are tools and nuanced, and they all have their place, but these are all supplementary to good accessible design. A VoiceOver-generated image description will never take the place of a properly tagged image in your app, Screen Recognition will never (and should never) take the place of an accessible UI, etc. etc. etc.
I agree with Michael
I agree with Michael. We all would love to help you. Below is the link to making your app accessible with Voiceover. I am of course willing to beta test.
https://developer.apple.com/documentation/accessibility/supporting_voiceover_in_your_app
I use it but...
It's there as a tool to improve apps that might be inaccessible, sometimes it works and sometimes it doesn't.
Just because it exists doesn't mean it's going to make your app work with VoiceOver; you are going to have to put in the work if you want us to use your app or apps.
Answer to Michael Hansen
Thanks, Michael. I’m experiencing issues with Text Recognition: despite setting proper accessibility properties for an image, the feature still detects text overlaid on the image. According to an Apple framework engineer, this is a bug, as Recognition shouldn’t detect independent text overlaid on an image. I was curious to know how many users actively use this feature and what challenges they face when it’s enabled.
text overlaid on image
I'm having a difficult time understanding why independent "text overlaid on an image" should not be recognized, or why I would not want it to be recognized when I have recognition turned on. If this is something like Facebook attempting to generate alt text descriptions for images in posts, I very much welcome the additional descriptions of pictures and text to possibly fill in information that Facebook's inferior AI has missed. It doesn't bother me at all having two descriptions of an image or text, but if it did, I could turn recognition off for that app, or swipe away from the image before the Apple recognition kicks in.
* I read a little about what overlaid text might mean. If this is something like the HTML CSS effect with a text holding container visually placed on top of an image holding container by way of code, then I can see how this might be a bug. One case that might be a problem for me is if the image takes up the whole screen and all the text on the screen keeps triggering the recognition during swiping between elements. I've never come across that, but I would turn off recognition if it happened. Otherwise, it wouldn't bother me if text was repeated now and then due to recognition. I'm already used to it because of things like the auto generated alt text for images on certain sites.
VoiceOver Recognition
I use the VoiceOver Recognition features, especially Image Recognition. I also use Screen Recognition to read Facebook images that are inaccessible with VO. As for your issue with text overlaid on an image: VO should be able to read that text without enabling VO Recognition features like Screen Recognition.
OldBear
I honestly believe you just described somebody’s worst nightmare!
Forget all this technology stuff, I’m going back to my abacus. 😫
Reply to OldBear
Yeah, this is more of a dev-side thing. Like, every New Year we usually show some kind of offer or discount in the app using a nice promotional image. Since the offer changes every year, we don’t hardcode any text inside the image. Designers just give us a clean decorative image, and we overlay the actual offer text programmatically.
Now the issue is that iOS ends up detecting the overlaid text as part of the image (when it should treat it as separate text). So VoiceOver ends up reading both the proper accessibility label and the detected text, which feels kind of repetitive. Just wanted to share the context in case anyone else has run into something similar.
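The setup described above can be sketched roughly like this in UIKit. All names here are hypothetical, and this is just one way to structure a decorative background with programmatic text on top:

```swift
import UIKit

// Hypothetical promo setup: a purely decorative background image
// with the actual offer text overlaid as a real UILabel.
let promoBackground = UIImageView(image: UIImage(named: "promo-background"))

// The image carries no information of its own, so hide it from VoiceOver;
// only the text element should be announced.
promoBackground.isAccessibilityElement = false

let offerLabel = UILabel()
offerLabel.text = "Happy New Year! 20% off all plans."
// A UILabel is accessible by default, so VoiceOver reads its text directly.
// The issue in this thread: Text Recognition may still treat the composed
// view as one image and re-announce the label's text as detected text.
```

Since the label is a native text element rather than text baked into the image asset, VoiceOver already reads it correctly; the duplicate announcement comes entirely from the Recognition pass.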