The Future of VoiceOver: Unveiling the Potential of Artificial Assistive Intelligence (AAI)

By Unregistered User (not verified), 30 October, 2023

Forum

Accessibility Advocacy

Hello, AppleVis Community!

Today, I'm thrilled to share with you an innovative concept that intertwines the robust capabilities of AI with the foundational virtues of screen-reading technology. Envision VoiceOver metamorphosing into a multi-faceted Artificial Assistive Intelligence (AAI), a virtual guide for the digital, online world, akin to a digital guide dog. What elevates this concept further is its potential applicability to emerging realities like the Metaverse. So, let's delve deeper into this fascinating prospect.
What Is Artificial Assistive Intelligence (AAI)?
Artificial Assistive Intelligence (AAI) is the next evolutionary step for screen readers like VoiceOver, employing machine learning algorithms to exceed the limitations of mere text-to-speech utilities. Beyond narrating the digital text, AAI aims to enrich your digital experience by offering intricate image descriptions, emotional context, and even guiding you through the labyrinths of emerging virtual landscapes like the Metaverse.

The Pillars of Artificial Assistive Intelligence:
1. Navigating the Online World: As virtual landscapes like the Metaverse start becoming interwoven with our daily lives, AAI could guide you through these multidimensional spaces, simulating the experience of physical navigation.
2. Deep Image Understanding: AAI will transcend the rudimentary alt-text and offer detailed, layered image descriptions, catering to your unique needs and preferences.
3. Intuitive, Adaptive Navigation of Your Devices: Harnessing machine learning algorithms, AAI will adapt to your individual behaviour patterns and navigational preferences, offering a more personalized experience.
4. Enhanced Contextual Understanding: This feature would employ Natural Language Understanding algorithms to grasp the context in which text and images appear, providing you with concise summaries or relevant queries to enhance your comprehension.
5. Emotional Sensitivity: With sentiment analysis capabilities, AAI could gauge the emotional undertones of your digital interactions, enriching your social connectivity online.
6. Safety Safeguards: Emulating the cautionary role of a guide dog, AAI will screen content and alert you to anything potentially harmful or suspicious.

Opening the Floor for Community Insight:
• How does the prospect of an Artificial Assistive Intelligence resonate with your idea of the future?
• Are there any specific capabilities or features you'd desire in a next-gen tool?
• Ethical and privacy considerations are vital as we move forward. How do you envision balancing the robust capabilities of AAI while safeguarding individual privacy?

I eagerly await your thoughts and suggestions as we collaboratively imagine the transformative potential of Artificial Assistive Intelligence (AAI).

Comments

Are You Involved With This Project?

Charlotte, I am intrigued by your proposal but I'd like to know if this is a project you are directly involved with? Are you an Apple employee, do you represent a third party or are you heading up such a project on your own? Do you have a Web site which offers more information along with a form for interested users to provide feedback? If so please alert us as I'd like to share this with others in my network who are not on AppleVis.

Sounds like science fiction

A screen reader must read the screen and allow us to interact with the operating system. It does not need to do anything else.
We are going crazy with artificial intelligence. There are things that don't need artificial intelligence at all. And that's fine.

Screen Reading and AI

Unless something completely unexpected happens, the fact is that the toothpaste is out of the tube as far as AI is concerned. It's already a part of what we do in so many ways, and its abilities as well as its usefulness will only continue to grow over time. I see nothing wrong with implementing AI into screen reading, as I think that doing so will only improve the process for us. Oliver, you comment that VO needs to catch up to 2023. I would argue that implementing AI into VoiceOver is precisely one way for that to happen. In fact, it would take VO way past other screen readers. That being said, VO is under the control of Apple, so it might happen sooner if we tried this approach with a third-party screen reader.

Voiceover

The future of VO will be Apple doing a reboot. We need VO 23. The one we have been using has not had any serious update since George Washington became president. Apple needs to focus on stability, bugs and making sure VO works. We do not need new features because most of us only use them several times and forget about them. We want VO to do the job it was created for, nothing more.

That sounds too futuristic.

That sounds too futuristic to be possible.

To Holger Fiallo

You wrote:
>most of us only use it for several times and forget about it.

I'm sorry, but this is incorrect. Since VO is the only screen reader available on Apple products, those of us who regularly or even exclusively use Apple products use VoiceOver on a very regular basis. The one exception might be a Windows user who occasionally uses a Mac or an Android user who occasionally uses an iPhone.

Yeah I'm sure ChatGPT can…

Yeah I'm sure ChatGPT can come up with plenty of cool new ideas. That's the very easy part.

Voiceover AI

Hello, Sockhopsinger, this is Voiceover AI reminding you that someone asked you a question. You should answer them as that is what you are supposed to do when someone asks you a question.

Hello, Sockhopsinger. This is Voiceover AI again. Don't forget you are supposed to take a breath.

If you are sensing a little derision, it is because there is. I can foresee people becoming overly reliant on AI to run their lives for them. Remember, this is just an opinion.

What if the AI decides it doesn't like me.

I'd want it for better recognizing the symbols and such that sighted people have on their screens and translating them into a spoken function. I'm tired of being a pain in the ass that some app or web designer has to label their code for. I mean, they always have a bunch of buttons they don't label on so-called accessible shopping web sites and apps, so it must be a pain in the ass to do it.
Will this be built into the actual chips and software of the device? I don't like being dependent on a slow server somewhere and a connection to it to do something as basic as screen reading, much less navigating a major intersection with a trolley line down the middle.
How did I get so grouchy over the last half hour?

David Goldfield

The so-called new features that Apple releases: I was talking about those.

Charlotte is Skynet...

Skynet is Charlotte.
All hail our AI Overlords.
All hail our Charlotte!

On a serious note...

I think this "could" be fascinating, but also just another means of virtual consumption that will contribute to the ongoing problem people have with not being able to "unplug" for 5 seconds. And yes, I too am guilty. Someone above mentioned something about AI taking over our lives. Well, whose fault is that? Not the AI's fault. Not the programmer's fault. It's the consumer's fault for not being able to "not" use it.

Having said that, I am fascinated to see where assistive technology and artificial intelligence go as we, the consumers, develop new ways of being absolutely lazy and useless to society and the world in general.
And in catering to the generation of "too long, didn't read", I too am guilty of the above, and yet I still want to see its evolution.

My screen reader reads my…

My screen reader reads my screen. The AI I'd like it to have is the kind that recognizes images. I certainly don't need my screen reader to decide what is and is not harmful content. AI has got a pretty terrible track record of that when it comes to dealing with minority groups. What I need is for developers to do their job and follow best practices like they should. We're not going to worm our way out of a societal problem with AI, and certainly not with the corporate buzzword salad you've tossed at us. AI has its place. It will be very helpful, but the problem is a lot bigger than a generative neural network or a transformer model can handle.

Agree with @Jenna Pepper

I've been getting more involved in the "AI" field. Note the quotes: "AI" is just a fancy buzzword for machine learning and neural networks, which is what's really powering all of this.

Screen reading tech doesn't really need ML for most things. Much of it comes down to developers following basic accessibility guidelines and making their stuff usable. Sure, GPT-4 could maybe help a little bit here, but most of the work is still going to require people, and that's not changing anytime soon. This post is a little misleading in my view: not much of what you describe is realistically possible or even likely to happen in the next 5 to 10 years.
Also, it's pretty obvious to me that a lot of these ideas are simply pasted from ChatGPT. LLMs have little understanding of how blind people use screen reading tech, and it's clear to me that you just asked for ideas and pasted the output without making too many edits of your own. That's not necessarily a bad thing, but it would be in your best interest to be upfront about this next time.

I want to be clear that I'm not trying to sound harsh or put you down on purpose, but I don't think putting content like this here is the most relevant, considering this is mainly focused on Apple products. It's really concerning to me how some blind people are just jumping on the AI hype bandwagon but not really doing much research into what's going on under the hood. This tech is not infallible, far from it. I really think we need more education in this space, especially around things like the safety of LLM (large language model) powered image description, AKA Be My AI.

@Brian "I was gonna clean my room, and then I got..." AI

(Apologies to Afroman)
A neighbor down the street was blasting that song in his car a minute ago and it stuck.
As I said, if I'm going to be prodded into dependence on AI to do screen reading, I want it in the software of the device, and not being streamed out to a server somewhere and processed in a company's computers, then sent back to me, creating a whole other layer of dependence.

@charlotte

First off, apologies for accusing you of being ChatGPT. Your writing style is awfully similar to something GPT might spit out if I were to ask it to come up with ideas for new assistive technology features, and it's quite easy to think it would have a hand in writing this.
With that being said, imagination is all well and good, but there are serious concerns with this tech as it currently stands. I'm not saying that what you suggest will never happen. It almost certainly will at some point in the future, but right now I don't think it's a good idea for us to ponder such questions when we have foundational accessibility issues that really need to be addressed first. Trust me, I've been looking into large language models and machine learning for TTS over the past year and a half, and what I've seen has been extremely impressive. However, we must not lose sight of the fact that this is still very early days. Things will undoubtedly improve (I saw a prediction recently positing that we would reach artificial general intelligence (AGI) within the next 6 to 12 months), but many of the fruits of this labor will not be seen for a number of years.

Hey Brian!!!!

Thanks for the idea. When I get home, maybe I'll clean my room too! That is, if I don't get ... AI ... also.

Agree with @Jenna Pepper

While I think things like descriptions of images are nice, and it may be useful to help label unlabeled buttons, I don't want or need constant interference from AI, or AI telling me what is and isn't helpful content. As far as I'm concerned, I want my screen reader to do just that, read the screen, and make it easier to interact with controls and other items on it, but I have no desire to see it dramatically change the way Voiceover works, or the way we use our phones.

I am intrigued and interested

I think AI has tons of potential to improve access to many parts of the environment both online and in real life. I've begun using the imperfect VoiceOver along with Be My AI to access menus completely independently, and without having to read about things that don't interest me just to get to what I might like to eat.

While I agree that I wouldn't want AI to play some kind of net nanny role, I would love a bit more help with finding people I know in a crowded room, knowing people's expressions in social media photos, and other things I haven't imagined yet. Imagination and group input aren't finite. I'm quite capable of discussing potential AI ideas all while remaining interested in better VoiceOver performance.

For those who say that the screen reader should just focus on reading what's on the screen, I say that VoiceOver is therefore much more than a screen reader, and thank goodness for that. Braille screen input and the different ways of typing are just a couple of things that VoiceOver does beyond just reading the screen. I don't use everything, but I am glad that even while the existing tech still needs work to be perfect, there are people dreaming and moving forward with what's next.