Introduction:
AI-powered tools like PiccyBot, Be My AI, and Aira's Access AI are making significant strides in enhancing accessibility for visually impaired individuals. This article explores and compares the capabilities of these three tools, highlighting their unique features and benefits.
PiccyBot: Comprehensive Image and Video Descriptions
PiccyBot is an AI-driven application that describes images and videos. It provides detailed spoken descriptions, allowing users to understand visual content through audio feedback.
Key Features:
• Photo to Voice: Converts photos into detailed spoken descriptions.
• Voice-Activated Queries: Users can ask questions about the images using their microphone.
• Video Descriptions: Offers descriptions for videos up to 30 seconds long, enhancing accessibility for visual media.
PiccyBot stands out for its ability to describe both static and moving images, providing users with a comprehensive understanding of visual content through interactive and detailed descriptions.
Be My AI: Enhanced Visual Interpretation with GPT-4 Vision
Be My AI is a feature within the Be My Eyes app, utilizing OpenAI’s GPT-4 Vision model to deliver detailed descriptions of images and graphics. This AI tool complements the live assistance provided by sighted volunteers.
Key Features:
• AI-Powered Descriptions: Provides comprehensive descriptions of photos, screenshots, and on-screen content.
• Interactive Chatbot: Allows users to ask follow-up questions for more detailed information.
• Cross-Platform Availability: Accessible on both mobile devices and Windows PCs.
Be My AI excels in offering detailed, immediate visual descriptions that enhance users' understanding of digital content. The interactive chatbot feature allows for deeper engagement and a more nuanced understanding of the described material.
Access AI by Aira: Real-Time and Verified Visual Assistance
Access AI is a new feature within the Aira app, providing AI-driven image descriptions along with the option to verify these descriptions through trained visual interpreters.
Key Features:
• AI Image Chat: Users can take or upload photos to receive detailed image descriptions.
• Human Verification: Visual interpreters can verify and correct AI-generated descriptions, ensuring accuracy and reliability.
• Integration with Aira Services: Combines AI capabilities with Aira’s established visual interpreting service for comprehensive support.
Access AI combines the strengths of AI and human intelligence, offering detailed image descriptions with the added benefit of human verification. This ensures higher accuracy and trust in the AI-generated information, making it a reliable tool for various visual tasks.
Interactivity and Detail:
• PiccyBot provides interactive and detailed descriptions of both images and videos, with the ability to zoom in and ask specific questions.
• Be My AI offers comprehensive and interactive descriptions with an AI chatbot that allows for follow-up questions.
• Access AI provides detailed descriptions verified by human interpreters, combining AI efficiency with human accuracy.
Platform and Availability:
• PiccyBot and Be My AI are available on mobile devices, with Be My AI also accessible on Windows PCs.
• Access AI is integrated within the Aira app, available on both iOS and Android, and offers additional human verification for AI responses.
Subscription and Cost:
• PiccyBot offers both free and premium versions with in-app purchases for enhanced features.
• Be My AI is a free feature within the Be My Eyes app.
• Access AI provides free initial access with options for extended use through Aira’s subscription plans.
Conclusion
PiccyBot, Be My AI, and Access AI each offer unique advantages for visually impaired users. PiccyBot and Be My AI leverage advanced AI to provide detailed and interactive visual descriptions, enhancing digital accessibility. Access AI combines AI with human verification, ensuring accurate and reliable information. By understanding the strengths and limitations of each tool, blind or visually impaired users can choose the best option to meet their specific needs and enhance their independence and quality of life.
Comments
Google Lens: Visual Recognition and Search
Google Lens is an AI-powered tool developed by Google, offering a wide range of visual recognition and search capabilities. While not specifically designed for accessibility, it has features that can be beneficial for visually impaired users.
Key Features:
• Image Recognition: Identifies objects, landmarks, and text in images.
• Text-to-Speech: Reads text from images aloud, useful for understanding written content.
• Real-Time Analysis: Provides real-time visual analysis through a smartphone camera.
• Integration with Google Services: Seamlessly integrates with other Google services like Google Photos and Google Assistant.
What about Envision AI?
This one is still in beta, but in a lot of ways, it's already turning into my go-to after PiccyBot. I was able to use it this morning to share the description I was given of a photo my pastor at church shared on Facebook. I know I'm super slow, but I just figured out how I can save FB pictures I'm interested in to my photo library and then use Envision AI, PiccyBot, or whatever to describe them. Wish that the process was a bit more streamlined, but I'm thankful for the ability nonetheless.
Getting back to Envision AI, the ability to more or less fully customize the assistant is a huge plus for me. There are numerous voice options to choose from, and it basically lets you write what you want the personality to be like, although I'm not quite sure I've done that correctly yet. Yes, some customization is available in PiccyBot as well, but it's far more limited in that one area. At any rate, I have all of these vision assistants in my virtual toolbox, and it's sometimes fun to compare the info each one provides.
Envision Assistant
Is still in beta, isn't it? Also, isn't it something different? Last time I tried it, I couldn't really work out what it was trying to be.
Envision Assistant Beta
Envision Assistant Beta is my go-to AI assistant. One can take an image and have it described, just like Be My AI, with the added benefit of asking questions with voice; the AI voice of your choice replies.
Envision Assistant
seemed like a broken thing till a few days back but, since a couple of recent updates, is shaping up to be an incredible thing. It's exactly what it claims to be: an all-in-one AI assistant developed specifically for the visually impaired.
Interestingly, since you can give a description of your requirements during account creation, it acts with a lot of context awareness.
Yes, I see how talking to it...
is useful. I tried the latest version of it earlier - the sound effects are quite funky!
I did the whole personality thing at the start, but it got very old very quickly. But the ability to describe your own style is a useful one.
I really like the 'tell me more' quick option with Access AI. I wonder if Aira will give us the option to give the AI a style? I might ask them.
Question
What do you mean by being able to describe your own style?
Style
During profile creation, you can give a description about yourself and what kind of personality you'd like for your assistant. For example, in my case, I told it that job-wise I'm a civil servant, and in that context I need factual, matter-of-fact info, while outside of it I'd like my assistant to be humorous and a bit sarcastic. And that's exactly what it's doing. Also, since I told it that my jurisdiction of work is Tamil Nadu, India, it is framing all info with that context in 'mind' (don't know what other word to use).
A different example of a 'style'
If you've always been blind, an AI telling you 'the image is an orange sun in a clear blue sky' might not be that useful. With a personal prompt or style, you could tell it to not use the language of colour in its responses.
Am I being mean to Seeing AI?
Not including it in this comparison? It used to be my go-to app for sorting the mail or foraging in the fridge.
I've only stopped using 'short text' because I am training myself to wear my Ray-Ban Meta glasses more.
Re: A different example of a 'style'
What happens when you ask your friendly neighborhood AI to describe images with:
1. An orange
2. A lime
3. A violet
4. An orchid
5. A rose
6. A peach
7. A cardinal
Heh, I could go on and on, but yeah, all the items listed above are both a color and a thing. 😇
Adjectives and Nouns Brian
Would you like an orange? It looks funny through my rose-tinted spectacles, metaphorically speaking.
Brian
Since the AI is intelligent enough to describe a picture in detail, it should be intelligent enough to distinguish between a colour and a thing. So nothing unexpected happens, I guess.
It was a legitimate question
@Assistive Intelligence said, "With a personal prompt or style, you could tell it to not use the language of colour in its responses."
My question was a direct response to that. I was hoping for some honest feedback here...
@Gokul, Appreciate your response. Here is to hoping you are correct. 🙂
Actually
I am correct, that is. Tested it with GPT-4o. Told it I'm blind and that it shouldn't talk to me in the language of colour. Then attached the picture of a rose and asked it to tell me all about the picture. It came up with "This is a rose flower" bla bla bla... Guess this was your concern. Will be interested in testing this by recreating any scenario that anyone can suggest, as best as I can.
That's why these things are called large language models. They are capable of understanding dynamic use of language, which is why they're able to respond to natural language queries. Therefore, if there is a problem of the sort suggested with these models, it should be treated as a fundamental bug and reported to the company concerned. (I'm primarily talking about models like GPT and Claude, not the stupider ones like Meta AI.)
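For anyone who wants to recreate this kind of test programmatically, here is a minimal sketch in Swift against OpenAI's documented chat-completions endpoint. The image path, the exact wording of the instruction, and the API-key handling are illustrative assumptions, not necessarily the setup described above:

import Foundation

// Sketch: send GPT-4o a steering instruction plus a base64-encoded photo and
// print the reply. Run as a script with a recent Swift toolchain (uses
// top-level await); reads the API key from the environment.
let apiKey = ProcessInfo.processInfo.environment["OPENAI_API_KEY"] ?? ""
let imageData = try Data(contentsOf: URL(fileURLWithPath: "rose.jpg"))
let dataURL = "data:image/jpeg;base64,\(imageData.base64EncodedString())"

let body: [String: Any] = [
    "model": "gpt-4o",
    "messages": [
        // The instruction under test: no colour language in descriptions.
        ["role": "system",
         "content": "The user is blind. Never describe images using the language of colour."],
        ["role": "user",
         "content": [
            ["type": "text", "text": "Tell me all about this picture."],
            ["type": "image_url", "image_url": ["url": dataURL]]
         ]]
    ]
]

var request = URLRequest(url: URL(string: "https://api.openai.com/v1/chat/completions")!)
request.httpMethod = "POST"
request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
request.setValue("application/json", forHTTPHeaderField: "Content-Type")
request.httpBody = try JSONSerialization.data(withJSONObject: body)

// If the steering works, the reply should identify the rose without naming colours.
let (data, _) = try await URLSession.shared.data(for: request)
print(String(data: data, encoding: .utf8) ?? "")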
A thought experiment!
I wondered: if I told it the last time I had sight, could it describe everyone it sees in relation to famous people I'd seen?
But as I last had sight in the mid-80s, I decided it was a waste of time. Your mileage may vary.
Interesting
a unique way to look at it for someone who had sight at some point. @AI, I suppose you should try it. I mean, it makes sense as to how it'd change the way visual info is conveyed...
Re: A thought experiment!
Just to make it more pointless and twisted, you could have the AI describe people in terms of what you'd get if two famous people from the mid-80s or earlier had children together...
I sometimes have to look up the release date of a movie just to make sure whether I watched it with eyes, or whatever it's called that my brain does now when watching a movie.
I'd rather not think about moustaches
Remember Magnum P.I., OB? I do remember how nice Hawaii looked from Rick's helicopter though! Or am I remembering Hawaii Five-0? The first one?
Tom Selleck
Other people had mustaches in that TV series though.
Really?
Higgins did, but who else? Was the helicopter guy called TC? Or am I thinking of Top Cat? Did he have one?
Magnum's friend, not Top Cat. Although I think Benny might have.
At one point, all of them
I think Rick had one for a while, but Higgins and TC did all the time. And TC was the name of the helicopter guy.
iOS Shortcuts support
Hi family. I would be interested to know details of the extent of iOS Shortcuts support in the AI apps.
Which of them allows sending files or photos and getting text back into the shortcut for further processing?
Thank you! Awesome discussion, especially the Magnum part. Har har.
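For context on what such a hook looks like from the developer side, here is a hypothetical sketch using Apple's AppIntents framework: an intent that accepts a photo and returns the description as a string a shortcut can pass to its next step. The type and method names are invented for illustration; none of the apps discussed here are confirmed to expose an intent like this.

import AppIntents

// Hypothetical App Intent: accepts a photo from a shortcut and hands back the
// description as text for further processing.
struct DescribePhotoIntent: AppIntent {
    static var title: LocalizedStringResource = "Describe Photo"

    @Parameter(title: "Photo")
    var photo: IntentFile

    func perform() async throws -> some IntentResult & ReturnsValue<String> {
        // A real app would send photo.data to its vision backend here.
        let description = try await describe(imageData: photo.data)
        return .result(value: description)
    }

    // Placeholder for whatever description service the app actually uses.
    private func describe(imageData: Data) async throws -> String {
        "A detailed description of the supplied photo."
    }
}

When an app ships an intent like this, it appears in the Shortcuts app as an action whose text output can feed the following action.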
Working with gadgets…
I've only used Be My AI so far. It works well with my air conditioner remote. I observed that Be My AI did a very good job of identifying the remote control, including the brand name. It does better at reading the LCD or LED display than regular Seeing AI or Envision.
It located all the controls and told me where they were relative to each other.
It was able to tell me when night mode is turned on, which is indicated by a crescent moon on the display.
I use the app every day to set my air conditioner. I asked if it would remember this remote and its layout for the future, but Be My AI does not do that. I would like Be My AI to learn the devices I use, and how I use them, for the future.
Sometimes I have to remind it that the night mode indicator is on the display, because it sees that the button has the same symbol. But the button doesn't light up.
I would really like these assistants to remember my stuff and how it works.
It would also be nice if Be My AI were voice controlled and more interactive. It's a bit of a problem taking extra pictures, entering text, and submitting it with VoiceOver.
But it's a really good start and I'm excited to see how well these tools develop.
Quick access to all 3!
I've got these three apps assigned to the Action Button and to the bottom left and right of the lock screen (where the flashlight and camera used to be). It's great that we can customize the lock screen buttons and Action Button to get access to visual info in different ways. Depending on context, I can choose if I want to quickly use PiccyBot, Aira Access AI, or Be My AI, and not have to unlock my iPhone.
75 days and nothing's changed!
I think Claude for iPhone needs to be added to this article, but is there anything else on iOS?
VoiceOver image description
I wish the image description feature would come to VoiceOver at some point. It is a TalkBack feature, and the image processing is done by Google AI. You get a detailed description of any photo that TalkBack is focused on. Once you are focused on an image, you bring up the TalkBack menu and tap Describe Image, and the results are equally as good as Be My AI.
Looks like it's time to charge up my Pixel tablet
Upgrade it, learn to use it (again) and check this out. I guess I won't need a Gemini Plus subscription?
nope.
@Lotti no you don't. But it doesn't allow you to take a picture using the camera and get it described.
Guess My Comment Didn't Make it...
Not quite. But anyway... I've used most of these apps with mixed results. I've also used the native camera app. I, for one, would greatly appreciate an in-depth audio walkthrough of all of these that are available to us thus far: pros, cons, and just examples of how to get a good enough aim in order to snap a picture of, say, the ingredients on a box or can of something. Plus any and everything in between, a "photography for dummies" kind of thing. I, for one, am a bit baffled by all this. Just the fact alone that the built-in camera app is usable by a blind guy like me is still a bit daunting. I'm not trying to complain here by any means; I'm amazed and impressed that the built-in camera app is so accessible.
Hey Ekaj
If I wrote some things up and got NotebookLM to make a podcast, would you give it a chance? Have you heard any of the NotebookLM output? It's quite remarkable.
Of Course
I've not heard of them before, but I'd love something like this and would definitely give it a go.
Just tried NotebookLM
I just tried this tool from Google. Although it is a bit tricky to use with a screen reader and many of the buttons are unlabeled, the results are amazing!
As a quick summary for those who don't know what NotebookLM is: it is an AI tool into which you load information that you want it to process; you can then ask it to summarize and/or ask follow-up questions. The difference between NotebookLM and many other AI tools is that NotebookLM builds its responses only from the input you give it. Thus it doesn't scour the entire internet or rely on its own trained model.
Besides the ability to input copied text, PDF documents, audio, and other content from Google Workspace, one truly remarkable ability is that it will generate audio of a two-person interactive dialog discussing your input. Truly amazing, and a bit scary. The voices are superb, and it sounds like a real podcast.
The tool is currently in development and free to use, but things could change.
--Pete
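A rough sketch, in Swift, of the grounding idea Pete describes: instead of letting the model answer from its general training, you prepend your own sources and instruct it to answer only from them. The prompt wording and example data are illustrative assumptions, not NotebookLM's actual implementation.

// Build a prompt that constrains a model to the supplied sources only.
func groundedPrompt(sources: [String], question: String) -> String {
    let numbered = sources.enumerated()
        .map { "Source \($0.offset + 1):\n\($0.element)" }
        .joined(separator: "\n\n")
    return """
    Answer the question using ONLY the sources below. \
    If the sources do not contain the answer, say so.

    \(numbered)

    Question: \(question)
    """
}

// Example: load part of this article as a source and ask a follow-up.
let prompt = groundedPrompt(
    sources: ["PiccyBot offers descriptions for videos up to 30 seconds long."],
    question: "Which app can describe video?"
)

Because the answer must come from the numbered sources, the model has far less room to drift into what its training data says about the topic.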
You had me...
You had me at PDF. 💘
NotebookLM is mind blowing
The more I read about it and the more examples of it I hear, the more significant I think it is. I know there are some great podcasts out there and I don't think this is going to replace those, but for us?
As Brian said, PDFs!
LLMs have been the hero that slayed the PDF monster for me, but NotebookLM might be the tool that turns PDF into my fav format!
I've put a "podcast" about Glidance on my Mastodon feed, for anyone to try. I'll see what I can do with this article about AI, along with some of the others. It might take a few goes to get it right.