Last year, Intelligence came to ‘Sight as a service’ with the launch of Be My AI – a new feature added to Be My Eyes and powered by GPT-4 from OpenAI.
With Be My AI, users can upload images and have them described. Follow-up questions can then be asked about the same image, and new images can be uploaded and questioned.
Be My AI can also be accessed via the iPhone’s ‘share’ feature, which is a great way to get quick, detailed descriptions of cat photographs on social media sites like Mastodon and Threads. Other creatures are available. But why would you want to?
Now ‘Access AI’ is in beta testing. Access AI is a new feature in the AIRA Explorer app that is similar to Be My AI, in the same way that AIRA Explorer is similar to Be My Eyes – except that, like Be My AI, Access AI is free to use, a fact that will become important at the end of this post.
I don’t know which AI model powers Access AI, but from the results it seems to be one of the ‘frontier’ models – either GPT-4 or one of its GPT-4-equivalent rivals. Speaking of AI models, one important thing to remember is that these models haven’t been ‘adapted’ to generate image descriptions for blind and visually impaired people. The ability to turn an image into text is part and parcel of what the model has been trained to do; it is part of what is meant by the term ‘multimodal’. This means these models will continue to improve, because computer vision is mainstream and improved accessibility for blind people is just a side effect.
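(An aside for the technically curious: here is a minimal sketch of what ‘turning an image into text’ looks like when you ask a multimodal model directly, using the public OpenAI API. The model name, prompt, and file name are my own illustrative choices – this is an assumption about how such a request might look, not what Be My AI or Access AI actually do behind the scenes.)

```python
# Minimal sketch: ask a multimodal model to describe an image.
# Assumes the `openai` Python package (v1+) and an OPENAI_API_KEY in the environment.
import base64
from openai import OpenAI

client = OpenAI()

# Read a local photo and encode it so it can be sent inline with the request.
with open("cat_photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",  # any multimodal model; the post only guesses which one Access AI uses
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in detail for a blind reader."},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }
    ],
)

# The description comes back as ordinary text, ready for a screen reader.
print(response.choices[0].message.content)
```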
The question has been, are AI-generated image descriptions truly useful? Are they a toy or a tool?
For me, the answer has been that they are far more than a toy, but still less than a tool. Or at least still less than a reliable tool, which is probably just a tool to a philosopher. I am only a student of philosophy, so I am not qualified to say…
The biggest difference between Access AI and Be My AI is that Access AI gives you the option to ‘verify’ the AI-generated image description with one of AIRA’s Visual Interpreters – this returns either an ‘OK’ message or a revised description in around 30 seconds. Obviously this is being done by a person, so it could take longer.
For me, this ability to double-check what the AI is telling me is the final step that takes this technology from something that is good to have, nice to have, very useful, great for enjoying ‘Caturday’, to a tool.
A useful tool that I could depend upon to do actual work. With AI-generated descriptions, there is always the chance that it isn’t telling you the truth, and that you will look silly! I generated a copy of the famous ‘HOLLYWOOD’ sign, but with my name, CHARLOTTE, instead. The AI described it perfectly, and it sounded great. But when I sent it to a sighted friend, she told me Charlotte wasn’t spelt correctly. When I went back and did it again, the AI still told me lies – until I challenged it with the truth, at which point it was very sorry, and of course I was correct, the sign actually read ‘CHARLOTEE’ all along.
Now, with Access AI from AIRA Explorer, I can be confident that my image looks like I think it looks – obviously I wouldn’t want this when I am taking a selfie! But it does mean that I am now able to do actual work using ‘Intelligent Sight as a service’ for free, 24/7.
It has taken thirty-five years to get here; the Reading Edge was amazing back in pre-history (the early 1990s), but what we have now is so much better. What will we have next? What do we want next? What will part III of this series cover? The answer is obvious – a blind person could see it!
As we have seen here, a blend of human and machine is the most useful; the next step is to make this technology wearable. After that, I’m thinking implants – I’m thinking Borg Queen!
Comments
beta testing?
So how does one sign up as a beta tester? Or rather, do we have an option to do so? Also, I wonder: since these models are trainable, are they doing anything to train them using the human-generated input? I mean, that could, in theory, make the whole human layer redundant sometime in the future...
can I join the beta testing
hi all,
can I join the beta testing?
And how do I do it?
Sign Up In The Aira App
Like the subject says, sign up for the beta in the Aira app. I signed up when the option first appeared in the app. I was chosen as a beta tester a couple of weeks ago.
if I don't have the app
if I don't have the app,
can I sign up via a web page or something?
If you don't have the app
You cannot use the feature anyway.
Ming, not sure this is available where you live
Ming, it's only available in the U.S., Canada (I think?), Australia and New Zealand.
I see!
ok! got it!
so... I will stay on Be My AI then
Detailed capture advantages?
Does anyone know what the advantage of this is for someone who is completely blind? I mean, obviously someone else taking a photo for you, but that isn’t really the point, at least in my mind.
I wonder
Why does Be My AI not offer a camera switch feature? Also, when is an app going to offer AI-assisted photography services to totally blind folks?
@Ming
You can sign up for the app from anywhere in the world using your email, but you can access the visual interpreter services only in the four countries mentioned.
Well...
Question: doesn't the Be My Eyes app offer a ‘call a volunteer’ feature? I mean, wouldn't this technically work the same as having a visual interpreter? One thing that I would like to be able to do in the future is to upload more than one image in a chat without clearing the entire thing. Sometimes I have two or more images that I would like to compare, so that would be convenient to have directly in the application.
Advantages Of Detailed Capture
I am totally blind and think there are several advantages to the detailed capture screen. I use it all the time.
1. You can choose the camera that you want to use.
2. You have control over the flash.
3. You have access to the viewfinder.
Having access to the viewfinder is probably the biggest advantage in my opinion. It allows you to make sure you are getting the photo you want. You have that extra assurance that what you want to be in the photo will be there. I hope that Be My AI finds a way to add this feature to their app.
current bugs & feature deficiencies in Access AI vs. Be My AI
I have submitted all of the bug reports below to support@AIRA.io, asking their developer team to fix these in the next AIRA app update on iOS. I’m on iOS.
1. Description of an image from a web page through the share sheet’s ‘describe with Access AI’ action is currently broken, as all it does is open the AIRA app on the home tab and nothing further.
2. Description of a screen capture is similarly broken, with the same result of the AIRA app opening on the home tab and nothing further.
3. VoiceOver focus seems to land elsewhere on the Access AI tab after double-tapping the send button when typing a follow-up question to an image description from quick or detailed capture, requiring the user to swipe around to land VoiceOver focus on the AI’s response to the follow-up question. Highly inefficient and should be considered a bug.
4. No Siri shortcuts! Be My AI has its own ‘Ask Be My Eyes’ shortcut, as well as a closed-beta tester’s DIY ‘Virtual Assistant’ shortcut. Both of these Siri shortcuts allow the user to ask any question, such as ‘describe in detail any plants visible in the image’, and then all subsequent actions are performed automatically: the Be My Eyes app opens on the Be My AI tab, the camera is activated, and the AI responds describing only what the user asked for in the original question. I’ve asked Aira support to convey my request to have this implemented in the next AIRA update to their developer team.
Does this work with AIRA on other hardware?
I only have it on iPhone at the moment. But with it coming to the ARx AI Wearable Camera and the Ray-Ban Meta smart glasses, it would be good to know.