Description of App
Seeing AI is a free app that narrates the world around you. Designed with and for the blind and low vision community, this ongoing research project harnesses the power of AI to open up the visual world by describing nearby people, text and objects.
Seeing AI provides tools to assist with a variety of daily tasks:
• Short Text - Speaks text as soon as it appears in front of the camera.
• Documents - Provides audio guidance to capture a printed page, and recognizes the text, along with its original formatting.
• Products - Scans barcodes, using audio beeps to guide you; hear the name, and package information when available.
• People - Saves people’s faces so you can recognize them, and get an estimate of their age, gender, and expression.
• Currency - Recognizes currency notes.
• Scenes - Hear an overall description of the scene captured. Explore the photo by moving your finger over the screen to hear the location of different objects.
• World - An Audio Augmented Reality experience to explore an unfamiliar environment, including hearing objects announced around you with Spatial Audio (requires a device with a LiDAR, and iOS 14+).
• Indoor Navigation - Available on the World Channel, enables you to create routes through a building, like "entrance to classroom", and navigate by following the sound (requires a device with an A9 or later processor, and iOS 14+).
• Colors - Identifies colors.
• Handwriting - Reads handwritten text like in greeting cards (available in a subset of languages).
• Light - Generates an audible tone corresponding to the brightness in the surroundings.
• Images in other apps - Just tap “Share” and “Recognize with Seeing AI” to describe images from Mail, Photos, Twitter, and more.
• Browse Photos - Hear descriptions of photos saved on your device.
Seeing AI continues to evolve as we hear from the community, and AI research advances.
Check out tutorials with this YouTube playlist: http://aka.ms/SeeingAIPlaylist.
Questions, feedback or feature requests? Email us at SeeingAI@Microsoft.com.
Comments
moving around
I noticed that when I try to go to color and flick, I get a pop-up that provides info about different items. It is hard to move to the different items such as money, doc and color unless you do it very, very fast.
I guess those are the
I guess those are the different tutorials for each mode. They will pop up only the first time.
Onboarding tips appear only the first time
Hi Holger, as Raul mentioned, the onboarding tips only appear the first time. So it should be smooth to switch between channels after the first use.
What's New in Version 3.0
Version 3.0
• Explore photos by touch: This new feature allows you to explore how objects in a photo are arranged. Select "Explore Photo" from the Scene channel, photo browser, or when recognizing photos from other apps. Then, move your finger over the screen to hear where objects are located.
• All new iPad support.
• Get faster access to your favorite features by customizing the order in which channels are shown.
• On the Person channel, you can now teach Seeing AI to recognize someone new, directly from the main screen.
• When recognizing photos from other apps, you will now hear the processing sound as in the main app.
Sounds like this could be a very interesting update. I am eager to hear people's opinions about the new version.
Just Updated
Looks like my iPhone just updated this app automatically. Looks pretty cool. My sighted neighbor across the hall has an iPad, and we always hang out together. I've been talking to him about all the tech I use, so I think I'll tell him about this update to Seeing AI so that he can check it out if he so chooses. This app seems to be getting better all the time!
More Practice Today
Hi everyone. I just wanted to say that I practiced with this app some more just a bit ago. I got it to successfully scan and read the labels on some food items in one of my cabinets. It seemed to do better on boxes and canned items than on the rice packets that I have. I was able to tap on "More Info" and it read off the box instructions on a box of long-grain rice with chicken and jalapeno. I also got something in Spanish when scanning another much smaller packet, but my Spanish is rather rusty so I don't know what it said. But this app is indeed getting better all the time. I also like the sharing options that are available.
Thank you Microsoft!!!!
Hello everyone. I have been using this app since day 1 and there is nothing like it. I switched to Android a few months ago, and this is one app I dearly missed there. So when I got the chance I got an iPhone 11, and Seeing AI is amazing on this thing. Plus it works very well in the dark. I was still able to read the screen on a car stereo thanks to Seeing AI.
Artificial Intelligence
I opened this app for the first time in a few weeks and it seems it now has a load of AI type things built in, presumably from Seeing AI. For example, in Scene view I can take a photo, it tells me something very brief, and I can go to More Info and then get a Seeing-AI style description. I couldn't find out how to ask more questions. I can't remember what else it said it had added AI to as I somewhat dismissed the starting banner a bit quickly. I will investigate further time/memory permitting.
Anyone else played with these new features?
I’ve given it a go.
The accuracy isn’t anywhere close to Be My AI yet but it’s great there is competition. The more options there are, the quicker we’ll hopefully have a FaceTime call with an AI which is what I’m excited for. An AI with access to my camera, location data, mapping and a conversational interface. It’s coming but who knows when.
Yes.
An AI with map access would be amazing. It could really help us.
Follow-up questions
From the documentation that I've read, you can't yet ask follow-up questions about pictures. You can, however, ask follow-up questions about a document scan. For example, if you scan a menu, you can presumably ask "tell me about the vegetarian options".
Surprising that the app doesn't have the same capabilities as Be My Eyes, since Microsoft uses OpenAI's ChatGPT. Maybe they are just rolling out features slowly.
--Pete
Very Cool...
I just updated and took a picture of a scene. It showed my bookshelf and said it also saw a blue yoga mat. A neighbor previously informed me that my yoga mat was blue, but it was good to get that verification. All these OCR options that are coming out really make for some fun and inclusive times!
AI Precious Moments
I am so grateful for apps like Seeing AI and Be My AI, and the future of AI in general. Last year I had to retire my Seeing Eye Dog. He is getting old, and has developed acute arthritis in his hips.
Without giving too much personal info, I live in a high-rise apartment building, and unfortunately he could no longer do stairs. Of course we have elevators, but I like stairs.
I'm dirty like that.
Anyway, I have a number of photos of him on my iPhone, dating all the way back to 2016, and now I am finally able to get some really good detail from said photos, ya know, whenever I feel like reminiscing.
I still get to visit my dog from time to time; I adopted him out to a family with 3 teenaged girls, who spoil him rotten. So I am looking forward to new photos as well. 😁
Now Available On Android!
https://accessibleandroid.com/app/seeing-ai/
or for those who wish to grab it directly:
https://play.google.com/store/apps/details?id=com.microsoft.seeingai
In short, it acts similarly to its iOS counterpart, though instead of using custom actions, Microsoft goes the tabs route on the main screen (or switch between them via two-finger swipe left/right). Though someone could request custom actions via the 'Feedback' option found in the menu in the top left corner of the window I suppose, where ya can also find Settings.
All elements labeled, all features available (that aren't LiDAR exclusive). You are also able to access features directly via the context menu from the home screen/app drawer, or by assigning said shortcut(s) to a button/fingerprint action depending on the OS software used (such as Routines Plus via Samsung One UI among the Good Lock suite of modules, or MacroDroid for any other phone).
If you're wanting to use Seeing AI on an image that's in focus, long-press on said image and choose the 'Share' item (or Share via). If the Seeing AI option is not in the main share sheet area, tap More > Edit and add Seeing AI to said space.
Enjoy!
That's amazing.
I honestly never thought it would be ported over but that's great!
I'm an iPhone user, but there are so many apps on Android that are on iOS these days that if I wanted to, I could probably switch with no issues.
iOS apps & Android apps
Today I had the privilege of playing around with a friend's Pixel 8 Pro. I won't go into too much detail, because the overlords of AppleVis would smite me. 😱
However, I will say that, while it is pretty cool Seeing AI is on the Google Play Store, it is even cooler that as of Dec 4, Be My Eyes has started to go live with Be My AI for Android. 🤠
//Excerpt from an email from the CEO of Be My Eyes
Hi Brian,
We are thrilled to announce that Be My AI will soon be available to hundreds of thousands of Be My Eyes users on Android devices worldwide!
Starting from December 4th, Be My AI will be available for hundreds of thousands of Be My Eyes users worldwide. The full roll-out will take a few days, so be sure to keep your app updated so you will have access to Be My AI as soon as it is available to you.
//End excerpt
Re: Very Cool!
Though I've never used Android before, I have some friends who are on that platform. They are all fully sighted, but I think they could benefit from this. I'm thinking of one friend in particular, who might not actually use Android but I'm pretty sure he does. I'm referring to my personal assistant, who up until rather recently was my life-skills tutor. He's in a similar role now, but that's not the point of my reply. If he gets his hands on this, he'll no doubt get more of a taste of what it's like to be blind. Make no mistake: he's been great over the years. But he's not that great with technology and has even admitted that.
Spotlight with Saqib from Seeing AI by Microsoft
From episode 157 of the Blind Android Users Podcast (includes Android- vs iOS-specific app questions discussed among said crew mates):
https://www.youtube.com/watch?v=DlQcCc3TwL8
Seeing AI for Android feedback can be sent to seeingai@microsoft.com, or via the feedback item within said application.
GPT-4o image descriptions
It would be really nice if Seeing AI was updated with the GPT-4o image descriptions, namely that they are much faster. As it is now, sometimes AI descriptions even time out, which is a shame. It would be nice if asking follow-up questions were supported in more languages as well.
Sharing Broken in Latest Version?
Hi again. I'm wondering if anyone else has noticed that the sharing functionality seems to be broken in the latest version of this app. Just a bit ago I attempted to scan a bottle of organic chocolate milk given to me by a neighbor. Another neighbor helped me locate the bar code, and then the app did the rest. I was able to read the ingredients, nutritional info and an allergy warning. However, it seems I can no longer share this info via any of the methods given in the share sheet. Be My Eyes seems to have this same functionality now, and it works great.
New version
New version released today with, among other things, video descriptions, PDF descriptions and faster load times when recognizing with AI. Haven't had time to play with the new features yet but nice to see that this app is still getting updates.
Sweet!
I, too, am very glad this app still receives updates. 😎
Video Descriptions
Thanks for the kind words. This is Saqib from the Seeing AI team, and we're very much still working on pushing the boundaries of what's possible with Seeing AI. Love to hear how people are finding the video descriptions, and you can always reach out by emailing SeeingAI@Microsoft.com. And we keep going - plenty of exciting experiments in the lab! 😊
A few sample descriptions are in this tweet: https://twitter.com/saqibs/status/1852064334903677328
Re: Video Descriptions
Hi Saqib,
First things first, hats off to the Seeing AI team for bringing us video description and PDF recognition.
Thank you for bringing this wonderful feature to Seeing AI.
It is working beautifully. I would like to know more about the video description feature. The quick help states that ten video descriptions can be generated per day. Is there any limit on the duration of the video?
My observations regarding the video description:
I shared a twenty-second video to Seeing AI and I was amazed to hear the video description. The feature of first describing the video followed by playing that portion of the video clip enables us to quickly relate to that video and understand it.
Generating the description of a twenty-second clip took nearly four minutes, even though I uploaded the video over a 330 Mbps connection. Is this normal?
I would also request the following features to enhance the experience in the video description feature.
1. An option to add the description to the caption of the video.
2. Read the description line by line with the screen reader.
3. Save/share the video with description.
4. Ask for more detailed information about a particular scene in the video.
5. Ability to ask a question before uploading the video, as many a time we need only a specific piece of info from that video instead of the entire video description.
Once again, I would like to express my appreciation for the tremendous efforts of Microsoft and the Seeing AI team in actively maintaining and enhancing the app with new features and, to top it all, making it freely available to the visually impaired community.
I just tried the video-description feature on a photo in my libr
I just tried out the video description feature on a photo from my library, but it timed out and said to try back later. It also stated something to the effect of this video description will take some time to upload and that the quota will be reset at midnight. I'm going to do what it said and try back later, but are we eventually going to be charged a nominal fee for the descriptions that are generated? In any case, this is a cool feature. Thank you so much for keeping this app alive, and best of luck in the future.
Well...it is now later in the day and I just tried out a video. I don't think it was the one I originally landed on, but oh well. The video played, and even had music. Quite impressive! I think that was perhaps my first experience with TTS audio description, and I enjoyed it.
video description
The maximum length it'll describe is five minutes. I just tried importing a 10-minute video, and it told me the maximum duration is five minutes. For those who haven't tried this, check out the Double Tap demonstration of this; it's amazing. It's in the latest episode, but I can't find a link to it now.
I played around with this earlier today and
I played around with this earlier today and it is amazing. I was able to go back and look at videos from my wedding, which meant a lot to me, especially now, since my wife, unfortunately, has terminal cancer. The one thing I like very much is the fact that the audio descriptions don’t really interfere with the audio itself.
Re: video description
Hi Tara, thank you for the update regarding the maximum duration of the video.
Re: video description
I'm really impressed with this. I tried a video of my dogs where one was happily running about honking a sprout whilst the other one tried to video bomb me. The thing I liked about it was that I got all the noise of what was going on and short but useful comments describing the bits I couldn't see. It was much closer to audio description than anything I've tried before.
I sent the same video to PiccyBot and it gave me much fuller descriptions of what was happening, but it was much more like someone reporting back to me about what was in the video rather than me watching the video itself.
I think the Seeing AI approach works best when the video has noise of its own that is as important as the visuals. Whereas the pixies are likely to give me something more useful when the audio is otherwise less interesting. But I don't think one approach is necessarily better or worse than the other.
What is amazing is that we have two totally different perspectives on the same thing. I really appreciate the efforts of both devs for giving us these invaluable tools.
And to echo Firefly's comments, they do unlock some really powerful personal things that have previously been unattainable by us. I'm really sorry to hear about your wife, Firefly - I'm sure I'm not the only one sending thoughts and prayers your way.
Careful, though.
I just read in a Facebook comment that videos may not exceed five minutes and must be in MP4 format; otherwise it simply won't work. I haven't tried it myself, but I would like clarification on it.
Length and format
Yes, those limitations are there; plus, there's a limit of 10 videos per person, per day, and as of now it takes like 4-5 mins for even a short video of like 20 seconds to get processed (I haven't tried longer ones so can't say). Having said that, it's a start, and an impressive one at that. And things are only going to get better from here I guess...
Nice stuff
Unfortunately, the video descriptions are only in English for now.
Cat Video
It did well, and it's nice that it somehow pauses the video to speak.
I ended up going to Photos to find the video of my last cat eating its food a few years ago, then sharing it to Seeing AI. I couldn't remember which video it was, and VoiceOver describes the first frame when swiping through the Recent album. Browsing from Seeing AI gives me the dates.
The sharing process seems to open Safari and upload the video to a site. Just reporting how it worked for me.
I don't keep my bird videos on my phone, but I plan to watch a few of those. This could help to figure out if I've got what I want with some of these because I remember once the birds knocked the camera and tripod out of alignment while fighting over seeds, and all the action was off to one side of the frame. It would have been a really good video otherwise, and it took having someone view the video to figure out what happened. Pesky birds!
A small suggestion
Timeouts shouldn't exist, unless the user clicks the cancel button themselves. Timeouts only interfere with video uploading, and not everyone has fast Internet speeds.
So I was under the impression that there is a 10 video a day lim
So I was under the impression that when describing videos you can only do 10 a day, which is fine with me. But I described a video today, and when I went to describe another one, it said I had already reached my limit, even though I had only done one video today. Is this a bug? Or has the limit been further restricted?