Seeing AI has got more AI

By Dave Nason, 14 October, 2023

Member of the AppleVis Editorial Team

Forum
iOS and iPadOS

So I got an update to Seeing AI in the App Store this morning, and it seems that some of the ChatGPT smarts have now come to the app. I was wondering when this would happen.
Firstly, the Scene detection mode now kind of works like Be My AI, although not as well, yet anyway, in my opinion. When you take the picture, it initially still gives you the not very detailed description we had previously. This is almost instantaneous. You then need to select a More Info button in order to get the new detailed AI description. Unlike Be My AI, however, it seems you cannot ask any follow-up questions, which makes it far less useful than Be My AI. Surely this will be added at some point. On the plus side, you can browse your photo library.
The other enhancement is in the Document scanner, where they have now added the ability to ask it questions about the contents of what you scanned, which could be good.
Interested to hear what others think of it.
Dave

Comments

By Laszlo on Friday, October 27, 2023 - 04:15

As I am Hungarian, of course my phone is set to Hungarian and has the region set to Hungary. This update was also displayed for me today, and I was happy to see the feature list of this quite significant update, which had already been translated to Hungarian and was shown in the App Store like that. However, when I opened the updated app, a dialog appeared with the recent changes (of course in Hungarian), but it mentioned only the multipage possibility with the "documents" channel, the bringing of the "scene" channel more to the front of the channels list, and the bug fixes under the hood. There was no mention of the new, far more detailed image descriptions, nor the follow-up questions possibility for the "documents" channel. When I took the updated app for a test drive, I could find no "more info" button equivalent for the images. I got only the recognised text and the short description, and there were the "explore photo" and "close" buttons, which worked exactly as before, nothing more. Even the "scene" channel did not change its place in the channels list; it was exactly where it had been before the update. I haven't tried the "documents" channel yet, but I suppose there won't be any questions option for me for now. Setting the recognition language to English (the only language option in the settings menu of Seeing AI) did not change anything in this respect. So all in all, in Hungarian the app looks and behaves pretty much the same as before the update. I closed and reopened the app etc., but it didn't change.
Edit: I confirmed this with a simple method. I set the language of my phone to English, and the "More info" button and with that the new rich descriptions were available at once. As "phone language" is a general setting, so it affects all apps, this is far from an ideal solution in my case, but I could at least taste the new rich descriptions. I've contacted support about this.
Edit 2: Seeing AI support has just confirmed that the new AI features are only available in English for now. More languages may be added later.
Remark: this tells me the new AI features of Seeing AI are not hooked up to a GPT-family language model (as Be My AI is), but to some other type, because if it were, then it would "speak" Hungarian very well - as Be My AI does. The applied language constructs and wording also tell me that Seeing AI applies some other type of language model, different from Be My AI.

By Brad on Friday, October 27, 2023 - 04:15

This is only going to get better and better over the years :)

Kids will grow up thinking this is normal and that's amazing for them.

By Andy Lane on Friday, October 27, 2023 - 04:15

Try it out. You take a photo, then Seeing AI labels things it recognises in the photo when you hit the explore photo button. You can then move your finger around the screen and VO will tell you when you find an item it's labelled. It’s incredibly useful for getting an idea of what's around you and where things are in relation to each other. I can see myself using this a lot. Thanks so much for this great feature MS.

By Dave Nason on Friday, October 27, 2023 - 04:15

Member of the AppleVis Editorial Team

Yep the explore thing is not new. It’s been in the app for quite a while I think. And iOS also has a native feature for this.
Dave

By OldBear on Friday, October 27, 2023 - 04:15

Yes, it's been there for a long time. I tried it once on my iPhone 7, and it not only didn't do anything, but it got stuck, or VO got stuck. I can't remember exactly, but I never tried it again.
Sounds like the Chat GPT stuff is going to be there when you recognize a photo with Seeing AI through the share sheet thing, like in the camera app. If the Explore thing works without crashing VO, that will be nice for checking that I got the shot I want.

By Michael on Friday, October 27, 2023 - 04:15

What is interesting here is that the AI "More info" in Seeing AI is nicely describing photos that Be My AI would deem inappropriate or against community guidelines.

By Andy Lane on Friday, October 27, 2023 - 04:15

Hi, that's great. Would you be able to give examples of what Seeing AI will describe, please? It may be a good way to advocate with OpenAI about Be My AI’s current restrictions if there's a difference between one and the other.

By miguel3025 on Friday, October 27, 2023 - 04:15

Hello.
I've been trying out this new feature, and I think that Be My AI is much better, mainly because Seeing AI hallucinates a lot, like saying I have dogs and cats in the room when I don't actually have any pets.
The only positive point of this compared to Be My AI is that it's sometimes much faster at providing results.
Best regards.

By Andy Lane on Friday, October 27, 2023 - 04:15

It’s great, there’s lots of competition, but this one just isn’t up to it yet. Hopefully with more development though.

By emassey on Friday, October 27, 2023 - 04:15

I have been experimenting with the new AI in the scene detection mode, and I have found that Be My AI is better at describing the scene as a whole and how things in the image relate to each other, and at extrapolating further information not explicitly shown, but Seeing AI sometimes gives more details than Be My AI does before asking it any questions. The processing time after pressing the "More info" button in Seeing AI seems about the same as the processing time for Be My AI, if not longer. Also, Seeing AI seems to be using its existing OCR technology as a part of this new AI, since often when it says there's text in the image, the text contains the same extraneous punctuation and other mistakes that the OCR result contains. I think it might be partly using its existing captioning technology as well, since I often notice that phrases in the longer description are exactly the same as those in the short caption, sometimes with no added detail. However, I think it's very likely that it uses GPT-4 to interpret the image for at least part of its analysis, since it does often have details not present in the short caption, although I could be wrong about this.

By emassey on Friday, October 27, 2023 - 04:15

I did a few experiments to find out if this feature has similar content restrictions to those in Be My AI, and the results were very promising but also disappointing. To test this more comprehensively, I played a few videos containing adult content and took screenshots at different points, then shared them with Seeing AI.

Be My AI always told me that it could not describe the image, saying that it may contain things blocked by its safety system, even for images that only seem a little explicit. Seeing AI never gave me a message saying it couldn't describe the image, and it actually described many of the images to the same level of accuracy it describes non-explicit images. However, it never describes the sexual aspect of the image or makes inferences that involve adult content. For example, it will often describe the outfit someone is wearing, sometimes in a pretty detailed manner, but it will never describe what the usually covered parts of someone's body look like, not even saying that certain parts are not covered. Also, it described certain objects in terms of their shape and color without inferring what they actually are. One time it also described a sexually explicit action as something completely different, but something that might look a little similar, and it stripped the sexual context from other actions. Furthermore, when the images became a lot more sexually explicit, the app would make the processing sound for a similar amount of time, but after that the results screen would only have the "Back" and "Share" buttons with no description underneath, and when I tried to tap the screen in different places or use "Vertical navigation" in the rotor, VoiceOver would become unresponsive for a while until focus jumped back to one of those buttons or the status bar. I'm thinking that the AI is rejecting these images and there's a bug that causes this behavior instead of showing a message.

I hope this was useful and didn't offend anyone! It seems that Seeing AI doesn't have the safety filter that Be My AI does, or at least doesn't apply it to this kind of content, but the AI still has limitations on what it can describe, perhaps because GPT-4 itself was trained not to interpret such content. I think this is still a step in the right direction though!

By emassey on Friday, October 27, 2023 - 04:15

I forgot to mention that sometimes it processes for a while and then says "Sorry, the request timed out." I'm guessing this is because a lot of people are using it, but right now I'm getting this error every time, so hopefully they can increase the computing power or the bandwidth devoted to this so more people are able to use it at the same time.

By SSWFTW on Friday, October 27, 2023 - 04:15

I absolutely love and am amazed by this feature. I am wondering, is there a similarly accessible application that will let me upload a document and make queries about the text? This would allow me to search through much larger documents rather than having to scan them by hand first.

By Brad on Friday, October 27, 2023 - 04:15

I think you can copy a document into ChatGPT 4 and have it analyse the text for you, but you're going to have to pay for that, and I don't know how many tokens you get or how many tokens one word counts as.

By Brad on Friday, October 27, 2023 - 04:15

If you can, open the app and point the camera at some text, and it should start reading.

By Brad on Friday, October 27, 2023 - 04:15

There's a picker if you flick to the right a couple of times. If you flick up you'll get to the document reader, and if you keep flicking you'll get to the different options.

It really is a great tool if you can use it.

By Brian on Friday, October 27, 2023 - 04:15

Seeing AI used to have a Handwriting feature. Have not played around with that in a while, so not sure if it has improved any.

By kool_turk on Friday, October 27, 2023 - 04:15

There will hopefully be a device that you should be able to control entirely with your voice, but it won't be a phone.

Hopefully we'll know more on November 9.

I only hope it isn't a letdown, because they're really hyping this thing up.

By Ekaj on Friday, October 27, 2023 - 04:15

Hi all. I've used most of these scanning apps with mixed results, but am super impressed with them. It seems that when I lived on the top floor in this building, I was less successful with my scans. I'm told this is because the lighting up there wasn't that great, but perhaps that changed. Of course I wasn't using Be My Eyes at the time, so that might make a difference. Having said all that, it seems I've had a bit more success in my current apartment. Just yesterday I attempted to scan a thing of deodorant, because I found a sticker on there and wanted to see if Be My Eyes and/or Seeing AI could detect it. As it turned out both apps were able to perform a successful scan of everything in my bathroom but the deodorant. I think these apps are amazing, and cannot wait to see what comes next. Granted I'm still on the iPhone 7, but am asking for a new phone for Christmas. My sincere and heartfelt thank you to the developers of these apps for bringing us so much.

By Brian on Friday, October 27, 2023 - 04:15

Seeing AI has the ability to read aloud handwritten text. Unless this recent update removed it, of course. You have to adjust the focus control at the bottom of the screen until you hear "Handwriting".

HTH.