Reading with AI glasses, more than OCR!

By Unregistered User (not verified), 29 March, 2024


How will we read books in the future? If there are books in the future.

I don’t know about you, but I’m feeling 22! Oops, wrong site.

I don’t know about you, but I use Seeing AI for quickly ‘glancing’ at things with the ‘short text’ channel. I also use it to read documents, but never more than half-a-dozen pages or so. I haven’t scanned a book since 2012, when I said goodbye to my old friend, the ‘book edge’ scanner.

So what if the camera is on my face? I would want to be able to glance at things, read letters, etc. Would I want to read a book? Not just the blurb or the first page or two. Would I want to sit on the beach and read the latest blockbuster?

I think I might, you know. Or at least it would be nice to know I could if I wanted to.

But this makes the question of how you tell your glasses what you are reading more complicated. Or does it? Your glasses are just sending images to an AI, an AI that is clever enough to turn images of text into spoken words. If it is clever enough to do this, should it be clever enough to recognize you are reading a book? And if it knows you are reading a book, should it be able to see that you have turned the page? These are all questions I am asking myself as I think about the idea of having literal reading glasses… and they are questions I hope the people designing these systems are asking themselves, or their users.

What do you think? How smart do you want your smart glasses to be? If the AI sees you are reading something it can get online, should it tell you to ‘chill’ and just read the online version? Let me know what you think in the comments.

This is my first AI for Accessibility scouting report; let me know if you want more.


Comments

By Tara on Sunday, March 31, 2024 - 16:40

Hi Lottie,
Reading printed material with minimal effort would be cool: no flatbed scanner, and no positioning your phone's camera to make sure it captures all of the page. This is why I never scan printed books if I can help it. These days, if I can't get a book in electronic format, I don't bother. I had to scan one print book for my job, and it was a nightmare because it ate up so much of my time, even with a flatbed scanner. The book was about 800 pages long, and at the time I couldn't get an online version; it wasn't available on Kindle at all when I first needed it. But about 700 pages into the scanning process, I checked Kindle books and there it was. If only I had waited, but I needed the book urgently. Flatbed scanners are pretty good; it's just that an eBook is so much more convenient. This is why I'm interested in the Seleste glasses. I have some PDF image documents, and it would be nice to read them with Seleste as opposed to battling with OCR, or having to send every page to ChatGPT. ChatGPT or Be My AI usually give good results.
Tara.

By OldBear on Sunday, March 31, 2024 - 16:40

That was a tormented time of my life, and I didn't completely know it at the time.
I know sighted people who need to use reading glasses. Putting on a pair of glasses to read is socially normal, and doesn't cause whispers and hushed, speculative gossip about what that person is putting on their face and doing.
I would want the AI not to be fussy about whether the page of the book is aligned and all corners of the page are visible before it takes its picture. Shut up and zoom out a little. Telling me the book is upside down would be socially useful, but refusing to read anything because of it is not.
I use the short text channel of Seeing AI a lot with mail and labels while holding both the phone and the item being looked at; however, it can be difficult at times to hold everything steady enough to get sensible information. I end up pulling the phone away, the equivalent of a glance, so that the AI doesn't interrupt itself over a slight movement. I can only imagine the same feature on glasses would be jabbering at the slightest head movement. So whatever it takes to deal with that; image stabilization, maybe.
But wouldn't the experience to aspire to be a reading experience that's better than having to root around for glasses and then put them on to read?

By OldBear on Sunday, March 31, 2024 - 16:40

I think one of my points, if I had any, was that people's actual eyes make automatic, physical adjustments for head movements, or for any movement in the text within reason, and the AI needs to do the same in software. Thinking about it, people's retinas take many pictures as they refresh their chemicals, and the brain's software combines them smoothly while filtering out all the blind spots and internal eye obstructions.
I actually do wear a type of glasses a lot. The safety glasses that protect against flying particles. I forget they are on after a while because they are lightweight. That's why I grab them most times, instead of goggles, which offer way, way more protection, or the liftable face shield that can get in the way.
Apple would prefer we use their extremely expensive goggles right now, that have to be gingerly picked up, lest they fall apart, and have a battery the size of an iPhone. Just you try to forget those are on your face!

By Travis Roth on Sunday, March 31, 2024 - 16:40

This is not my original idea. Someone once suggested it, and it works well for me: when possible, I set the iPhone down and hold the document or object above the phone for Seeing AI to scan. In theory it should make no difference, as my hand is no more or less steady, but something to do with the camera being more stable, or possibly better lighting in this configuration, usually gets me better results.

By Andy Lane on Sunday, March 31, 2024 - 16:40

That's a great idea, and I wish I'd thought of it. It makes so much sense. When you hold the camera, you are jiggling it about all three axes as well as moving it slightly in all three directions; even your heartbeat makes slight movements. To see why this is a problem, imagine holding a very, very lightweight 50-foot pole. A slight movement made by your hand will move the other end of the pole a huge distance, so if you were trying to write your name on a piece of paper at that distance, it would be impossible, because the distance amplifies movement. But if you went to the other end of the pole, where the pen is attached, and moved the paper instead, you'd have much, much finer control, because the movements aren't amplified across that distance. The same applies to the camera's view. I'm going to try that. Thanks.
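The amplification described here is just arc length: the tip of the pole sweeps a distance equal to the pole's length times the angle of wobble (in radians). A quick back-of-the-envelope check, where the 0.5-degree wobble is an illustrative number of my own, not a measured one:

```python
import math

# Illustrative numbers only: a 50-foot pole with a tiny
# half-degree wobble at the hand holding it.
pole_length_in = 50 * 12            # pole length in inches (600)
wobble_rad = math.radians(0.5)      # hand wobble as an angle in radians

# Arc length: the far tip sweeps length * angle.
tip_movement_in = pole_length_in * wobble_rad
print(f"The tip moves about {tip_movement_in:.1f} inches")  # about 5.2 inches
```

So even a half-degree tremor at the hand swings the far tip more than five inches, which is why moving the paper (or the document under a resting camera) gives so much finer control.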

By Andy Lane on Sunday, March 31, 2024 - 16:40

A way to control for poor-quality scans would be to take multiple photos. This would have to be automated, but if there were, say, 5-10 photos taken as your hand drifts around the page, it's likely that different parts would come out clearer in different shots. Merging those OCR results seems trivial now that we have AI. Maybe Seeing AI or someone will build this into their scanning modes. It should work really well.
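As a toy illustration of that merging idea, here is a minimal majority-vote sketch in Python. It assumes every OCR pass happens to split into the same number of words, which real OCR output would not guarantee; a production version would need proper sequence alignment first (and the sample strings below are made up for the demo).

```python
from collections import Counter

def merge_ocr_passes(passes):
    """Pick, for each word position, the reading most passes agree on.

    Toy assumption: every pass tokenizes to the same word count.
    """
    tokenized = [p.split() for p in passes]
    merged = []
    for column in zip(*tokenized):
        # The correct reading only has to win in a majority of shots.
        word, _count = Counter(column).most_common(1)[0]
        merged.append(word)
    return " ".join(merged)

# Three noisy readings of the same line; each error is outvoted 2-to-1.
passes = [
    "the quick brown f0x jumps",
    "the qu1ck brown fox jumps",
    "the quick brovvn fox jumps",
]
print(merge_ocr_passes(passes))  # the quick brown fox jumps
```

The point is that no single photo has to be perfect; errors only survive if a majority of the shots make the same mistake in the same place.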

By mr grieves on Sunday, March 31, 2024 - 16:40

I think for most books I'd either go with an audio book or not bother at all. At least for anything I'd read in my leisure.

However, I can see that this sort of thing could be good in some cases: either if a book is never going to be available electronically because it's too niche, or for some more personal things. For example, old diaries, school notebooks, report cards, etc. I used to write all sorts of rubbish when I was supposed to be learning at school. I guess I could send them to Stephen Fry and see if he is interested, but I'm guessing not. I also think it could come into its own at some point with photos. For example, many years ago my Mum gave me a kind of "this is your life" book filled with photos of my life up to that point. It's of no use to me now, but I can maybe see that sometime in the future I could flip through it and have something interpret the photos in a meaningful way.

I think AI may be most useful when it's doing more than OCR and bringing a few things together. Another example: I used to read lots of technical books, as I'm a software developer. But since going blind I absolutely hate it and have hardly done any of that sort of thing. I guess better voices would help, but maybe also giving me a little more control over how I consume the info. For example, maybe that means being able to read code in a better way, so not necessarily "left bracket right bracket equals greater than" etc., etc., which very quickly seizes up a certain part of my brain. We could possibly get very customised readings of books, so it knows that I like code to be read out like this, but the rest of the text read out in a different way. Or being able to summarise, or help me refer back and forwards, but without a horribly complicated UI.

Sorry, I have gone off on a slight tangent. But I would think that if OCR were quick and efficient enough, glasses would be a much more pleasant experience than using the phone. However, there are apps that can scan pages in batches. I think the awkwardly named VD Scan was good at that, but I am probably too lazy to sit in front of a book taking photos of it with my phone. I'm also worried my wife will find an app called VD Scan on my phone and worry about my extracurricular activities!

By Brad on Sunday, March 31, 2024 - 16:40

I think Lottie is talking about the stuff on graphicaudio.net, but then again I've heard of graphic novels outside of that site, so I'm not sure.

By Tom on Sunday, March 31, 2024 - 16:40

I have a scanner which I use maybe once a year, because I can find most things electronically. However, there is one use case where I use my phone, though it is rather inefficient: reading printed materials outside of the house. I think glasses could make that more efficient.

For that matter, I am finding that most restaurant menus are available online, but when I go to conferences, it would be extremely useful to read a booth sign, or pick up a brochure and read it.

Another scenario is using a library. No matter how many books are available electronically, I think visually it is more efficient to go through a shelf, read titles, read the table of contents of books you are interested in, and find information anywhere in the book. Kindle only lets me download the beginning of the book, many books are not available in languages other than English, and the library has a topical organization that no online store could replicate, which can be useful for research.

So, would I change my reading habits? Probably not, but I would add other ways of obtaining information from books that I didn't have before.

I'm still trying to do this with my phone when it is absolutely necessary, but having an extra free hand could make a huge difference.

By mr grieves on Sunday, March 31, 2024 - 16:40

Thanks. Those production values seem very good. I will have to explore some more.