Something different is coming: a progressive web app for you all.

By Stephen, 12 November, 2025


Something Different is Coming
A Progressive Web App for Blind and Visually Impaired Users | Works on All Smartphones

I need to tell you about something I built. Not because it's "the best" or "revolutionary" – everyone says that. But because it works in a way that genuinely surprised me when I tested it.

The Problem I Kept Running Into
You know the drill with most vision AI apps:
Point your camera → AI speaks a sentence → that's it.
"It's a living room with a couch and a table."
Cool. But where's the couch exactly? What color? How far? What else is there? Can you tell me about that corner again?
You have to point again. Ask again. Wait again. Listen again.
You're always asking. The AI is always deciding what matters. You never get to just... explore.

What If Photos Worked Like Books?
Stay with me here.
When someone reads you a book, you can say "wait, go back." You can ask them to re-read that paragraph. You can spend five minutes on one page if you want. You control the pace of information.
But photos? Someone gives you one description and that's it. Take it or leave it. They decided what's important. They decided what to mention. They decided when you're done.
We thought: What if photos worked like books?
What if you could explore them at your own pace? Go back to parts that interest you? Discover details the other person missed? Spend as long as you want?

The 6×6 Grid: Your Photo, Your Exploration
Here's what we built:
Upload any photo. Any photo at all.
The AI divides it into 36 zones – a 6×6 grid covering every inch of the image.
Now drag your finger across your phone screen like you're reading a tactile graphic.
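For the technically curious, here's a rough sketch of how the touch-to-zone lookup could work in a web app like this. It's a minimal illustration under assumptions, not the actual implementation; the zoneDescriptions array and the coordinate handling are made-up names.

```typescript
// Minimal sketch (not the app's actual code): map a touch point to one of the
// 36 zones and speak that zone's description with the browser's built-in speech.
// "zoneDescriptions" is a hypothetical 6x6 array of AI-generated strings.

const GRID_SIZE = 6;

function zoneForTouch(x: number, y: number, width: number, height: number) {
  // Clamp to the last row/column so touches on the very edge still map to a zone.
  const col = Math.min(GRID_SIZE - 1, Math.floor((x / width) * GRID_SIZE));
  const row = Math.min(GRID_SIZE - 1, Math.floor((y / height) * GRID_SIZE));
  return { row, col };
}

let lastZone = "";

function onTouchMove(
  x: number, y: number, width: number, height: number,
  zoneDescriptions: string[][],
): void {
  const { row, col } = zoneForTouch(x, y, width, height);
  const key = `${row},${col}`;
  if (key !== lastZone) {
    lastZone = key;
    window.speechSynthesis.cancel(); // stop reading the previous zone
    window.speechSynthesis.speak(
      new SpeechSynthesisUtterance(zoneDescriptions[row][col]),
    );
  }
}
```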
What This Actually Feels Like:
You're exploring a photo of your living room:
Start in the top-left corner – drag your finger there:
"Smooth cream-colored wall with matte finish, cool to imagine touching, painted evenly"
Slide your finger right:
"Large window with soft natural light streaming through, sheer white curtains that would feel delicate and silky between your fingers"
Down a bit:
"Polished oak coffee table, glossy surface that would feel smooth and slightly cool, rich honey-brown color"
To the left:
"Plush beige carpet, deep pile that looks like it would feel soft and springy underfoot, slightly worn in the center from foot traffic"
Wait, go back to that window – drag back up:
"Large window with soft natural light streaming through, sheer white curtains..."
You're in control. You decide what to explore. You decide how long to spend. You decide what matters.
Go to the bottom-right corner – what's there?
"Wooden bookshelf against the wall, dark walnut finish with visible grain, would feel smooth with slight ridges"
Move to the zone right above it:
"Books lined up on shelf, various colored spines, some leather-bound that would feel textured and aged"
This Changes Everything
You're not being told about the photo.
You're exploring it.
You can go back to that window five times if you want. You can ignore the couch and focus on the corner. You can trace the room's perimeter. You can jump around randomly.
It's your photo. You explore it your way.
And here's the thing: the information doesn't disappear. It's not one-and-done. It stays there, explorable, for as long as you want.

Now Take That Same Idea and Put It in Physical Space
You walk into a hotel room at midnight. You're exhausted. Strange space. No idea where anything is.
Usually? You either stumble around carefully, or ask someone to walk you through, or just... deal with it till morning.
New option:
Point your camera. Capture one frame. The AI maps it into a 4×4 grid.
Now drag your finger across your screen:
• Top-left: "Window ahead 9 feet with heavy curtains"
• Slide right: "Clear wall space"
• Keep going: "Closet with sliding doors 8 feet on the right"
• Bottom-left: "Clear floor space"
• Center-bottom: "Bed directly ahead 5 feet, queen size"
• Bottom-right: "Nightstand right side 4 feet with lamp and alarm clock"
You just mapped the entire room in 30 seconds. Without taking a step. Without asking someone. Without turning on any lights.
Want to know what's on the left side again? Drag your finger back over there. Want to double-check the right? Drag there.
The information stays right there on your screen. You can reference it. You can re-explore it. You can take your time understanding the space.

The Core Difference
Most apps: Point → Wait → AI decides what to tell you → Move on → Repeat
This app: Explore → Control the pace → Discover what matters to YOU → Information persists → Return anytime
That's not a small difference. That's a fundamentally different interaction model.
You're Not a Passive Receiver
You're an active explorer.
You don't wait for the AI to decide what's important in a photo. You decide which zone to explore.
You don't lose the room layout the moment it's spoken. It stays mapped on your screen.
You don't get one chance to understand. You can explore as long as you want, go back, re-check.
This is what "accessible" should actually mean: Not just access to information, but control over how you receive and interact with it.
I have big plans to expand this feature as well.

Oh Right, It Also Does All The Normal Stuff
Because yeah, sometimes you just need quick answers.
Live Camera Scanning
Point anywhere, AI describes continuously:
• Quiet Mode: Only speaks for important stuff (people, obstacles, hazards)
• Detailed Mode: Rich ongoing descriptions
• Scans every 2-4 seconds
• Remembers what it already said (no repetition)
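For anyone curious about the mechanics, here's a minimal sketch of how a scan loop with de-duplication could look. The describeFrame function is a hypothetical stand-in for the AI call; the interval and the exact de-duplication rule are assumptions.

```typescript
// Minimal sketch of continuous scanning with de-duplication.
// "describeFrame" is a hypothetical stand-in for the call that sends the
// current camera frame to the AI and returns a description.

const SCAN_INTERVAL_MS = 3000; // somewhere in the 2-4 second range

const alreadyAnnounced = new Set<string>();

async function scanLoop(describeFrame: () => Promise<string>): Promise<void> {
  while (true) {
    const description = await describeFrame();
    // Only speak things that haven't already been announced this session.
    if (!alreadyAnnounced.has(description)) {
      alreadyAnnounced.add(description);
      window.speechSynthesis.speak(new SpeechSynthesisUtterance(description));
    }
    await new Promise((resolve) => setTimeout(resolve, SCAN_INTERVAL_MS));
  }
}
```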
Voice Questions - Just Ask
No buttons. Just speak:
• "What am I holding?"
• "What color is this shirt?"
• "Read this label"
• "Is the stove on?"
• "Describe what you see"
• "What's on my plate?"
Always listening mode – ready when you are.
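Under the hood, an always-listening mode in a web app could be wired to the browser's Web Speech API, roughly like the sketch below. answerQuestion is a hypothetical stand-in for the app's AI backend, and browser support for speech recognition varies.

```typescript
// Rough sketch of an always-listening question loop using the Web Speech API.
// "answerQuestion" is a hypothetical stand-in for the app's AI backend.
// SpeechRecognition is prefixed in some browsers and unsupported in others.

const Recognition =
  (window as any).SpeechRecognition ?? (window as any).webkitSpeechRecognition;

function startListening(answerQuestion: (q: string) => Promise<string>): void {
  const recognizer = new Recognition();
  recognizer.continuous = true;      // keep listening between questions
  recognizer.interimResults = false; // only act on final transcripts

  recognizer.onresult = async (event: any) => {
    const latest = event.results[event.results.length - 1][0].transcript.trim();
    const answer = await answerQuestion(latest); // e.g. "What am I holding?"
    window.speechSynthesis.speak(new SpeechSynthesisUtterance(answer));
  };

  recognizer.start();
}
```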
Smart Search (Alpha)
"Find my keys"
AI scans rapidly and guides you:
• "Not visible – turn camera left"
• "Turn right, scan the table"
• "FOUND! On counter, left side, about 2 feet away"
⚠️ Alpha: Still being worked on.
Face Recognition: Alpha
Save photos of people → AI announces when seen:
"I see Sarah ahead, about 8 feet away"
Totally optional. Enable only if wanted.
Object Tracking: Alpha
Tell AI to watch for items:
"Keep an eye out for my phone"
Later: "Where did you last see my phone?"
→ "On kitchen counter, 22 minutes ago"
Meal Assistance
Food positioned using clock face:
"Steak at 3 o'clock, potatoes at 9 o'clock, broccoli at 12 o'clock"
Plus descriptions: portion sizes, cooking level, colors, textures.
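The clock-face idea boils down to converting an item's position, relative to the plate's centre, into an hour hand. A small sketch of that conversion (the coordinate convention is an assumption):

```typescript
// Small sketch: convert a food item's position, relative to the plate's centre,
// into a clock-face direction. Assumes image-style coordinates (y grows downward).

function clockPosition(dx: number, dy: number): string {
  const angle = Math.atan2(dx, -dy);                  // 0 radians = straight up (12 o'clock)
  const degrees = (angle * 180 / Math.PI + 360) % 360;
  const hour = Math.round(degrees / 30) || 12;        // 30 degrees per clock hour
  return `${hour} o'clock`;
}

// Example: an item to the right of the plate's centre reads as "3 o'clock".
console.log(clockPosition(1, 0));
```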
Reading Mode: Alpha
Books and documents:
• Voice commands: "Next page", "Previous page", "Repeat", "Read left page", "Read right page"
• Speed controls: "Read faster" / "Read slower" (instant adjustment)
• "Check alignment" (ensures full page visible)
• Auto-saves progress per book
• Resume exactly where you stopped
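Saving and resuming progress per book could be as simple as keeping a page number in the browser's local storage; the key scheme below is just an illustration, not the app's actual storage format.

```typescript
// Minimal sketch of per-book resume using the browser's localStorage.
// The key scheme ("reading-progress:<bookId>") is just an illustration.

function saveProgress(bookId: string, page: number): void {
  localStorage.setItem(`reading-progress:${bookId}`, String(page));
}

function resumePage(bookId: string): number {
  return Number(localStorage.getItem(`reading-progress:${bookId}`) ?? 1);
}
```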
Social Cue Detection: Alpha
Optional feature detecting if people are:
• Making eye contact with you
• Waving or gesturing toward you
• Trying to get your attention
Fully Customizable
Pre-set profiles or build your own:
• Scanning frequency (2-4 seconds)
• Detail level (Basic / Standard / Maximum)
• Voice speed (0.5× to 2×)
• Auto-announce settings
• Feature toggles

Why This is a Web App, Not an App Store App
Honest reason: We want to ship features fast, not wait weeks for approval.
Better reason:
App stores are gatekeepers. Submit update → wait 1-2 weeks → maybe get approved → maybe get rejected for arbitrary reasons → users manually update → some users stuck on old versions for months.
Progressive Web Apps are different:
Bug discovered? Fixed within hours. Everyone has it immediately.
New feature ready? Live for everyone instantly.
AI model improved? Benefits everyone right away.
No approval process. No waiting. No gatekeepers.
Plus it works everywhere:
• iPhone ✓
• Android ✓
• Samsung ✓
• Google Pixel ✓
• Any modern smartphone ✓
Same features. Same performance. Same instant updates.
Installation takes 15 seconds:
1. Open browser
2. Visit URL
3. Tap "Add to Home Screen"
4. Appears like regular app
Done.
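For those wondering what makes a website installable like this: a web app manifest plus a registered service worker. Here's a minimal sketch of the registration step (the file name /sw.js is an assumption for illustration):

```typescript
// Minimal sketch of the service worker registration that, together with a web
// app manifest, lets the browser offer "Add to Home Screen".
// The file name "/sw.js" is an assumption for illustration.

if ("serviceWorker" in navigator) {
  navigator.serviceWorker
    .register("/sw.js")
    .then(() => console.log("Service worker registered; app is installable"))
    .catch((error) => console.error("Service worker registration failed", error));
}
```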

Privacy (The Short Version)
• Camera images analyzed and discarded – not stored
• Voice processed only during active questions
• Face recognition optional
• Data encrypted
• Delete everything anytime
Critical Safety Disclaimer:
AI makes mistakes. This is NOT a replacement for your cane, guide dog, or O&M training. Never rely on this alone for safety decisions. It's supplementary information, not primary navigation.

When Does This Launch?
Soon.
Final testing in progress.
When we officially release, you will have access to all features, even though some parts of the app and its features will still be in beta.
The Real Point of All This
For years, accessibility apps have operated on this assumption:
"Blind people need information. we'll give it to them efficiently."
Fine. But also... what if I flipped it:
"Blind people want to explore. They want control. They want information that persists. They want to discover things their way."
That's what I built.
Not "here's a sentence about your photo" but "here's 36 zones you can explore for as long as you want."
Not "here's a description of this room" but "here's a touchable map that stays on your screen."
Information that persists. Exploration you control. Interaction you direct.
That's the difference.

One Last Thing
The photo grid gives you 36 descriptions per image. Detailed, sensory, rich descriptions.
So when it comes out, watch people explore single photos for 5-10 minutes.
Going back to corners. Discovering details. Building mental images. Creating memories of the image.
That's not just making photos accessible.
That's making photos explorable.
And I think that's better.

Coming Soon
Progressive Web App
Works on All Smartphones
Built for exploration, not just description

What do you think? Which feature interests you most? Questions? Thoughts? Comments below.


Comments

By Trenton Matthews on Wednesday, November 12, 2025 - 17:19

All of the above!

The fact that more accessibility apps (me being an Android user) aren't web apps makes your idea a golden glass of fresh air here!

By Stephen on Wednesday, November 12, 2025 - 17:21

Thank you so much. I did get very little sleep when getting the foundation built lol.

By Stephen on Wednesday, November 12, 2025 - 18:14

I may be looking for beta testers in the near future, more specifically for the Android side. I have an iPhone myself, so I can test the iPhone features; let me know if this interests you 😊.

By Brian on Wednesday, November 12, 2025 - 18:59

Making this a universally accessible application is boss! I for one cannot wait to give it a try. 😎👍

By Rui Fontes on Wednesday, November 12, 2025 - 19:00

Good job!
I am ready to beta test it!

By Stephen on Wednesday, November 12, 2025 - 19:40

Thanks so much. Right now I'm building the app's own built-in screen reader. That way you can just turn off your screen reader when using the app itself. Using it myself, I don't like having my screen reader going plus the screen reader in the app also going. I'm also working on a feature where, when feeling through your photos, you can actually tap on an item and it'll expand so you can explore that item on its own, like a bookshelf for example.

By OldBear on Wednesday, November 12, 2025 - 19:45

Stephen, it sounds like a good way to do it. I've been having to ask questions of AI in a grid, or to be exact, several different kinds of grids such as thirds, to understand pictures I take and edit.
An issue I have with another explore by touch AI app is that there is no indication of where the image ends at the top and bottom of the iPhone screen, and as I work with several different aspect frames with lots of blue sky that the AI does not speak, I have to guess a lot. I would think this would not be an issue in your grid system.
I find myself constantly having to ask if the edge of the image cuts off part of a bird or other critter that the AI has said is in the picture; they almost never say this up front. I also have to ask a lot if something is in focus with one of my AI describers.

By Enes Deniz on Wednesday, November 12, 2025 - 20:09

This reminds me of an app named Image-Explorer, but it no longer works even though it is still available in the App Store. I will be looking forward to this cross-platform app and it'd be interesting if it also worked on touchscreen laptops.
@OldBear, what's this other explore-by-touch AI app you mentioned, if I may ask?

By Stephen on Wednesday, November 12, 2025 - 20:26

It should work wherever a touchscreen is implemented, but it would be interesting to see if it actually functions the way it's supposed to; so far everything is functioning extremely well. I remember Image-Explorer… it was a pretty weak app. I can also adjust things for you guys on the fly if something seems broken or buggy or you want it to react differently.

By Enes Deniz on Wednesday, November 12, 2025 - 20:39

You can't add the option to provide audio feedback as the user moves the finger around the screen, right? You know, the type and timbre, volume and other characteristics may indicate certain visual properties. I also thought of 3-D audio or spoken and perhaps even haptic feedback but those might be more challenging to implement. I acknowledge that audio feedback requires the app to treat the image as a whole as the user should hear continuous beeps or loops or blips or whatever as (s)he moves the finger across the screen and as colors shift and the level of brightness fluctuates etc. so you might need to develop a new underlying approach to redesign the interface for this to work properly.

By Stephen on Wednesday, November 12, 2025 - 20:57

I'm working on haptic features for different textures, etc., but that's going to be really hard to get going, I think, it being a web app and all. But the descriptions, for example: when I upload a photo of my dog, I can feel where his ears are, his nose is, his eyes; the voice also gives the expression of his eyes and what his nose looks like. I really wanna try to give everyone an actual experience with a photo and not just hearing an AI's description of the entire photo. You should be able to explore it… I also am implementing a Zoom feature where, if you double tap on a certain portion of, let's say, the dog's nose, you can just explore his entire nose. It does work better with bookshelves, but right now I only have pictures of my dog lol. I also have a bunch of sunrise and sunset photos and it's really cool being able to go through and really feel the sky.

By OldBear on Wednesday, November 12, 2025 - 21:13

I was talking about the Seeing AI app, in Descriptions > Browse Photos, or something like that. There's an Explore option that gives haptic and audio feedback on some of the larger objects in a photo if the process recognizes them.

By Brian on Wednesday, November 12, 2025 - 21:40

Will this have Braille support, for those persons who are both deaf and blind?

By Enes Deniz on Wednesday, November 12, 2025 - 21:47

So is it possible to add the option to zoom in or out and let the user explore by smaller or larger units/distances, even pixels? You know, this will be quite handy if you somehow implement audio cues/beeps. So what I'm talking about is something like a combination of different methods usable simultaneously. Let's say you're exploring a photo featuring a person. You'd get more detailed audio feedback as you slide your finger, but only when your finger moved over a different body part or clothing would you get spoken feedback. So this will require that the app detect individual objects and describe them only while your finger is on them, by taking into account the size and location of each object, rather than dividing each and every image into the same number of zones and treating every image as a grid. One object may span multiple zones on that artificial grid, or it might be so small that it fits within one zone, so that system unfortunately didn't sound so realistic and effective to me. The alternate method I'm proposing is more like that found in Image-Explorer in that respect. The audio cues should also be heard more naturally and continuously, so representing an image as a grid may prevent that. Let me try a different explanation to clarify my point further: Exploring an image represented as a grid sounds like navigating a table with a certain number of columns and rows. So it's more of jumping from one cell to an adjacent one as you slide your finger than exploring the entire image as a whole.

By Stephen on Wednesday, November 12, 2025 - 21:57

I am working to see if I can implement your suggestions right now 😊. Standby.

By Enes Deniz on Wednesday, November 12, 2025 - 22:06

Well, apparently this is where your app will excel. Whenever we have a suggestion, bug report etc., we just fire away and you take care of everything without ever dealing with app store policies, having to submit your updates and wait for them to be approved.

By Stephen on Wednesday, November 12, 2025 - 22:21

Implementation successful. I’m sure it could be better but it’s one heck of a start!

By Stephen on Wednesday, November 12, 2025 - 22:29

Imagine a blind person who's never "seen" their child's face, as an example. You can now feel the shape of their nose, count their teeth, explore their smile lines, etc. It's pretty tough to feel the exact size of, let's say, an adult on a screen, but I'm hoping the Zoom feature can help with that a little as well. The Zoom feature is playing a little bit hard to get, but I'll get it.

By Stephen on Wednesday, November 12, 2025 - 22:42

I'm not against adding braille support at all. The problem with that is going to be which display they're using and whether or not I can get it to work on displays. I'm not sure how I can implement that effectively so they can really feel the braille that's representing ears, nose, eyes, etc., plus all the zoom features. I would be curious to know whether web apps are even accessible for braille display users anyway.

By Enes Deniz on Wednesday, November 12, 2025 - 22:55

Now that I've left you to deal with my volley of suggestions, I'm beginning to think of the many different scenarios in which this app would be highly useful, from exploring the world map to taking or finding a photo of a street to get a better overview for easier navigation, or examining in detail a photo taken by a friend and posted on social media.

By Stephen on Wednesday, November 12, 2025 - 23:01

lol. I have so many ideas for this app and for you guys.

By OldBear on Wednesday, November 12, 2025 - 23:12

That's great.
What Enes Deniz describes, and I guess is now implemented, is much like what the Seeing AI Explore feature is, though that is very limited.
I use it when, for example, I am cropping a picture of a bird with its wings spread to make a photo printout on a specific paper size, and I need to be sure the bird is large enough and in the desired spot without being cropped by the edges. I locate the bird, say it's in portrait orientation, and run my finger across the screen over and over until I have a good idea of where it is relative to the sides of the picture. It doesn't work as well with the top and bottom, but I can at least tell if it is in the top half or bottom half.
Having more specific details included in that would be a game changer.

By Stephen on Wednesday, November 12, 2025 - 23:33

You tell me what you need and I’ll do my best to make it happen 😊.

By Karok on Thursday, November 13, 2025 - 01:37

This sounds amazing, but remember we need our imaginations to "see" an arm, "see" a plant, "see" what we are doing, say, counting teeth. It will require people to have exceptional spatial awareness. Imagine a day, which will never come in my lifetime, where you can somehow "physically" feel a photo. Remember we are examining the screen, which is fabulous, but it will require people, I guess, to imagine they are "in" the photo. Does that make sense?
So, if say you have a picture of a dog in a living room, you'd need, I guess, to imagine you are in that living room physically to "feel" how the photo looks. I look forward to trialing this; will it be this year, I wonder?
I would have liked to explore my mum's house decorations, in particular her Christmas tree; it's huge, apparently, lol. It will be great to get a "feel" for my children's faces as well; yes, I can touch them, but it will be great to get a feel for it.

I love the idea of the food feature as well. Say we are in a restaurant: if it can say your steak is at 2 o'clock and your fries (chips, I mean, in the UK) are at, say, 6 o'clock, for those who value that, it will be fabulous.

By Stephen on Thursday, November 13, 2025 - 01:47

I hear ya. With the way I have it set up, you can feel the entire room. I'm also working on those sound cues that you can trace so you can feel how big an object is, and you can also zoom in on a specific object and just explore that object. So let's take your mom's Christmas tree. When you take a photo of the room or she sends you a photo of the room, you can move your finger across the screen to find the Christmas tree. You can feel its shape and size as much as possible with sound cues, tap on the tree, and then you'll be able to feel the tree with all of the ornaments on the branches. Then, you can actually tap on each ornament and feel it through sound and description. Right now I have two levels of Zoom programmed into it. I'm hoping to show it off within the next week or so… There may be a few delays due to me fixing bugs, because unlike a lot of companies, if I'm going to release something, I wanna make sure it at least functions decently lol.

By Exodia on Thursday, November 13, 2025 - 02:16

I have to say, I find this app that you're talking about to be quite good. Could this thing help me identify menus on my Casio CTS 1000 V keyboard? I have trouble with the menus because there's no speech or clicks or beeps or anything. I had to use ally to help me pair the Bluetooth connection so I could use the keyboard's speakers to stream stuff from my phone to it. It would also be cool if it could read me which styles are on the display, because this has no numbers either; it's buttons, a dial, and more buttons. I have an idea as to what some of the buttons do, but the menus and learning which styles are which is a little bit difficult, except for the pop styles and the rock styles.

By Stephen on Thursday, November 13, 2025 - 04:52

Perhaps. Let me just finish getting the core features out at least, and then I can work on lots of other features. The two main ones I'm having problems with right now are location accuracy and search, and well… I guess I haven't tested the reading yet, so I can't say that's a problem, but everything else seems to be working pretty smoothly. Right now I'm just editing the finishing touches and trying to make the AI stay on point when you're browsing through your photo.

By Stephen on Thursday, November 13, 2025 - 05:45

I've decided to open this up for a public alpha/beta trial. I want to be upfront about something that matters. This project is expensive to build and keep running. It costs me close to three hundred Canadian dollars every month just to maintain everything behind the scenes. I cover it by working full time, which is fine for now, although it limits how fast I can push new features.

I want people to try it without barriers, so the alpha/beta will stay public for a little while. It will not stay open forever because the costs add up quickly. I am exploring options for the future, whether that is donations or a small subscription model. I want to find something that works for everyone. If this takes off and the community shows real interest, I would look at reducing my work hours so I can put more time into development.

I appreciate everyone who tests this, gives feedback, or even shows curiosity. This community can be tough to impress, and rightfully so (I know I am), which is exactly why I want your honest reactions. And please don't tell me it doesn't know your correct location… I know. It's a thorn in my side lol. Also, the search function in the conversation mode doesn't work quite yet… it's something I'm working on. TBH, I got a little hyper-focused on photo exploration lol. You can find the link below.
http://visionaiassistant.com

By Amir Soleimani on Thursday, November 13, 2025 - 09:17

I just tried the app, and it works really well especially as a first beta release! It describes the surroundings better than Gemini or ChatGPT. Thanks, Stephen, for your efforts. My observations so far on an iPhone 16 Pro Max:
1. Maybe I'm lost in the Settings window, but it seems that I can't alter the voice. It uses Samantha, and I want to use, say, Alex, Eloquence, or eSpeak-NG. I can't find a setting for that. The only thing I see about the voice is changing the speed.
2. The AI Assistance feature cannot be started unless the Camera mode is activated first. I don't know if this is by design, or a bug. I think the AI assistance should be started independently to avoid confusion.
3. If I enable Continuous scanning from the Settings window, which scans the environment every 4 seconds by default, the app doesn't provide responses in a meaningful or useful way. After enabling that from Settings and enabling the AI assistance feature from the main window which requires enabling the camera mode first, the app keeps beeping continuously. As I ask it something, it tries to provide an answer, but the answer gets cut off immediately. The same happens after the next question. So something weird should be happening there.

By Amir Soleimani on Thursday, November 13, 2025 - 09:46

Stephen, people on Mastodon who don't use AppleVis want to know if they can contact you via email for feedback and issues. Is such an option available?

By Brad on Thursday, November 13, 2025 - 10:31

I've tried the beta and it gets things right, which is nice, but I turned on the live AI mode (the second live button) and it mentioned voice commands; I can't seem to find them. I ask where the wipes are, get nothing back; what text is on the wipes, same thing. Also, the room description thinks my room is empty when it isn't.

By Gokul on Thursday, November 13, 2025 - 11:51

Both the idea and the current implementation look promising in my short initial test. I can totally see the potential, and this can go places if put together properly and cleanly. I'd really like to see large-to-small maps being able to be described this way. Also, as you said, this community can be tough to impress, so I'd be prepared to face the toughness as an initial alpha is opened up.

By Stephen on Thursday, November 13, 2025 - 13:14

Hello Amir Soleimani. So, some answers to your questions: no, you cannot change the voices. That will be something to look into down the road; however, right now this is the cheapest option. I could probably set up the ElevenLabs API, but that's an extra cost on my end, so for now it'll just be your device's default browser speech. As for getting in touch, do you prefer Discord, Facebook, or Slack? Let me know and I'll set up what's most convenient for you 😊. As for that other pesky bug that seems to keep popping up where it keeps cutting itself off, I'm working on it now 😊.

By Amir Soleimani on Thursday, November 13, 2025 - 13:21

Thanks, Stephen.
1. As for speech, any chance of using other voices already available on the phone? It could be Alex, for instance, on iPhones. I mean, phones provide access to a number of built-in voices, not just one.
2. As for communication, I'm perfectly fine with AppleVis. However, I guess people on Mastodon would prefer an email address, if doable for you.
3. And thanks for looking into the constant beep/ cutting off issue.

By Stephen on Thursday, November 13, 2025 - 13:31

In regards to the voice, unfortunately no. While your device may have other voices, it will only let me use the browser's default native voice. I can definitely look into other options in the future, but right now, for me, it's the most cost-effective, as I'm not paying anything on top of what I'm already paying. I'll get a contact page set up for everyone today 😊.

By Devin Prater on Thursday, November 13, 2025 - 14:24

I think a good many blind people use Discord, so a Discord community could be a good discussion group for this.

By Stephen on Thursday, November 13, 2025 - 14:28

I just decided to implement a full on direct messaging chat feature in the app lol. I might also implement a community chat so everyone who’s using the app can talk to each other.

By Stephen on Thursday, November 13, 2025 - 14:35

Thanks. Using maps like this is probably my ultimate goal, and I'll most likely get there eventually lol. Right now this is quite literally the foundation, but if people can interact with their photos this way, I don't see why I couldn't implement a maps feature this way. Let me work on it… you have ideas spinning in my head now 😊. On another note, I get why the blind community is hard to impress… We keep getting offered things that don't live up to the hype. That's why my plan is to be realistic and transparent with the community I'm building this for. No marketing schemes, no video editing to make things seem like they're working faster than they actually are, and everything else not mentioned. Plus I'm easily accessible, and I can push updates fairly immediately. In a moment here I'm going to be pushing an update where you can direct chat with me through the app.

By Stephen on Thursday, November 13, 2025 - 14:52

Just a couple quick updates: live AI SHOULD, in all caps, be giving better descriptions; however, it may interrupt itself once or twice per description. I will be actively working on that throughout the day. Also, in your settings, if I set it up properly, you should be able to DM me. You should see the contact developer button. It's basically an instant messaging chat like iMessage or WhatsApp. I just thought it was easier that way, seeing as you guys will already be in the app. If you get any error messages, please send your message with a screenshot if you can… yes, they do help when troubleshooting. I'll be monitoring this thread throughout the day in case the chat feature is broken, so if you tried to send a message and I didn't respond, please let me know here. Thank you all for the great support. There's so many things I want to do with this :).

By Devin Prater on Thursday, November 13, 2025 - 14:54

First, the web app works pretty well. I tried the room view and a photo. They both worked well, and I was surprised at how having a kind of tactile view of the photo helped me remember the photo even better. Fall leaves at the top left and right, text of the flyer in the middle. Stuff like that. I could imagine using this in video games where there's a game board. Of course, web apps can't really work on top of games but I could take a screen shot. I imagine the photo picker would let me get to all my screen shots.
Now an idea: I didn't know when I'd gotten to the last element of a photo, so maybe a boundary noise when there are no more elements below where the finger is?

By Jesse Anderson on Thursday, November 13, 2025 - 14:56

First, I love the detailed description of the tool. It's clear a lot of thought was put into it. I never thought of this approach for describing images, but after the different scenarios you described, it makes sense, and makes me wonder why this hasn't been tried already.

Since this is a web app, I'm also wondering about the future possibility of using this on a desktop or laptop computer, where instead of capturing the camera, it would capture the screen. A user could use the mouse or touch screen in the same way as on an iOS screen. This could be very helpful for exploring graphical content in more detail. It could also be really helpful for exploring inaccessible content, like those stupid anti-virus install wizards that are rarely accessible, when I want to remove them. Once the user has identified a checkbox, button, etc., the user could then tap, double tap, or click/double-click to activate an inaccessible control.

I'm looking forward to giving this a try, beta, or at release.

By Stephen on Thursday, November 13, 2025 - 15:10

You can already try it out. I did post the link above 😊. It was a bunch of messages back though, so I'll post it below this message. I love those ideas btw!! The reason it's a web app is because I can push updates for you guys right away without having to deal with native App Store drama. There are also a lot of limitations put on native apps. I decided to do it this way because it's universally accessible.
http://visionaiassistant.com

By Stephen on Thursday, November 13, 2025 - 15:16

Thanks so much for the feedback. Definitely something I'll look into implementing. Now, when you're in photo explorer mode, if you find objects in that photo, you can actually double tap on them to zoom in and explore that specific object. You should be able to zoom up to three times, so for example, if you find a table, you can zoom into that table, feel the items on it, then tap on an item and explore that item, and so on.

By Enes Deniz on Thursday, November 13, 2025 - 15:28

Can language support be expanded beyond English? And why is Apple not among the options to sign in quickly? This is honestly surprising and somewhat strange as you're posting about this app on AppleVis, an Apple-focused forum, yet Google, Facebook, Microsoft and e-mail sign-in are all supported, while Apple is not.

By Brad on Thursday, November 13, 2025 - 15:31

I don't think I'd use the web app a lot myself, but I could see this replacing Be My Eyes' AI feature on Windows.

Imagine you take a picture and, if you want to hear the text, you can, or if you want to explore it, you can with arrows and directional audio. That would be fun.

I'm thinking of things like Reddit posts. So far a lot of the describer add-ons for NVDA describe too much in my opinion; if I go on r/shitamericanssay, for example, and get a screenshot described, I don't want to know that it was written on Tuesday and the post has 100 upvotes, I just want to get to the post.

I'll admit I won't use it much, but it'd be nice as an NVDA add-on for when I need it.

By Enes Deniz on Thursday, November 13, 2025 - 15:36

Here's what I got on my Windows computer: "Location unavailable. Make sure Location Services are enabled in iPhone Settings → Privacy → Location Services.". Since this is a web app and this is one of the features you highlight to promote it, in-app messages/responses should not provide specific references to one platform only.
PS: Here's another such message: "Important: Please turn off VoiceOver (iOS) or TalkBack (Android) before using exploration features. The app provides its own voice guidance.".

By Stephen on Thursday, November 13, 2025 - 15:47

We don't support Apple Sign-In yet, but you can easily sign up using your Apple email address (@icloud.com or @me.com) with the 'Email & Password' option! As for your other question, I'm currently working on adding language support… it's just going to take some time to make sure it works the same as English. The reason it is posted on AppleVis is because: 1, it's universally accessible, so it works on all devices, and this is where everyone finds out about new apps; 2, there is a reason I didn't post this in the Apple-specific forums.

By Stephen on Thursday, November 13, 2025 - 15:59

I built it for mobile, but if folks want me to add coding for computers, I can most likely do that. When I say universal, I mean Android and iPhone. It may take a bit of an overhaul, but I can probably do it. Let me look into it :). I do agree with you about that first message though. That's probably lingering around from when I was trying to see if I could do a native iOS app, and then do the same thing but make it a native Android app. That should be a simple fix.

By Stephen on Thursday, November 13, 2025 - 16:13

Also, that message about voice guidance is accurate. You do need to turn off VoiceOver or TalkBack to be able to explore your photos.

By Brian on Thursday, November 13, 2025 - 17:12

So I've noticed a small issue, when I go to explore any of my photos. Once I have loaded up a photo, and disabled VoiceOver on my iPhone, I will use the built-in speech engine to explore the image. When I'm finished, and back out, I go to turn VoiceOver back on, and I get a dialogue asking me to grant camera access. Note that I have already given the camera full access for this particular web application.

Is this a bug with the AI assistant, or an iOS/VoiceOver issue?