I'm a CS and physics student at UIUC doing spatial AI research, and I'm trying to map where the current navigation and object detection tools fail in practice, since this group knows that far better than I do!
My read of the current state is that the VLM-based tools (Be My AI, Seeing AI, the Be My Eyes Meta glasses integration) are strong at describing a scene but weak at the spatial part: how far, which direction, what's behind what. Meanwhile, the tools that do offer real spatial precision usually depend on pre-scanned environments or a fixed object vocabulary, so they fall apart in unfamiliar or dynamic spaces. I want to know whether that framing holds up against your actual use.
Specifically, 1. when one of these tools last let you down and what happened, 2. whether indoor or outdoor navigation is the harder problem for you, 3. whether holding a phone versus wearing glasses changes how you use these tools day to day.
Comments
Indoor navigation
I think there are a few apps that **can** do indoor navigation if you have a Pro model iPhone, like Seeing AI. I never tried it on my 12 Pro, so I don't know if it's any good, and I now have a regular 16. But I'd love something that can do indoor navigation, either something like Soundscape and VoiceVista, telling you what stores you're walking past, or something that comes through Meta glasses or similar.
And yes, the Meta glasses aren't too wonderful at saying how far something is. For example, I can get as far as "Is there a bin nearby?" and it'll tell me that, but it's up to me to clonk around and find the bin.
Last 50 feet and indoor navigation
The last 50 feet outdoors and indoor navigation are the classic and still mostly unsolved problems. I understand there is an iBeacon system (I think that's its name) that is basically trying to make GPS for indoors, but since it requires installing beacons, we know how well that will go for anything less than an airport.
And outdoors, GPS navigation apps are pretty good, but I can't navigate an unmarked roundabout (meaning no sidewalks or tactile crossing) with them either. I feel it just about needs a vision-based system. Maybe I just need a Waymo car, the only thing that seems able to do it autonomously.