Hello guys,
I have created the free app PiccyBot that speaks out the description of the photo/image you give it. And you can then ask detailed questions about it.
I have adjusted the app to make it as low vision friendly as I could, but I would love to receive feedback on how to improve it further!
The App Store link can be found here:
https://apps.apple.com/us/app/piccybot/id6476859317
I am really hoping it will be of use to some. I previously created the app 'Talking Goggles', which was well received by the low vision community, but PiccyBot is a lot more powerful and hopefully more useful!
Thanks and best regards,
Martijn van der Spek
Comments
I love the speed of the Zhengdu screen reader but...
If anyone else wants to try it, you won't be able to navigate using headings, buttons, or any of those features.
Like I said, I love the speed of this thing, it's so smooth, but if I can't use navigation keys in the public version then I wouldn't want to buy the more enhanced one.
Sorry for being off topic but I just thought I'd let other blind people know.
Off-topic: Zhengdu web navigation
Hi Brad,
You CAN now use those navigation features in the public welfare version with Chromium-based browsers (e.g. Chrome, Edge, etc.), so this restriction has been partly lifted.
For a heap of further information, please check your e-mail and you will find my detailed reply to all your questions. I did my best to answer them.
Last off-topic
Thanks, I will do so.
DeepSeek
Laszlo, thanks for noticing the DeepSeek addition. It's the 7B model that I installed locally on one of my own servers, so it's not very powerful, as this server is not the best. It is more a proof of concept. One of the good things about it is that I have full control over it. I love open source, and DeepSeek was clearly built on top of Meta's Llama, with a lot of smart optimisation steps.
The version I am running for PiccyBot only describes images; for video it will default to Gemini at the moment.
Now the stage is set. With these kinds of open source models available, it shouldn't be too expensive to train a model specifically tailored for blind and low vision use.
Another point is censorship. At the moment the model will still follow the Chinese government's rules and limit output that way. I am sure there will soon be models that strip out these restrictions. The current local model may be less censored as far as sexuality and such goes; I still have to check that.
I have also updated PiccyBot; it should be more stable now, as earlier it could get 'stuck' after many requests. It also includes a push notification to tell you when the processing of a video is finished. And you can now minimise PiccyBot while it is processing; it will play the description in the background even while you continue with another app.
Another development is the PiccyBot WhatsApp service, particularly useful for Meta Ray-Ban users who are banned from the 'look and tell' function. Sending a video or image to PiccyBot on WhatsApp will result in an audio description. It's a bit slow and somewhat clunky, but at least it enables hands-free video descriptions while wearing the glasses.
Good luck with the app, guys, and let me know how things work for you!
WhatsApp service
This sounds great - is it available now? If so, how do I use it?
Please a Mac Version
Please make it available on macOS. We need an app like this.
A model for VI
@martijn Exactly, that's what I am excited about! Even with DeepSeek, everything is open source and available out there, isn't it? Speaking of which, what about Llama?
re: New update
Thanks for the new update! I have tried it and can confirm that the audio will continue to play even when you lock your phone or go to another app. However, if I lock my phone or minimise the app and go to another app while it is processing, it seems to stop processing, because when I come back to the screen, all it shows me is "retry" and no description was generated.
Incidentally, I don't know if anyone has requested this feature yet, but it would be nice to have a setting where we can tell the app to auto-retry when a description fails, or when it fails to mix the audio, etc. Waiting for 4-5 minutes only to come back and have to manually hit retry again and again gets a little tedious. Especially now, if the goal is to let it process in the background, it makes sense for it to auto-retry on failure. Maybe not indefinitely? Maybe auto-retry five times or something, and then send a notification that says it has failed five times, please check the video, or something like that?
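For what it's worth, the retry policy being requested here (retry a capped number of times, then notify the user) is simple to express. Below is a minimal, hypothetical sketch in Python; `process` and `notify` are stand-in placeholders for whatever PiccyBot actually uses internally, not its real API:

```python
import time

def describe_with_retry(process, notify, max_retries=5, delay_seconds=10.0):
    """Attempt to generate a description, retrying on failure.

    process: callable returning a description string, raising on failure
             (placeholder for the app's real processing call).
    notify:  callable taking a message string (placeholder for a push
             notification), invoked only if every attempt fails.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return process()
        except Exception:
            if attempt < max_retries:
                # Hypothetical fixed pause between attempts; a real app
                # might use exponential backoff instead.
                time.sleep(delay_seconds)
    notify(f"Description failed after {max_retries} attempts; please check the video.")
    return None
```

The cap ensures a bad video doesn't retry forever, and the notification means the user only has to come back when something actually needs their attention, which matches the "auto-retry five times, then tell me" idea above.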