New AI app for describing images and video: PiccyBot

By Martijn - Sparkling Apps, 1 March, 2024


Hello guys,

I have created the free app PiccyBot that speaks out the description of the photo/image you give it. And you can then ask detailed questions about it.

I have adjusted the app to make it as low vision friendly as I could, but I would love to receive feedback on how to improve it further!

The App Store link can be found here:
https://apps.apple.com/us/app/piccybot/id6476859317

I am really hoping it will be of use to some. I have earlier created the app 'Talking Goggles' which was well received by the low vision community, but PiccyBot is a lot more powerful and hopefully useful!

Thanks and best regards,

Martijn van der Spek

Comments

By Laszlo on Monday, January 26, 2026 - 19:39

It has existed since at least autumn 2024 (when I purchased mine), but most probably since much earlier. There was a price rise in summer 2025 due to increasing costs. If I am not mistaken, its current price is 24.95 USD. In my opinion that's not just a very fair price, but a truly outstandingly moderate one. In my home country, Hungary (a Central European country), which is far from being among the richest, a single two-course meal at a restaurant in even the lower middle price range usually costs more. For another comparison, my one-month subscription fee for home TV + landline phone + Internet (1 Gbps theoretical bandwidth) in a package currently comes to nearly the same amount in US dollars.

By LaBoheme on Monday, January 26, 2026 - 19:44

Wow! You can ask more from the main screen; there is a text field at the top. The only problem is you can't add pictures, so your conversation is limited to the initial photo you provide. The only time you must use the Ask More button is when you need to upload an additional picture. My suggestion was to replace the Ask More button with an Attach More button, like the one you find on the Ask More screen. If you just use the good old camera button, it starts a new session just like before.

Now, you might not be used to the new setup in the beginning, but why is it so bad when it streamlines the whole process?

By Laszlo on Monday, January 26, 2026 - 20:40

Yes, of course, I am aware that you can use the text field on the main screen for asking anything. But "Ask More" works differently, and this difference can be significant in some cases.
What does asking from the text field on the main screen do (in my understanding and experience)? It feeds the image or video together with that single question to the model in use (set in Settings if one is subscribed). That is called an initial prompt in AI terminology. What happens if you ask another question there (again in my experience and understanding)? It resends the image or video and the new question to the model and gets the answer. The emphasis is on the "new question": the model won't see your earlier question this time! That is the key difference from "Ask More". If you use "Ask More", the model will see all your earlier questions and its responses as a whole chat, together with the image or video you are asking about.
Why can this difference be important? Because LLMs are very, very context sensitive. They may respond quite differently if you ask two questions in a single chat (as with "Ask More") than if you send your two questions separately (as when you do it from the main screen). You may not perceive this difference so noticeably in each and every case; it depends very much on the video or image and your questions. But there is definitely a difference, which I know very well from my widespread usage of PiccyBot. In my experience, using "Ask More" produces much more usable, practical, and to-the-point answers in almost all cases compared to asking from the main screen.
But please don't just take my word for it, LaBoheme; you can experiment freely and compare your results if you ask multiple questions from the main screen and from "Ask More" about the same image or video.
I also encourage Martijn to chime in and confirm whether my understanding of these two features is correct.
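To make the difference concrete, here is a rough sketch of how I picture it. This is my own illustration, not PiccyBot's actual code; all function names and the message format are hypothetical, modeled loosely on the message lists that chat-style vision models typically accept.

```python
def main_screen_messages(image, question):
    # Main screen: every question is a fresh, stateless call. The model
    # sees only the image and this single question -- no earlier turns.
    return [{"role": "user", "content": [image, question]}]


def ask_more_messages(image, turns, new_question):
    # "Ask More": the image plus every earlier question/answer pair is
    # resent, so the model answers the new question with full context.
    # Assumes at least one earlier turn, since "Ask More" always follows
    # an initial description or question.
    messages = [
        {"role": "user", "content": [image, turns[0][0]]},
        {"role": "assistant", "content": turns[0][1]},
    ]
    for question, answer in turns[1:]:
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user", "content": new_question})
    return messages


# Two main-screen questions are two independent one-message requests;
# the second carries no trace of the first.
first = main_screen_messages("photo.jpg", "What colour is the car?")
second = main_screen_messages("photo.jpg", "And the make?")

# "Ask More" resends the whole chat, so "And the make?" is understood
# in the context of the earlier question about the car.
chat = ask_more_messages(
    "photo.jpg",
    [("What colour is the car?", "It is red.")],
    "And the make?",
)
print(len(second), len(chat))  # the stateless call has 1 message, the chat has 3
```

If this matches how the app actually works, it also explains why a follow-up like "And the make?" only makes sense via "Ask More": in the stateless call the model never saw the question it refers back to.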

By Brian on Monday, January 26, 2026 - 21:17

Why not include Attach More within the Ask More interface? This is what Be My AI does, and I find it works very well on iOS. Not that PiccyBot has anything to do with Be My Eyes, but the recommendation stands.
Just my two cents...

By Bruce Harrell on Monday, January 26, 2026 - 22:02

I'm guessing they don't combine the two for the reason they explained: inconsistency. I think I'd rather know what I'm getting into when I choose one option or the other, as opposed to taking my chances with a combined option. Then, too, why not offer all three?

On the other hand, I'm a bit confused by all the AI models or profiles or options or whatever they're called. Why so many? Why not more? What distinguishes one from another when arrowing through the list? Unless I missed something, they aren't even numbered.

And why is the send button so far from the text field? Why not put it right next to the text field?

I apologize if my questions have already been addressed. I only just dove into this very long list of comments and questions. Smile. Nice looking app, though.

I like the lifetime subscription price. I'm tempted. Does it include all upgrades, too?

By Laszlo on Monday, January 26, 2026 - 22:36

Brian, "Attach More" is already integrated into "Ask More"; it has been there for a while. On the "Ask More" screen, you have three options besides sending your follow-up questions, of course. They are: copy, share, and attach image. The third does what you are pointing at. This way you can feed multiple images to the model together with multiple questions, which can relate to any of the images, their relations, etc. The important point is that if you use "Ask More", your whole conversation is fed to the model as a single chat, together with the images of course.
Bruce, yes, of course, the lifetime price includes all updates for the whole lifetime of the app; that is what the word means. I jumped on the bandwagon for a lifetime PiccyBot subscription around version 2.13 or so, and I still get the latest 2.42, of course.
To everybody and Martijn: in my opinion, one of the unique strengths of PiccyBot compared to other description apps (and there are quite many out there nowadays) is its extreme configurability (provided that you have a subscription) and its versatility. The variety of images and videos, combined with user preferences and needs (degree of sight loss, blind from birth or not, etc.), is infinite. The only way I see to match this variety is to give users many, many options to fine-tune their usage to their specific needs, and that is exactly what the subscribed PiccyBot does; I frankly praise Martijn for this approach.
Yes, we are all different. Some do not need a whole array of models; some do. Some are fine with the initial description; some want to dig deeper. And I could go on and on. PiccyBot lets us users choose our way, and that is great in my opinion. Yes, it may take some more work to see through some options, but that is the case with literally every app. Freedom and choice walk hand in hand with investing a bit more work on the user's side in some cases, but in my opinion it is well worth it.
Does somebody like the default settings? Totally fine; that is possible. Does somebody want to experiment with different models to feel the nuances of an image, or especially of a video? (Video description seems to differ much more between models.) Fine, that is there also! All in all, this versatility of PiccyBot is a value I strongly vote to keep by all means.

By Brian on Tuesday, January 27, 2026 - 16:24

If the Ask More and Attach More functions are already together, then why are users asking for an Attach More option? I think I am missing something here.

By Laszlo on Tuesday, January 27, 2026 - 17:27

It is quite obvious from the posts that our experience levels with PiccyBot, our preferences, our ways of working with the app, etc. differ greatly from person to person. That is natural. Some folks (like me) find the current workings of PiccyBot versatile, intuitive, straightforward, and practical, while some folks would prefer other workflows which they personally would find more streamlined. It's a very hard task for Martijn to make decisions, choices, and adjustments for this very diverse user base; I don't envy his task for a single moment. However, I praise his attitude of wanting to give various options to the users, which I think is pretty much the only way to go with such a diverse user base, even if it introduces some added complexity both on the developer's side and sometimes on the user's side too.

By JC on Tuesday, January 27, 2026 - 18:54

This is an awesome app! I have downloaded the app and purchased the lifetime subscription, and it is working very well. I was able to send a video from YouTube to the app, and it read out clearly what was displayed in the video. Keep up the amazing work; looking forward to seeing what's next in the future.

By Michael on Wednesday, January 28, 2026 - 21:06

The Ask More bug can kick in after only a single follow-up question or after several. Unfortunately it doesn't happen consistently every time, but it is a major bug that is preventing me from enjoying the app as intended.
Having said that, I do appreciate how difficult it is pinning down the bug and addressing it.

By Brian on Thursday, January 29, 2026 - 00:53

Thanks, I appreciate your input and explanation. 🙂

By Michael on Monday, February 2, 2026 - 18:26

When I now ask a follow-up question, I'm getting the following error message:
Access blocked due to unauthorized access. Please download the official PiccyBot app from the App Store.

By Dave Nason on Monday, February 2, 2026 - 21:24

Member of the AppleVis Editorial Team

Hi Martijn,
I was looking at the Subscription screen as I’m thinking of upgrading. I noticed that all of the benefits listed on that screen are appearing, or at least being spoken by VoiceOver, three times each. Bit weird.
I also wondered; does the premium version include a conversational interface via voice? I’m thinking like Envision Ally? I like Ally’s UX in that sense, but find I get much better results from PiccyBot, so this would be great. Not to replace typing, but as an alternative you could use when the situation fits.
Final question; settle a debate for me… is PiccyBot pronounced “PixieBot” or “PikkyBot”? ☺️
Dave

By Michael on Monday, February 2, 2026 - 21:57

Hello Dave,
Personally I would hold off on purchasing until the developer has addressed the follow up bugs I've identified.

By Martijn - Sparkling Apps on Tuesday, February 10, 2026 - 03:55

Michael, it has been a difficult issue to reproduce and therefore to fix, but I hope it is all working now in the latest update released today. Please try it out and let me know?
I have also added extra security to reduce abuse of the PiccyBot models by outside parties. This may introduce additional glitches, but hopefully everything goes smoothly.

By Michael on Wednesday, February 11, 2026 - 22:42

Hello,
It looks like the ask more issue has finally been resolved. My huge thanks to Martijn for addressing this. I realize it was not an easy one to track down and I truly appreciate your time and effort in addressing this.
I noticed that the language model selection is not saved when one exits the app.
Is this a result of the added security enhancements?

By Martijn - Sparkling Apps on Tuesday, February 17, 2026 - 04:23

Michael, thanks. It was indeed due to the additional security as other apps were reading the PiccyBot model settings. I have released an update today that should save the model selection and any preferences properly again. Make sure you update to the latest version and let me know if there are any further issues?

By Martijn - Sparkling Apps on Thursday, March 5, 2026 - 07:08

Guys, I have been a bit busy with protecting the PiccyBot API recently. It was being abused by outside services. Hopefully it should all be perfectly secure soon.

I did manage to add a few new models to the list, notably Gemini 3.1 Pro and Gemini 3.1 Flash Lite. The latter is actually very useful for video descriptions, as it has surprisingly good quality and is a lot faster than the other models. You can try them all out as a pro user, but the base free version of PiccyBot has also benefited from the improved video descriptions.

By Martijn - Sparkling Apps on Saturday, March 14, 2026 - 08:44

In the latest update I have added PDF description support. You can either load or share any PDF and receive a summary. The length of the summary depends on the length setting you have chosen. You can then ask further questions about details of the PDF, as you can with video or image descriptions.
I have also added support for several extra languages, especially Indian ones. I am using the new Sarvam AI model for proper pronunciation of these languages.
Work on live AI and integration with smart glasses is still ongoing. I had to deal with adding extra security to PiccyBot to prevent outside parties from using the API without permission.

By blindpk on Saturday, March 14, 2026 - 09:28

The document functionality is a great addition. One thing, though: the button for what seems to be the documents view does not appear to have a label; my VoiceOver just presents it as a button with an image.
Another thing, which is much more of a personal opinion, is that I think the model selection is getting a bit too large. I really, really like the ability to select from many models, that is one of PiccyBot's big selling points for me, but the OpenAI and Google models especially are quite numerous, and it is getting difficult to really understand the differences between them and when they should be used (the descriptions have helped, but some are still very similar).

By Laszlo on Saturday, March 14, 2026 - 14:32

Yes, I can confirm that the last button in the main interface (in flick order), which is for browsing for a PDF to be described, is unlabeled. I haven't had the chance to try that new feature on anything yet, but Martijn, thanks very much for it; it may come in very handy at times.
Blindpk, you may find the model descriptions very similar because there was a regression in those descriptions from before PiccyBot version 2.45. In plain terms, that means that before version 2.45 the model descriptions were significantly better and provided better guidance than they do now. With version 2.45, some models were erroneously assigned the same description, so they are much harder to distinguish than before. One example: the description for Claude 4.6 Sonnet used to be "good for emotions" before version 2.45, but now it says something quite general, picked up from some other description. The former description was very accurate based on my extensive experience.
So Martijn, please fix this regression and correctly reassign the former, more meaningful descriptions from before version 2.45. While you are at it, you can easily fix another regression: in 2.43 and 2.44, the first button (in flick order) on the Settings screen was correctly labeled "Settings help" or something similar. Now it is back to that erroneous "badge question mark" or something like that, which persisted for a long time. Thanks in advance.
I personally am not bothered by the length of the model list at all, but I take your point, Blindpk, regarding the OpenAI and Google model families.
Based on my very extensive usage of various models and my information on their pricing (an important point for Martijn to consider), I can offer the following simplification proposal for those two model families, for Martijn to consider eventually:
OpenAI: keep gpt-5-nano, gpt-5.1, and either gpt-5.2 or gpt-5.4. I know that the latest gpt-5.4 has a quite hefty price tag, and I am not entirely sure whether the slight quality improvement fully justifies it. In my experience, gpt-5.1 performs much better than gpt-5.1-chat, which is why I vote for that variant.
Google: keep Gemini 3.1 Pro (image only), Gemini 3.1 Flash Lite (image/video), and Gemini 2.5 Flash (if I am not mistaken, the default for videos for a long time, quite moderately priced and quite snappy).
All other models have a definite role in the model list, and if the earlier descriptions are restored, it won't be hard at all to choose. So I would keep them all by all means: the two Amazon models, the two Claude models, the quite uncensored Grok, the privacy-providing Llama 4 Maverick and Mistral Pixtral, the special Native Blind Style and PiccyBot Mix, and finally Reka, which really shines for painstakingly detailed, rather emotionless, but technically oriented descriptions.
One last note for everyone: the newest version of a model is by no means always the best. Model evolution is not linear at all, and quite often a newer version of a model picks up a lot of silly and annoying hallucinations and other rubbish in its training process (something like model overtraining). So it is worth keeping at least two versions of a model on the list, in case the newer one talks nonsense.

By mr grieves on Saturday, March 14, 2026 - 17:15

I have to agree that I love having all the millions of options available, but I also don't have a clue which one I am supposed to be using. I had a play when the app first came out, but since then I have more or less stuck with whatever it is set to.

I wonder if it might make sense to have a few preset choices, so we could select, for example, the most detailed one for scene description, one that's good for brief descriptions, or the best one for describing people, or whatever the options might be. Then those in the know could select from the full list, while the rest of us would know we will just be bumped up to whatever is most appropriate for the task.

Maybe this isn't practical but just a thought.

I should also say that the PDF option sounds like yet another great addition and I will give it a go next time I come across an inaccessible PDF.

By Laszlo on Saturday, March 14, 2026 - 18:21

I know ahead of time that some will disagree with what I am bringing to the table in this post, but that is totally fine; that's how arguments go. But this "the model list is too long" debate has come up in this very long thread so many times that I can't resist bringing something to the attention of the community.
There is a tool called Luomo Toolbox, developed by a single Chinese developer. It has an English localisation as well (mostly; there are little bits of untranslated parts here and there), but it is totally usable for "Westerners" too. You can easily find it on the App Store.
I have it on my phone together with PiccyBot, Be My Eyes, Seeing AI, etc., so to say together with all the "vision loss compensation" apps. Luomo Toolbox offers partly almost the same functionality as PiccyBot: among other things, it can describe pictures and videos, take guided photos, manage custom prompts, etc. I am in living circumstances where regular sighted help is scarce, so there is not really such a thing as "enough" of these apps for me; having multiple backup options if anything goes wrong is an absolute must in this situation.
The image description model list of Luomo Toolbox is at least twice as long as PiccyBot's. It is implemented as a slider which you adjust by flicking up or down. There is no search box as in PiccyBot, nor are there any model descriptions of any sort in Luomo Toolbox. There is also a simplified multi-model chat assistant interface in Luomo Toolbox, and believe me, its model list is at least three times as long as PiccyBot's, or even longer. It is adjusted in the same way, again without any search box or model descriptions. Of course there are tons of similar models on those lists too, and multiple versions of a given model.
What I want to get at is that I have never, ever read any comments complaining about the length of those model lists, how hard it is to choose from them, or the like. I know this because I regularly and extensively follow Chinese forums with the help of the translation feature of the ZDSR screen reader; that is how I came to know about Luomo Toolbox at all. And that difference in attitude made me think a bit.
Some would say that Chinese society is very different from Western ones. It's true that there are differences, but the very detailed picture I get from all the forum posts about the everyday life of the blind in China is surprisingly similar to the West. They also tend to be quite picky with all assistive tech, and believe me, they throw at least as many requests, wishes, and complaints at developers of all sorts as is seen here. So that is not the reason why they don't complain about the model lists in Luomo Toolbox. The reason is quite different.
In Luomo Toolbox, in order to use most models you have to buy points through in-app purchases, in addition to needing a monthly "membership subscription". And there is no such thing there as a lifetime subscription as in PiccyBot. So Luomo Toolbox users need to learn to economize their model usage depending on their budget. Similar models can cost very different amounts of points there, so the task is not easy. Luomo Toolbox users learn the "art of choosing models" with the help of their purse!
Here we seem very much "spoiled", as Martijn does all the economizing with the inference costs for us. And I think we should be extra glad about that, as we get a very decent selection for a fixed monthly price, or even a one-time lifetime price!
And finally, I have a quite appropriate suggestion for those who want a mostly "all-capable" model without much setting up, because it is already on the list, I think. For someone blind from birth, "Native Blind Style" is advisable; if not, or if the real visual details are needed, then "PiccyBot Mix" is a very good candidate. And for others comfortable with the list, it is still there.

By blindpk on Saturday, March 14, 2026 - 21:09

Ah, thank you for the clarification. I wasn't aware that the descriptions had changed in the latest versions (I hadn't checked them in a while); that would explain the currently somewhat confusing list.
I don't see a problem with a long list as long as the things included also have a point in being there, and as you pointed out, most of the current entries on that list do, so this is not really a big thing for me; but I want the choices to be clear to everyone. If, for example, economy is a factor that is important to the developer, then that could also be made clear to the user so they could make an informed choice.

By Enes Deniz on Sunday, March 15, 2026 - 14:08

I did suggest various options that would help lower the costs for the developer, including Apple Intelligence as an alternative, or at least a fallback option in situations where the user does not have a stable connection. This would let the user get a decent description instead of an error even without an Internet connection. Adding support for system voices, or even integrating Piper voices into the app using the RunAnywhere SDK, could be other options, though we now have a dedicated Piper TTS app. I don't know how much it costs the developer to pay for everything, including the models and voices, but one annoying thing is that I was trying to come up with a solution that would help both us and him, and I haven't received a reply from him.
Another problem is that the app still requires you to disable not only VPNs but also ad blockers or anything that installs VPN profiles so that the list of available models gets updated. Can't this just be included within the app, without the need to disable your VPN in order to view and choose models? I even thought the app could be broken, and had to uninstall and reinstall it, because this was an issue I had reported earlier, and the developer had said he would fix it. I might be wrong about that, but he might even have said he had fixed it.
So I am happy with how the app works in general, and I do acknowledge that the revenue earned from our purchases may not be enough to fund all the payments, including models and voices, but also the fees charged by Apple itself. But it is undeniable that the developer is unfortunately slow to implement certain fixes that should make things easier for everyone. He recently added a new option to have PDFs described, and the button is unlabeled. This doesn't really add up: he develops an app designed specifically with blind and low-vision users in mind, adds a new function to it, and doesn't label the button properly.
I still appreciate the fact that the developer takes his time to go through our comments and respond to them, but it's still quite difficult for me to say the app offers a seamless experience. And it's a paid app we're talking about. We can't even input a customizable system prompt, while we should actually be able to create multiple profiles for various types of things to be described. We can't even enter a single, generic system prompt, and this is also one of the features I suggested so long ago.
PS: I forgot to mention one more thing: we should have as many models available as possible, but it doesn't necessarily have to be several variants of one single model. It would be interesting to have other options like DeepSeek, Qwen, and Kimi; this is yet another thing I suggested quite long ago. The developer stated in response that he wouldn't be able to pay for all the models at once, which is understandable.

By Martijn - Sparkling Apps on Tuesday, March 17, 2026 - 03:38

Hi guys, there was an issue with PiccyBot related to the SSL certification renewal. It should all be working fine with the latest update. I also fixed the button description and cleaned up the model list somewhat.

By Martijn - Sparkling Apps on Friday, March 20, 2026 - 04:39

The latest update improves the AI model access and should reduce the issues with certain regions blocking Firebase.
There was an issue with uploading multiple images under 'Ask More' on iPad, which has been resolved.

For those interested in PiccyBot's issue with outside-party API use, check out this blog post (in Russian) about a tool that no longer has access to the paid models. Apparently PiccyBot was among the apps funding AI access for 150,000 users: https://visionbot.ru/vision_bot_closure.html

By Martijn - Sparkling Apps on Saturday, March 28, 2026 - 08:19

I have been testing connecting the Meta glasses to PiccyBot and running Live AI on them. It is definitely not perfect, but a hands-free, almost real-time description of your surroundings, for your ears only, with the option to interrupt at any time and ask for details, is really a big step forward in my opinion.

https://youtube.com/shorts/Se7mlXnNrUc?feature=share

By Dave Nason on Sunday, March 29, 2026 - 08:36

Hi. I subscribed to premium yesterday and just have a couple of questions.
1. Is PiccyBot available in the iOS share sheet? I'm not seeing it there. This means that if I'm in a WhatsApp chat, for example, or even in the Photos app, I can't simply get a description from PiccyBot; instead I have to go over to the PiccyBot app and try to find the photo in my gallery there. I'd be surprised and a little disappointed if this is the case.
2. When the PiccyBot voice is speaking out a description, is there a way to shut it up? Sometimes I don’t want it, or have gotten enough info before it’s finished.
3. Not a question, but just a heads up that the subscription sign up screen was still a bit of a mess with VoiceOver, as I reported before ⬆️
Cheers,
Dave

By BlindFolk on Sunday, March 29, 2026 - 09:00

Yes, Dave. PiccyBot is definitely available in the iOS share sheet. You either have to choose "More", or enable it in the additional options.

By Dave Nason on Sunday, March 29, 2026 - 09:13

Thank you. I found it now. It was in the app section rather than the actions section, which is why I missed it.
I also found the pause button; not sure how I missed that before 🙈
Apologies and thanks.
Dave