Hey guys!
I would like to try playing on the iPhone. Since Gemini uses up tokens quickly, I would like to ask how good local models are at describing images, and whether it is possible to make a shortcut for this.
The idea would be the following:
I press a button on the controller. The option changes. I make a gesture on the iPhone screen with VoiceOver. Silently, it takes a screenshot of the screen, sends it to an LLM with a specific prompt, speaks the response, and deletes the prompt.
Do you think it would work? The goal is to find out which option is in focus, the player's status, and so on.
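To make the "send the screenshot to an LLM with a specific prompt" step a bit more concrete, here is a rough sketch of what the request could look like. It assumes a local model server that speaks Ollama's `/api/generate` API, running somewhere the phone can reach (for example a computer on the same network), and a vision-capable model such as `llava`. The prompt wording, the model name, and the URL are all assumptions and not something I have tested in this setup.

```python
import base64
import json
import urllib.request

# Hypothetical prompt: the wording is an assumption, tune it for your game.
PROMPT = (
    "This is a screenshot of a game. In one or two short sentences, "
    "say which menu option is in focus and what the player's status is."
)


def build_payload(image_bytes, prompt=PROMPT, model="llava"):
    """Build the JSON body for an Ollama-style /api/generate request."""
    return {
        "model": model,  # assumed vision-capable model name
        "prompt": prompt,
        # Images are sent as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one complete answer instead of chunks
    }


def describe(image_bytes, url="http://localhost:11434/api/generate", timeout=60):
    """Send the screenshot to the (assumed) local server and return its text."""
    data = json.dumps(build_payload(image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

In a shortcut, the equivalent would presumably be a "Take Screenshot" action, a Base64 encode, a "Get Contents of URL" action posting the same JSON, and then "Speak Text" on the `response` field. How fast and how accurate a local model would be for this is exactly the open question.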
By Diego, 11 May, 2026
Forum
iOS and iPadOS