VOLlama v0.1.1, an open-source, accessible chat client for Ollama

By Chi Kim, 26 April, 2024


Happy Friday!

I'm excited to announce the release of VOLlama v0.1.1, an open-source, accessible chat client for Ollama. It leverages open-source large language models to enable local, private conversations without an internet connection.

Many user interfaces for open-source large language models are either inaccessible or annoying to use with a screen reader, so I decided to create one for myself and others. I hope that ML UI libraries like Streamlit and Gradio will become more screen-reader friendly in the future, so that apps like this won't be necessary!

Running an open-source model locally requires significant computing power. I recommend at least 16GB of RAM and a Mac with an M1 chip or later.

However, it doesn't require much computing power if you just want to use OpenAI GPT models or Google Gemini models with API keys.

To install Ollama, you'll need to use the Terminal, but chatting does not require the terminal. The app is not notarized by Apple, so you'll need to allow it to open from System Settings > Privacy & Security. Unfortunately, the app takes a little while to launch, so be patient after opening it. I'm looking into improving the startup time.
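If you'd rather poke at the local server from code, here's a minimal sketch using the official ollama Python package (pip install ollama); this assumes the Ollama server is already running, and the model name llama3 is just an example:

import ollama

# Download the model if it isn't present yet (same as "ollama pull llama3" in Terminal).
ollama.pull("llama3")

# Send one chat message to the local server and print the reply.
response = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response["message"]["content"])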

It has various features, including generating image descriptions with a multimodal model like Llava and the ability to process and query long documents with the RAG feature. There are numerous settings available for power users as well. It also supports models from OpenAI and Google Gemini if you have an API key.

If it sounds interesting, please download VOLlama and follow the instructions.

Hope you enjoy it, and please spread the news!


Comments

By Mlth on Tuesday, April 23, 2024 - 17:34

This is very cool! Thank you for making this.
I'm unfortunately on an Intel Mac, but I'll see if I can give it a spin.

By Quinton Williams on Tuesday, April 23, 2024 - 17:34

This is very neat! Thank you so much for creating it. Running the models locally seems to work as expected; however, I get this error whenever I try using GPT.
Is there something I'm not doing correctly?
To clarify, I've provided my API key.
Error code: 400 - {'error': {'message': 'max_tokens is too large: 8192. This model supports at most 4096 completion tokens, whereas you provided 8192.', 'type': None, 'param': 'max_tokens', 'code': None}}
Traceback (most recent call last):
File "Model.py", line 162, in ask
File "llama_index/core/llms/callbacks.py", line 150, in wrapped_gen
File "llama_index/llms/openai/base.py", line 439, in gen
File "openai/_utils/_utils.py", line 277, in wrapper
File "openai/resources/chat/completions.py", line 581, in create
File "openai/_base_client.py", line 1232, in post
File "openai/_base_client.py", line 921, in request
File "openai/_base_client.py", line 1012, in _request
openai.BadRequestError: Error code: 400 - {'error': {'message': 'max_tokens is too large: 8192. This model supports at most 4096 completion tokens, whereas you provided 8192.', 'type': None, 'param': 'max_tokens', 'code': None}}

By Quinton Williams on Tuesday, April 23, 2024 - 17:34

I apologize. After playing around a bit, I was able to adjust the correct parameter to get GPT working correctly.

By TheBllindGuy07 on Tuesday, April 23, 2024 - 17:34

And I thank the author for getting me interested in local AIs again. It actually works great in the terminal, and I love how much simpler Ollama is than the gist I was following at that time, with a couple of compilation steps with make and weird stuff like that... Great post and great app!

By ming on Tuesday, April 23, 2024 - 17:34

Does it work on a Windows PC?

By Ollie on Tuesday, April 23, 2024 - 17:34

How do we get the higher-quality voices? All the options in the list seem to be the compact ones.

By Ollie on Tuesday, April 23, 2024 - 17:34

I'm also having issues with OpenAI. I'm getting an error. I've got an API key in there; is that all I need to do, or are there further steps?

By Chi Kim on Tuesday, April 23, 2024 - 17:34

Ming, it works with Windows as well.

Ollie, most likely you're using a GPT model that doesn't support the 8192 context length, which is the default for the llama3 model. Try going to the Advanced menu > Generation Parameters and setting num_ctx to 4096.
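For anyone hitting this outside VOLlama, here's a rough Python sketch of the same two knobs (model names are examples only): with Ollama the context length is the num_ctx option, while the OpenAI API caps completion length with max_tokens, which is what the traceback above complains about.

import ollama
from openai import OpenAI

# Local model: set the context window to 4096 tokens via Ollama's options.
local = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "Hello!"}],
    options={"num_ctx": 4096},
)

# OpenAI model: keep max_tokens within what the model allows (4096 here).
client = OpenAI()  # reads OPENAI_API_KEY from the environment
remote = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=4096,
)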

By mr grieves on Tuesday, April 23, 2024 - 17:34

This app seems really interesting. Thanks very much for sharing it with us.

I noticed I can copy/paste an image into the text area (or use the attach command) but I am told that it can't describe images. If I use nomic-embed-text then I get a 404 error. I did follow the instructions and installed both models. Is the 2nd one responsible for images or do I need something else?

I often get sent screenshots at work which I don't really want to send over the internet. I tend to use the OCR built into Smultron which is sometimes helpful, but it would be great if I could locally query AI about it.

By Chi Kim on Thursday, April 25, 2024 - 17:34

For image description, you need to download a different model called llava. It has three different variants. The command "ollama pull llava" downloads the 7-billion-parameter model; then there are llava:13b and llava:34b. The higher you go, the better the accuracy, but the model takes up more storage and computing power, and responses get slower.
Once you download it, choose the model from the toolbar inside VOLlama (or Command+L).
Then attach an image from the Chat menu (or Command+I), and ask a question like "Can you describe the image?"
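If you ever want to script the same thing, a sketch with the ollama Python package looks roughly like this (the image path is just a placeholder):

import ollama

# Ask the local llava model to describe an image; the file path is an example.
response = ollama.chat(
    model="llava",
    messages=[{
        "role": "user",
        "content": "Can you describe the image?",
        "images": ["screenshot.png"],
    }],
)
print(response["message"]["content"])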
Also, it has very limited OCR capability; it's more for scene description.
If you need OCR, I recommend VOCR, another app I developed specifically to process screenshots with OCR.
https://github.com/chigkim/vocr/releases

Lastly, nomic-embed-text is an embedding model used by the RAG feature to process documents. You can't chat with an embedding model directly.
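To show what that means, here's a tiny sketch (the prompt text is just an example): an embedding model returns a vector of numbers that RAG uses for similarity search, not a conversational reply.

import ollama

# An embedding model turns text into a vector, not an answer.
result = ollama.embeddings(model="nomic-embed-text", prompt="What is VOLlama?")
print(len(result["embedding"]))  # nomic-embed-text should produce 768 dimensions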

Hope that helps.

By Ollie on Thursday, April 25, 2024 - 17:34

Wait! You're the one who made VOCR!!! I love you!!!

Seriously, I don't know where I'd be without VOCR. I use it all the time for my 3D printing and the myriad of other apps that aren't accessible. Thank you so much for your work! Apple needs to hire you, put you in charge, and give you a gold hat.

By Chi Kim on Thursday, April 25, 2024 - 17:34

Thanks for the kind words! I'm glad that you find it useful!
This year I added a bunch of features to VOCR, so if you haven't tried the beta version, you should try it! :)
It has a new menu, real-time OCR, object detection, and AI image description through Ollama, OpenAI, etc.
Honestly, VoiceOver should have these features built in, so I don't have to work on them.
If anyone has a connection to the Apple accessibility team, please tell them about it. lol

By matt on Thursday, April 25, 2024 - 17:34

Massive newb question here. I want to try this out and can't install the models! I put VOLlama in the Applications folder, opened Terminal, and typed "ollama pull llama3", and it says command not found. Feeling a bit silly here, haha. Help!

By Ollie on Thursday, April 25, 2024 - 17:34

Yes, I'm all over the 2.0 beta. It's very slick. If you're happy for me to do so, I'll drop Apple accessibility a line. It's exactly what they should be doing; trouble is, anything like this is partially admitting that it's needed, i.e., their accessibility framework has gaping holes.

By Ollie on Thursday, April 25, 2024 - 17:34

Do you know of any way to get the higher-quality Siri voices in this? I'm not sure if there's a limitation on third-party apps accessing the built-in voices.

By mr grieves on Thursday, April 25, 2024 - 17:34

Thanks for clearing those things up. For some reason I never thought to use VOCR to read screenshots, but presumably I could ask it to take a grab and then interact with it. I think there are some options to view images in Finder, but mostly on my Mac they would be images in emails or Jira tickets so it's a little easier if I don't have to save it somewhere first. I'll give that a try next time, thanks. I know there is also AI integration with VOCR which I've still not really played with yet (but should do).

VOCR is one of those apps that I don't use all that often, but when I do it's such a godsend. So thanks very much for both apps.

By Chi Kim on Thursday, April 25, 2024 - 17:34

mr grieves, with the new VOCR beta, you don't need to save the screenshot to a file. You can do it right from the browser: move your VOCursor to the target image and run OCR on the VOCursor with Control+Command+Shift+V.
People are doing all kinds of things with this, like extracting text from videos or asking what's going on in a YouTube video.
With real-time OCR, you can even read live captions without rescanning over and over.

Ollie, unfortunately I haven't found a way to access the higher-quality voices in Python. Also, feel free to drop the Apple accessibility team a line. They already have Screen Recognition on iOS, so they just need to port it over to macOS!

By mr grieves on Thursday, April 25, 2024 - 17:34

Ah yes I did see something like that. I don't think at the time I quite appreciated what I was going to use it for.

I will definitely give that a try next time I come across some random image that I can't make sense of. Thanks very much for pointing it out. Oh and developing it too of course!

By matt on Thursday, April 25, 2024 - 17:34

Aaaaaa, yep. Definitely a newb, haha. For some reason I had it in my head that Terminal would install it since I had VOLlama installed, lol. Thank you!

By ming on Thursday, April 25, 2024 - 17:34

Hi!
Can someone send me the link for VOLlama?
I got the chat client, but it doesn't seem to work.

By ming on Thursday, April 25, 2024 - 17:34

Meanwhile...
If I am using an AMD computer that I bought in 2021, does it still work?

By matt on Thursday, May 2, 2024 - 17:34

So this is bloody cool! Do any of the models generate images? I couldn't see an obvious one in the list.

By Chi Kim on Thursday, May 2, 2024 - 17:34

For image description, Ollama supports two multimodal (vision language) models: Llava and Moondream, which was added yesterday.

By Chi Kim on Thursday, May 2, 2024 - 17:34

No image generation. Sorry, I didn't read your question carefully. Ollama doesn't support any models that can generate images.