Happy Friday!
I'm excited to announce the release of VOLlama v0.1.1, an open-source, accessible chat client for Ollama. It leverages open-source large language models to enable conversations that run locally, without an internet connection, for privacy.
Many user interfaces for open-source large language models are either inaccessible or annoying to use with a screen reader, so I decided to create one for myself and others. I hope that ML UI libraries like Streamlit and Gradio become more screen-reader friendly in the future, so that making apps like this is no longer necessary!
Running an open-source model locally requires a lot of computing power. I recommend at least 16GB of RAM and a Mac with an M1 chip or later.
However, it doesn't require much computing power if you just want to use OpenAI GPT models or Google Gemini models with API keys.
To install Ollama, you'll need to use the Terminal, but chatting does not require the terminal. The app is not notarized by Apple, so you need to allow it to open from System Settings > Privacy & Security. Unfortunately, the app takes a little while to launch, so you'll need to wait after opening it. I'm looking into improving the launch time.
It has various features, including generating image descriptions with a multimodal model like Llava and the ability to process and query long documents with its RAG feature. There are numerous settings available for power users as well. It also supports models from OpenAI and Google Gemini if you have an API key.
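For the technically curious: chat clients like this talk to Ollama's local HTTP API, which you can also call yourself. Here is a minimal sketch in Python, assuming Ollama is running on its default port and you've already pulled llama3 (the requests library and the prompt are my own example choices, not part of VOLlama):

```python
import requests

# Minimal sketch: one chat turn against a local Ollama server.
# Assumes `ollama serve` is running on the default port 11434
# and `ollama pull llama3` has already been done.
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3",
        "messages": [{"role": "user", "content": "Hello! Who are you?"}],
        "stream": False,  # ask for one complete reply instead of a token stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```

Nothing leaves your machine; the request goes to localhost, which is the whole privacy point.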
If it sounds interesting, please download VOLlama and follow the instructions.
I hope you enjoy it, and please spread the news!
Comments
This is very cool! Thank you for making this
I'm unfortunately on an Intel Mac, but I'll see if I can give it a spin.
Strange OpenAI model behavior
This is very neat! Thank you so much for creating it. Running the models locally seems to work as expected; however, I get this error whenever I try using GPT.
Is there something I'm not doing correctly?
To clarify, I've provided my API key.
Error code: 400 - {'error': {'message': 'max_tokens is too large: 8192. This model supports at most 4096 completion tokens, whereas you provided 8192.', 'type': None, 'param': 'max_tokens', 'code': None}}
Traceback (most recent call last):
  File "Model.py", line 162, in ask
  File "llama_index/core/llms/callbacks.py", line 150, in wrapped_gen
  File "llama_index/llms/openai/base.py", line 439, in gen
  File "openai/_utils/_utils.py", line 277, in wrapper
  File "openai/resources/chat/completions.py", line 581, in create
  File "openai/_base_client.py", line 1232, in post
  File "openai/_base_client.py", line 921, in request
  File "openai/_base_client.py", line 1012, in _request
openai.BadRequestError: Error code: 400 - {'error': {'message': 'max_tokens is too large: 8192. This model supports at most 4096 completion tokens, whereas you provided 8192.', 'type': None, 'param': 'max_tokens', 'code': None}}
disregard my last comment
I apologize. After playing around a bit, I was able to adjust the correct parameter to get GPT to work correctly.
Haven't tried VOLlama, but Ollama
And I thank the author for getting me interested in local AIs again. It actually works great in the terminal, and I love Ollama; it's so much simpler than the gist I was following back then, with a couple of compilation steps with make and weird stuff like that... Great post and great app!
Cool.
Does it work on a Windows PC?
How do we get the higher quality voices? All the options in the list seem to be compact.
I'm also having issues with OpenAI. I'm getting an error. I've got an API key in there; is that all I need to do, or are there further steps?
Works with Windows
Ming, it works with Windows as well.
Ollie, most likely you're using a GPT model that doesn't support the 8192 context length, which is the default for the llama3 model. Try going to Advanced menu > Generation parameters and setting num_ctx to 4096.
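For anyone curious what's happening underneath: that setting ends up being sent to OpenAI as max_tokens, and OpenAI rejects any request asking for more completion tokens than the model allows. A minimal sketch of the same constraint using the openai Python package directly (the model name and prompt are just examples):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# Many chat models cap completion tokens at 4096; requesting 8192
# produces exactly the 400 BadRequestError quoted above.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # example of a model with a 4096-token completion cap
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=4096,  # keep this at or below the model's completion limit
)
print(response.choices[0].message.content)
```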
That's got it, thank you.
Describing images
This app seems really interesting. Thanks very much for sharing it with us.
I noticed I can copy/paste an image into the text area (or use the attach command), but I am told that it can't describe images. If I use nomic-embed-text, then I get a 404 error. I did follow the instructions and installed both models. Is the second one responsible for images, or do I need something else?
I often get sent screenshots at work which I don't really want to send over the internet. I tend to use the OCR built into Smultron, which is sometimes helpful, but it would be great if I could query an AI about them locally.
For Image Description
For image description, you need to download a different model called llava. It has three different variants. The command "ollama pull llava" downloads the 7-billion-parameter model. Then there are llava:13b and llava:34b. The higher you go, the better the accuracy, but the model takes up more storage and computing power, and response speed gets slower.
Once you download it, choose the model from the toolbar inside VOLlama (or Command+L).
Then attach an image from the chat menu (or Command+I), and just ask a question like "Can you describe the image?"
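(If you ever want to do the same thing outside VOLlama, here's a rough sketch against Ollama's HTTP API; it assumes llava is already pulled, and "photo.png" is just an example path:)

```python
import base64
import requests

# Rough sketch: ask llava to describe a local image through Ollama's API.
# Assumes `ollama pull llava` has been done; "photo.png" is an example path.
with open("photo.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llava",
        "prompt": "Can you describe the image?",
        "images": [image_b64],  # vision models accept base64-encoded images
        "stream": False,
    },
    timeout=300,
)
print(resp.json()["response"])
```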
Also, it has very limited OCR capability. It's more for scene description.
If you need OCR, I recommend VOCR, another app I developed specifically to process screenshots with OCR.
https://github.com/chigkim/vocr/releases
Lastly, nomic-embed-text is an embedding model used by the RAG feature to process documents. You can't chat with an embedding model directly.
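(You can see why if you call one directly: an embedding model returns a vector of numbers rather than text. A quick sketch, again assuming the model is pulled:)

```python
import requests

# Quick sketch: embedding models return vectors, not chat replies,
# which is why trying to converse with one fails.
resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "A sentence to embed."},
)
vector = resp.json()["embedding"]
print(len(vector))  # the vector's dimensionality, used by RAG for similarity search
```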
Hope that helps.
Wait! You're the one who made VOCR!!! I love you!!!
Seriously, I don't know where I'd be without VOCR. I use it all the time for my 3D printing and the myriad of other apps that aren't accessible. Thank you so much for your work! Apple needs to hire you and put you in charge, and give you a gold hat.
Thanks for the kind words! I'm glad that you find it useful!
This year I added a bunch of features to VOCR, so if you haven't tried the beta version, you should try it! :)
It has a new menu, real-time OCR, object detection, AI image description through Ollama, OpenAI, etc.
Honestly, VoiceOver should have these features built in, so I don't have to work on them.
If anyone has a connection to the Apple accessibility team, please tell them about it. lol
Hope it can have new features that help us play games
I hope in the near future it can help us play games, or describe the menus and all that.
Massive newb question here. I want to try this out and can't install the models! I put VOLlama in the Applications folder, opened Terminal, and typed "ollama pull llama3", and it says command not found. Feeling a bit silly here, haha. Help!
You have to install Ollama itself first before you can use VOLlama; installing the VOLlama app alone doesn't give you the ollama command.
Yes, I'm all over the 2.0 beta. It's very slick. If you're happy for me to do so, I'll drop Apple accessibility a line; it's exactly what they should be doing. Trouble is, anything like this partially admits that it's needed, i.e., their accessibility framework has gaping holes.
Any way you know of getting the higher quality Siri voices in this? I'm not sure if there's a limitation on third-party apps accessing the built-in voices.
@Chi
Thanks for clearing those things up. For some reason I never thought to use VOCR to read screenshots, but presumably I could ask it to take a grab and then interact with it. I think there are some options to view images in Finder, but mostly on my Mac they would be images in emails or Jira tickets so it's a little easier if I don't have to save it somewhere first. I'll give that a try next time, thanks. I know there is also AI integration with VOCR which I've still not really played with yet (but should do).
VOCR is one of those apps that I don't use all that often, but when I do it's such a godsend. So thanks very much for both apps.
Mr Grieves, with the new beta VOCR, you don't need to save the screenshot to a file. You can just do it from the browser: move your VOCursor to the target image and run OCR on the VOCursor with Control+Command+Shift+V.
People are doing all kinds of things with this, like extracting text from video, asking what's going on in a YouTube video, etc.
With real-time OCR, you can even read live captions in real time without scanning over and over.
Ollie, unfortunately I haven't found a way to access the higher quality voices from Python. Also, feel free to drop the Apple accessibility team a line. They already have Screen Recognition on iOS, so they just need to port it over to macOS!
@Chi
Ah yes I did see something like that. I don't think at the time I quite appreciated what I was going to use it for.
I will definitely give that a try next time I come across some random image that I can't make sense of. Thanks very much for pointing it out. Oh and developing it too of course!
Aaaaaa, yep. Definitely a newb, haha. For some reason I had it in my head that Terminal would install it since I had VOLlama installed, lol. Thank you!
any plans for this to come to the iPhone
Any plans for this to come to the iPhone?
How can I get VOLlama
Hi!
Can someone send me the link for VOLlama ...
I just got the chat client, but it doesn't seem to work.
AMD computer in 2021
Meanwhile... if I'm using an AMD computer that I bought in 2021, does it still work?
I see that this is made in Python. Any chance we might see ElevenLabs support?
You could use this ElevenLabs wrapper:
https://github.com/lugia19/elevenlabslib
Image generation
So this is bloody cool! Do any of the models generate images? I couldn't see an obvious one in the list.
For image description, Ollama supports two multimodal (vision-language) models: Llava and Moondream, which was added yesterday.
No image generation. Sorry, I didn't read it carefully. Ollama doesn't support any model that can generate images.