Navigating the AI Jungle - Chat Bots
We are overrun by AI tools and services today. A simple guide
There are so many AI-powered applications today that you can quickly get lost. You can easily spend the whole day trying out different tools and getting nothing done.
I figured I would do my part in helping you navigate the AI jungle by sharing my experience with different AI tools and services. I will tell you which ones I like and which ones I like less. This is personal experience, not some kind of corporate summary pieced together from webpage blurbs.
Chat Bots
ChatGPT, built on a large language model (LLM), is the product that started the AI hype. Let us compare the alternatives.
ChatGPT
I have tried various alternatives, but I feel OpenAI makes an overall solid product. It is not just that their model gives good answers; they also have a polished UI.
For instance, while DeepSeek works really well and is cheaper, what keeps me with OpenAI is the voice chat. I really love being able to just walk around and talk to the AI. When I take walks outside, I might simply put on my headphones and have a conversation with ChatGPT to learn about new things. Learning by asking questions works very well for me.
There are also many small details that work well, such as automatically giving a sensible title to each of your sessions for easy retrieval later.
And now they have a special view for working with code. Again, very well-thought-out UI. In my opinion, so much of the benefits you will gain from AI services will be down to how well done their UI is rather than the underlying engine.
OpenAI GPT
OpenAI, the creator of ChatGPT, offers alternative ways to use their AI beyond the ChatGPT interface most of you are familiar with. While ChatGPT itself has a monthly fee, with the API you pay per request. This interface is aimed more at developers than end users, but it is user-friendly enough that I would guess most of you reading my articles will have no problem using it.
There is a little more setup here in that you need to create one or more projects and API keys. Projects are just a way to keep track of spending. Each generated API key is associated with a project, and any requests made using that API key will be billed to the associated project.
To give a concrete example of how this might work: my iTerm2 terminal program on Mac offers AI integration. For it to work you need to provide it with an OpenAI API key. I made a "Terminal Apps" project for all my Unix-terminal-related API keys. So whenever I use a terminal application with AI, that usage is billed to my "Terminal Apps" project.
So the primary use of this interface is for people who want to use OpenAI with other applications. But you also get access to features not found in regular ChatGPT, such as voice synthesis or models that fetch data from the internet in real time.
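To make the key/project billing model concrete, here is a minimal sketch of what a request to OpenAI's chat completions endpoint looks like. It only builds the request (nothing is sent over the network), uses just the Python standard library, and the model name and prompt are illustrative placeholders:

```python
import json
import os

def build_chat_request(prompt, model="gpt-4o-mini"):
    """Build URL, headers, and JSON body for a chat completions call."""
    # Whatever project the API key belongs to is the project that
    # gets billed for this request.
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + os.environ.get("OPENAI_API_KEY", ""),
    }
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return ("https://api.openai.com/v1/chat/completions",
            headers, json.dumps(payload))

url, headers, body = build_chat_request("What does ssh -L do?")
```

From here you would POST the body to that URL with those headers, for example with urllib.request, or simply let a client library or an application like iTerm2 handle this for you.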
DeepSeek
DeepSeek caused quite a stir with the release of their R1 reasoning model, as it could match OpenAI's o1 model while using far less processing power, making it much cheaper.
I want to cover DeepSeek in more depth in a separate article, as there is too much to talk about. The main attraction of DeepSeek is that you can run it locally on your own computer. To be specific, not the full-blown version, but smaller distilled models. When a large AI model is made, a series of reduced-size models is often made as well. These consume less VRAM on your graphics card, but are not as "intelligent".
ApX has an overview of system requirements for the different R1 models, which is worth reading. These local models will not be quite as good as running ChatGPT in the cloud, but they are surprisingly good for their size.
My first mistake in using DeepSeek was thinking that the reasoning model R1 is good for everything. If you are writing code, for instance, it is better to use their coder model. And if you aren't trying to solve mathematical and scientific problems, then the V2 or V3 models will work better.
To install and run DeepSeek you typically use a piece of software called ollama.
Ollama
Ollama is a great tool for simply downloading and running AI models. It is not limited to downloading and running DeepSeek models; you can use it to download other models as well.
For instance, to download the DeepSeek model suited for writing code, I run:
> ollama pull deepseek-coder:6.7b
Of course, you need to have ollama installed first. Here you can see a list of the DeepSeek models I have downloaded and used.
> ollama list
NAME                  ID              SIZE    MODIFIED
deepseek-v2:16b       7c8c332f2df7    8.9 GB  36 hours ago
deepseek-coder:6.7b   ce298d984115    3.8 GB  37 hours ago
deepseek-r1:7b        0a8c26691023    4.7 GB  44 hours ago
deepseek-r1:1.5b      a42b25d8c10a    1.1 GB  45 hours ago
The 1.5B means 1.5 billion parameters in the model. That is the smallest model, and I found it rather useless. The 7B model is quite useful, however. Notice I use a V2 model instead of a V3 model. That is because I couldn't find V3 models small enough to fit into my 12 GB of VRAM.
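As a rough back-of-envelope for whether a model fits in your VRAM, you can multiply the parameter count by the bits stored per parameter. The 4.5 bits/parameter figure below is my assumption, in the ballpark of common 4-bit quantizations; the files ollama actually downloads carry extra metadata, so real sizes will differ somewhat:

```python
def model_size_gb(billions_of_params, bits_per_param=4.5):
    """Rough size of a quantized model in decimal gigabytes."""
    total_bits = billions_of_params * 1e9 * bits_per_param
    return total_bits / 8 / 1e9

# A 7B model at ~4.5 bits/param needs roughly 4 GB, so it fits
# comfortably in 12 GB of VRAM...
print(f"7B:  ~{model_size_gb(7):.1f} GB")
# ...while a 16B model at the same quantization needs roughly 9 GB.
print(f"16B: ~{model_size_gb(16):.1f} GB")
```

This also explains why a small V3 model was hard to find for a 12 GB card: the full-size models simply have too many parameters at any reasonable quantization.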
What I loved about my ollama experience is that I did not need any kind of GUI access. The Ubuntu 22.04 Linux box I run DeepSeek on is headless. I use ssh from my MacMini to communicate with it.
You can run a DeepSeek model from the command line like this:
> ollama run deepseek-r1:7b
However, that gives a rather primitive CLI interface. Fortunately, there is a very fancy web interface called Open WebUI which works with ollama.
Open WebUI
Open WebUI was actually a bit of a hassle for me to install, because it is a Python application requiring Python 3.11, while my Ubuntu 22.04 runs Python 3.10 by default.
Fortunately, ChatGPT was quite good at guiding me through that process, even though it screwed up a number of times.
My steps for getting Open WebUI to run on my Ubuntu 22.04 were something like this:
> sudo apt install python3.11
> sudo apt install python3.11-distutils
> curl -sS https://bootstrap.pypa.io/get-pip.py | python3.11
> python3.11 -m pip install open-webui
> open-webui serve --disable-auth
The --disable-auth flag just disables authentication, which is fine since you run the web app as a single user.
Open WebUI automatically finds ollama and runs the default model; no setup is required on your part. It gives you an interface very similar to ChatGPT. Since it runs as a web server, I was able to use DeepSeek from a browser on my Mac, which simply sent requests to the web server running on my Ubuntu 22.04 box.
There can be some hassle running a server over HTTP as opposed to HTTPS. To save yourself that hassle, you might want to use SSH tunneling.
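Since you already reach the headless box over ssh, a simple option is a local port forward. Open WebUI listens on port 8080 by default (check your own setup), and the user and host names below are placeholders:

```shell
# Forward local port 8080 to port 8080 on the headless box. While this
# runs, the browser on the Mac can open http://localhost:8080 and the
# traffic travels through the encrypted SSH tunnel.
ssh -L 8080:localhost:8080 user@ubuntu-box
```

This way the browser only ever talks to localhost, sidestepping the plain-HTTP warnings entirely.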
Google Gemini
Gemini is Google's alternative to OpenAI's ChatGPT. In my experience, it is quite good. But I didn't find it interesting enough to use, for a simple reason: it does not have as polished and well-developed a UI as ChatGPT. There is no nice voice chat, and a number of other features seem to be missing, such as the ability to upload a variety of file types.
Perplexity
Perplexity is not an AI engine but an app which uses other AI engines such as GPT, Gemini, DeepSeek, and similar. Its focus is on providing answers with sources, e.g. telling you which web page a particular piece of information was found on, to counter the problem of hallucinations.
However, I see some issues with Perplexity. ChatGPT already has the ability to give you sources, and DeepSeek can as well. Hence Perplexity does not add that much extra value. In fact, its interface is not as good as ChatGPT's; its voice interface is terrible, for instance.
You should check out Perplexity to see if it is for you, but I didn't find it all that useful.
AI Terminals
Remembering exact Unix commands can be tricky, so there is an obvious advantage in a natural-language interface to the Unix command line.
iTerm2
I have used iTerm2 for many years and absolutely love it. My favorite feature is the ability to command-click an error message and have a code editor automatically open and jump to the relevant line. Very useful for those of us who are not big fans of IDEs, but prefer using a terminal and code editor for writing code.
Anyway, it meant I was excited to see iTerm2 add some AI support. However, this requires signing up for the OpenAI GPT API. Not a big deal, but you need to set up a project, generate an API key, and so on.
That would be fine if it gave you a lot of benefits, but it doesn't. I found iTerm2's AI capabilities weak relative to Warp's. My judgement: just don't bother.
Warp
Warp works on macOS and Linux. This isn't merely a standard console with some AI capabilities added; it is a complete rethink of how a terminal application should work. I am not very experienced with it yet, but thus far I have had a very positive experience. Even without AI, Warp seems like the future.
It does lack some polish, though. I use the fish shell, and that caused me a lot of hassle: logging into their service didn't work, and I could not even see my console window. On the other hand, I am used to fish causing trouble; it is a very non-standard shell, so it often requires special handling. Once I switched to zsh, my problems went away.
Many of the features I love about fish are already built into Warp, so it wasn't much of a problem switching back to zsh.
So despite my initial hassle, I would say Warp is a no-brainer. It is a beautifully designed, user-friendly, fast, and responsive terminal app that blends in the AI features naturally.
I intended to show AI apps and services across multiple domains: audio, video, text, and images, but found that made the article too large. So stay tuned for the next one, covering audio.
Navigating the AI Jungle – Audio
We are surrounded by AI tools and services today, and the audio space is no exception. There are various AI-powered tools available for transcribing speech into text, synthesizing audio from text, and even generating sound effects based on descriptions. Just as you can ask an AI to create an image from a text prompt, you can now describe a sound, and AI will generate it.