Self-hosting for the first time

This website runs here!

I've had my PC for a few months now, and when I got it, I had planned to do something with it - after all, it has a pretty decent graphics card (a 4070) and a good amount of memory/storage. I finally decided to do something about that by hosting a couple of services, and along the way, learned how to set up my own domain and server pretty quickly.

What can you do with a graphics card? Well, you can run an LLM (you don't even need a graphics card, but it certainly helps a lot!). After looking for a while, I stumbled across the amazing ollama, a lightweight server that lets you pull and run models quite quickly. I'd used this quite briefly on my Mac to run a few <10b models, so I thought it would be a breeze to get it set up on Windows.
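If you haven't seen ollama before, the workflow is about as simple as it gets - here's roughly what it looks like (the model name is just an example; pick whatever fits your VRAM):

```bash
# Pull a model from the ollama registry (llama3 is just an example tag)
ollama pull llama3

# Chat with it interactively in the terminal
ollama run llama3

# ollama also serves an HTTP API, on localhost:11434 by default
curl http://localhost:11434/api/generate -d '{"model": "llama3", "prompt": "Why is the sky blue?"}'
```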

I ran their Windows installer and all was well, until I realized I'd actually much rather use their Linux installation under WSL. After "uninstalling" it from Windows, running a few PowerShell commands I didn't really understand to stop it, and reinstalling it on WSL, all was well.
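For reference, the Linux install (which works fine under WSL) is a one-liner - this is ollama's official install script as of writing, so check their docs for the current command:

```bash
# ollama's official Linux install script, run from inside WSL
curl -fsSL https://ollama.com/install.sh | sh

# Start the server if the installer didn't set it up as a service
ollama serve
```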

Issues

Yeah, all wasn't well. Over two months later, I realized I hadn't really uninstalled it when three separate sets of installed models started popping up occasionally. To this day, I still don't understand what went wrong, but a clean install seemed to fix it. Moral of the story: save what you care about, and be prepared to do clean wipes.

Besides the random issues I had, it worked remarkably well. They also have a full Python SDK that I've used in a few projects, including a Discord bot I wrote a while back. I'll continue to reach for ollama whenever I need local AI.
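To give a flavor of it, here's a minimal sketch of the kind of call my bot made, using the `ollama` Python package - the model name is again just an example, and it assumes the ollama server is already running:

```python
# pip install ollama
import ollama

# Send a single chat message to a locally-running model.
# Assumes the server is up and the model has already been pulled.
response = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "Summarize this for me: ..."}],
)

print(response["message"]["content"])
```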

Of course, I'd also like more than just an API - I need a chat interface! That's when I stumbled upon Open WebUI, a fantastic project that has only gotten better in the past few months. With native ollama support, as well as support for cloud providers, it works great as a full ChatGPT replacement. It's being actively maintained and has gotten an update since I started writing this post. Speaking of cloud providers, Groq is pretty amazing - their prices are the lowest and their speeds the fastest of any provider I've checked so far, so I'd highly encourage checking them out if you aren't running local.
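If you want to try Open WebUI yourself, their README boils the setup down to a single docker command - this is roughly it (the port mapping and volume name follow their docs at the time of writing, so double-check the current ones):

```bash
# Run Open WebUI in docker; the app listens on 8080 inside the
# container, mapped here to localhost:3000 on the host
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
```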

It was pretty easy to connect ollama to Open WebUI, though it did take me a few hours to give up on trying to expose just a single port while still letting the container reach ollama - I just ran it on the host network instead.
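For the curious, the host-network version I ended up with looks something like this (the OLLAMA_BASE_URL variable is from Open WebUI's docs; verify against the current ones) - it's also why the app lands directly on port 8080:

```bash
# Host networking: the container shares the host's network stack, so
# Open WebUI's internal port 8080 becomes localhost:8080, and it can
# reach ollama on localhost:11434 without any extra port mapping
docker run -d --network=host \
  -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
```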

After it all, I had a web app running on my machine at localhost:8080. Perfect! Now all I needed was for it to work on my own website. That's where Cloudflare Tunnels come in.

With a domain's DNS configured on Cloudflare, you can install their cloudflared CLI tool to manage tunnels. Cloudflare also lets you expose plain HTTP services and puts a TLS certificate in front of them, so connections are properly signed and encrypted by the time they reach the end user.
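If you go the CLI route, the basic flow looks something like this (the tunnel name and hostname are placeholders, obviously):

```bash
# Authenticate cloudflared against your Cloudflare account
cloudflared tunnel login

# Create a named tunnel (the name is just an example)
cloudflared tunnel create my-tunnel

# Point a DNS record on your domain at the tunnel
cloudflared tunnel route dns my-tunnel chat.example.com

# Run it; which local service it forwards to comes from the ingress
# rules in ~/.cloudflared/config.yml
cloudflared tunnel run my-tunnel
```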

What you should actually do

Don't run cloudflared directly on your host - instead, run their docker image with a token provided through Cloudflare Zero Trust. This also lets you manage the tunnel from their Zero Trust dashboard, so you can change the config much more easily. I'm not entirely sure what Zero Trust is as a whole, but I use it just for tunnels and it works, so I'm not going to ask.
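With the docker setup, the whole thing collapses into a single command - the token comes from the Zero Trust dashboard when you create the tunnel there, and all the routing config lives in the dashboard instead of a local file:

```bash
# Run cloudflared in docker with a tunnel token from Zero Trust.
# On Linux, --network=host lets the tunnel reach services bound to
# localhost (like Open WebUI on port 8080).
docker run -d --restart always --network=host \
  cloudflare/cloudflared:latest \
  tunnel --no-autoupdate run --token <YOUR_TUNNEL_TOKEN>
```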

With all that set up, and after hours and hours of debugging an issue that shouldn't have been that hard to fix, I finally did it! You can check out the sweet login page here. No more ChatGPT for me - I'm a self-hosted guy now 😎

I'll be writing another post later about setting up my other self-hosted services with docker-compose (including Mastodon), so stay tuned!