My goal was to build a local database that could:
- Ingest my ~4GB Bugzilla database
- Answer questions or give advice on new bugs based on historical ones
- Run offline on my openSUSE Tumbleweed machine, which is equipped with 64GB RAM and an AMD Ryzen 7 PRO 7840U
Naturally, my first idea was to build a standalone LLM like GPT. But fine-tuning an LLM on custom data is resource-intensive, and that is a massive understatement: when I tried fine-tuning one on my laptop, I let the process run for a full week, and it reached only 1% completion. Using cloud-based services or investing in powerful new hardware was not an option. Standalone LLMs also have the problem that they may hallucinate or generate inaccurate information, especially on domain-specific topics. Their other disadvantage is that they are static: once trained, they know nothing that happened afterward.
I knew something was in the air when my favorite fishmonger, an incredibly friendly and always super-helpful fellow, asked me what I thought about that Chat.GPT thingy. That doesn’t happen often. Our neighbor, a fantastic craftsman, never stopped me in the ’90s to ask what I thought about object-oriented programming. Neither did the ladies in our university’s canteen ask us how POSIX threads would impact software engineering. Sure, the significance of these examples is arguable. Still, my point is that a massive paradigm shift in our beloved profession rarely breaks through to the general public.