My goal was to build a local database that could:
- Ingest my ~4GB Bugzilla database
- Answer questions or give advice on new bugs based on historical ones
- Run offline on my openSUSE Tumbleweed machine, which is equipped with 64GB RAM and an AMD Ryzen 7 PRO 7840U
Naturally, my first idea was to build a standalone LLM like GPT. But fine-tuning an LLM on custom data is resource-intensive—a massive understatement. When I started to fine-tune an LLM on my laptop, I let the process run for a full week, and it reached only 1%. Using cloud-based services or investing in powerful new hardware were not options. Also, the problem with standalone LLMs is that they may hallucinate or generate inaccurate information, especially on domain-specific topics. The other disadvantage of using LLMs is that they are static; once trained, they don’t know anything that happened afterward.
[Read More]