Building a Local Bugzilla RAG System

A guide to building a local RAG system for Bugzilla data using Ollama and ChromaDB.

My goal was to build a local database that could:

  • Ingest my ~4GB Bugzilla database
  • Answer questions or give advice on new bugs based on historical ones
  • Run offline on my openSUSE Tumbleweed machine, which is equipped with 64GB RAM and an AMD Ryzen 7 PRO 7840U

Naturally, my first idea was to build a standalone LLM like GPT. But fine-tuning an LLM on custom data is resource-intensive—a massive understatement. When I started to fine-tune an LLM on my laptop, I let the process run for a full week, and it reached only 1%. Using cloud-based services or investing in powerful new hardware were not options. Also, the problem with standalone LLMs is that they may hallucinate or generate inaccurate information, especially on domain-specific topics. The other disadvantage of using LLMs is that they are static; once trained, they don’t know anything that happened afterward.

[Read More]

Cross build and packaging

It compiles! Ship it!

Introduction

Let’s start by clarifying what we mean by cross-building and cross-packaging. Cross-compilation is the process of compiling source code on one platform, called the host, in order to generate an executable binary for a different target platform. The emphasis here is on the word “different”. The target platform may have a different CPU architecture, such as when we work on an x86 computer and want to build software for a Raspberry Pi board with an ARM CPU. But even if the target platform has the same CPU architecture as the host, there may be several other possible differences. For example, the host may be running Debian Sid, while the target may be running openSUSE Leap. Different Linux distributions may have different compilers, linkers, and run-time libraries. Even when using the same distribution as the host for the target, they may be different releases, such as openSUSE Tumbleweed and Leap. In short, nothing guarantees that the target system will have the same shared libraries as the host system.

[Read More]

Reverse dependencies

Dependencies are like pets, they bring joy but also require constant attention.

To start, let’s sort out what a dependency is and what a reverse dependency is.

Dependencies and reverse dependencies in Linux distributions are important concepts to understand. A package dependency means that one package relies on another in order to function. For example, if package B requires package A to be installed in order to work, then package B is dependent on package A and is considered a reverse dependency of package A.
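The relationship can be illustrated with a minimal sketch. It assumes the dependencies are already available as a plain mapping from each package to the packages it requires; the package names A, B and C and the helper reverse_dependencies() are hypothetical and are not taken from any real package manager.

```python
from collections import defaultdict


def reverse_dependencies(requires):
    """Invert a 'package -> packages it requires' mapping.

    The result maps each package to the sorted list of packages
    that require it, i.e. its reverse dependencies.
    """
    reverse = defaultdict(set)
    for package, deps in requires.items():
        for dep in deps:
            reverse[dep].add(package)
    return {dep: sorted(pkgs) for dep, pkgs in reverse.items()}


# Hypothetical example data: B requires A, C requires both A and B.
requires = {
    "B": ["A"],
    "C": ["A", "B"],
}

rdeps = reverse_dependencies(requires)
print("Reverse dependencies of A:", rdeps["A"])
```

In this toy data B and C both show up as reverse dependencies of A, which matches the definition in the paragraph above.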

[Read More]

Build system statistics

Okay, let's start with a boring cliché...

From time to time we should ask ourselves how we are doing. Are we successful, are we on the right track, are we heading in the right direction, are we fast enough, are we accelerating or slowing down?

This time I am talking about the openSUSE Linux Distribution and about the SUSE Linux Enterprise Server.

Here I would like to quickly add an important disclaimer, along with a short story.

[Read More]

Checking changelogs with zypper

The future you see is the future you get.

I have heard this question way too often from Linux users, and especially from SUSE Linux users: “How can I check the changelog of a package, or of a new version of a package, that is available in the repository but not yet installed?”

There was no easy answer to that question, so I decided to write a little tool for it.

How it is done

Every enabled repository has a bunch of configuration files in a well-structured directory tree under /var/cache/zypp/raw/.
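To get a feel for that tree, a small sketch like the following can list what is stored there. It is only an illustration and not the tool described in this post: the helper name list_repo_files() is hypothetical, and it assumes that each repository has its own subdirectory directly under /var/cache/zypp/raw/.

```python
import os

RAW_CACHE = "/var/cache/zypp/raw"  # path named in the text above


def list_repo_files(raw_cache=RAW_CACHE):
    """Map each subdirectory of raw_cache to the files found below it.

    Assumption: one subdirectory per repository. Returns an empty
    dict if raw_cache does not exist (e.g. on a non-zypper system).
    """
    repos = {}
    if not os.path.isdir(raw_cache):
        return repos
    for repo in sorted(os.listdir(raw_cache)):
        repo_dir = os.path.join(raw_cache, repo)
        if not os.path.isdir(repo_dir):
            continue  # ignore stray files at the top level
        files = []
        for root, _dirs, names in os.walk(repo_dir):
            for name in names:
                full_path = os.path.join(root, name)
                files.append(os.path.relpath(full_path, repo_dir))
        repos[repo] = sorted(files)
    return repos


if __name__ == "__main__":
    for repo, files in list_repo_files().items():
        print(f"{repo}: {len(files)} files")
        for path in files[:5]:  # show only the first few entries
            print(f"  {path}")
```

Running it on a machine with enabled repositories prints each repository directory followed by a few of its cached files, which is enough to see how the metadata is laid out.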

[Read More]

Data visualization with Grafana and Telegraf

There are decades where nothing happens; and there are weeks where decades happen.

It all started when…

A few weeks ago we decided to create a dashboard where we can monitor the status of the SUSE Linux Enterprise maintenance update queue. Naturally, there are tons of cool open source solutions for building this type of monitoring. Two decades ago I probably would have written a Perl or Python script for the monitoring part, used good old gnuplot (http://www.gnuplot.info/) to visualize the data, and created an active page written in some silly web UI framework. Let’s just say that luckily those times have passed.

[Read More]

Contributing to SLE/openSUSE

What is the path of an upstream fix to a given codestream

The motivation for this post is to demonstrate how easy and logical the workflow is for bringing an upstream change in a project to a given SUSE Linux codestream. I try to write this post in a codestream-agnostic way. In my experience, the workflow from the package maintainer’s point of view is the same for SUSE:SLE-15:Update and for openSUSE:Factory.

What I want to do

It all starts with a Bugzilla case. For the sake of this exercise I will walk through the process with this bug report (https://bugzilla.suse.com/show_bug.cgi?id=1195126). I use this case because it was a fairly simple, straightforward issue: CVE-2022-0351, vim: uncontrolled recursion in eval7(). CVE stands for Common Vulnerabilities and Exposures, which means that somebody has found and published an information-security vulnerability. By classification it is an important issue, and as a package maintainer it is not my role to re-evaluate whether the issue represents a serious threat or not. My goal is to figure out if I can reproduce the issue and if I can find a fix for it.

[Read More]

Playing with Shelly

Finally I can turn on the lights from CLI

For Xmas I got a few Shelly lamps to play with. Shelly lamps are simple IoT devices, super easy to install, configure and use. YouTube is full of instructions on what can be done with these smart lamps. Naturally, my main motivation was to figure out how to hack these devices and how ready my openSUSE servers are with tools and services (spoiler: they are ready).

Look daddy no cloud

Needless to say, like most smart home automation devices, the Shelly lamps can be operated via the Shelly cloud. I may cover that area in the next post. But now I am interested in what can be done without the cloud. After all, one big selling point of the Shelly devices is that they are fully operable and functional even without an Internet connection, just on a WiFi LAN. It means that if I am concerned about the security of my home infrastructure, I have the option not to expose my smart devices.

[Read More]

Measuring web traffic with Matomo

You get what you measure

Matomo is an open source PHP/MySQL-based web analytics application that tracks online visits to websites and displays reports on these visits. It does what Google Analytics does, but it is open source. Matomo has a commercial cloud-based offering for those who do not want to host their own instance, but the code is there on GitHub (https://github.com/matomo-org/matomo) for anyone who is interested.

I decided to first test drive the cloud based solution and then install my own instance. 

[Read More]

Private cloud based on openSUSE Leap 15.3 beta and Nextcloud

Be yourself. Unless you can be a unicorn. In that case, you should always be a unicorn.

Motivation

I used to have a Synology DS414 server that worked well for about 8 years. Naturally, I occasionally had to change disks in its RAID5 system, but other than that it did its job. But regardless of the really smooth user experience and the low maintenance needs, I never really liked that system, as the Synology Disk Station Manager OS is not like many “real” Linux distributions and the community behind that OS is basically nonexistent. And to be honest, I do not really feel that Synology is very eager to build and maintain a community around their OS. It looks more like they just barely comply with the GPL. All in all, I had just enough motivation to migrate my private cloud and NAS to a proper OS.

[Read More]

Setup a Blog With Github Pages and Hugo

Always be yourself. Except if you can be Batman, then be Batman!

GitHub Pages is super powerful and very easy to use for creating Markdown-based static websites.

In this post I will walk through how I made this very page.

My setup will be two GitHub repositories, one for the source of the page (https://github.com/bzoltan1/blog-source) and the other where the HTML artifacts are deployed (https://github.com/bzoltan1/bzoltan1.github.io).

Here I would like to note that it is possible to use a single repository with two branches, one for holding the source and the other where the website is deployed. I just personally find the two-repository setup more elegant, without any particular reason.

[Read More]

Telegram Bridge

Who reads system logs anyway?

Motivation

I got lucky with my original hackweek project and I managed to set up my Leap 15.3 based NAS and private cloud running on Nextcloud earlier than planned.

So I thought that as an extra project I would set up a proper system monitoring service. The monit service is very handy (thanks to Paolo Stivanin for the idea), but by default it wants to send emails when something goes wrong. Instead of emails I would prefer a real instant message. I mostly use Telegram for personal purposes. Sure, I am using Signal, Matrix, Slack and Rocket.Chat too, and technically I have a WhatsApp account as well. But I decided to start with Telegram.

[Read More]