The Idea
After building RAG systems for Bugzilla and Redmine data, the next obvious candidate was sitting right there on every Linux machine: the system journal. Instead of grepping through journalctl output or staring at walls of log text, why not just ask plain English questions like “What went wrong last night?” and get a useful answer?
The constraints were the same as always: fully local, no cloud services, no API costs, runs on my openSUSE machines.
The Stack
Nothing exotic here; it is the same stack that worked before:
- journalctl to fetch logs from systemd journal
- ChromaDB as local vector database for storing and searching log chunks
- nomic-embed-text via Ollama for generating embeddings
- qwen2.5:7b via Ollama for interpreting the retrieved log context
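All four pieces talk over local interfaces. As a minimal sketch of the embedding step, here is one way to call Ollama's local HTTP API; the `/api/embeddings` endpoint and the `{"model", "prompt"}` payload are Ollama's documented interface, while the helper names are my assumptions, not the real logai.py code:

```python
# Sketch: embedding one log chunk via Ollama's local HTTP API.
# Assumes Ollama is running on its default port with nomic-embed-text pulled.
import json
import urllib.request

def build_payload(text, model="nomic-embed-text"):
    """Request body for one embedding call."""
    return {"model": model, "prompt": text}

def embed(text, host="http://localhost:11434"):
    """Return the embedding vector for one chunk of log text."""
    data = json.dumps(build_payload(text)).encode()
    req = urllib.request.Request(
        f"{host}/api/embeddings", data=data,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["embedding"]
```

The response is a JSON object with an `embedding` field, which goes straight into ChromaDB.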
How It Works
The tool has two responsibilities: indexing and querying. These started as two separate scripts and eventually got merged into a single CLI tool logai.py with subcommands.
Indexing fetches new journal entries since the last run, splits them into small overlapping chunks, embeds each chunk, and stores it in ChromaDB along with a Unix timestamp. A simple state file tracks the last indexed timestamp so only new entries are fetched on each run.
$ python3 logai.py index
Loaded 6113 log lines since the beginning
Created 1232 chunks for embedding
Indexed 1229 chunks
Indexing complete
State saved. Next run will start from 2026-03-16T07:30:39+02:00
The incremental indexing works well in practice:
$ python3 logai.py index
Loaded 155 log lines since 2026-03-15T18:46:41+02:00
Created 31 chunks for embedding
Indexed 31 chunks
Indexing complete
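The indexing loop described above can be sketched roughly like this; the state-file path, the chunk size, and the helper names are my assumptions, not necessarily what the real logai.py does:

```python
# Sketch of incremental indexing: read state, fetch only new journal
# lines, split them into small overlapping chunks.
import json
import os
import subprocess

STATE_FILE = os.path.expanduser("~/.logai_state.json")  # assumed location

def load_since():
    """Return the last indexed timestamp, or None on the first run."""
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f).get("last_indexed")
    return None

def fetch_lines(since):
    """Fetch journal lines newer than `since` via journalctl."""
    cmd = ["journalctl", "-o", "short-iso", "--no-pager"]
    if since:
        cmd += ["--since", since]
    return subprocess.run(cmd, capture_output=True, text=True).stdout.splitlines()

def chunk_lines(lines, size=5, overlap=1):
    """Split log lines into small overlapping chunks for embedding."""
    chunks, step = [], size - overlap
    for i in range(0, max(len(lines) - overlap, 1), step):
        chunk = lines[i:i + size]
        if chunk:
            chunks.append("\n".join(chunk))
    return chunks
```

The overlap means a log line near a chunk boundary still appears with some of its neighbours, which helps retrieval when an error spans several lines.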
Querying takes a natural language question, uses the LLM to extract any time reference from it (“yesterday”, “last 2 hours”, “this morning”), converts that to a Unix epoch range, and passes the range to ChromaDB as a pre-filter before the semantic search runs. So “What errors happened last night?” only searches chunks from last night, not the entire history.
$ python3 logai.py query "What serious issues happened in the last 12 hours?"
Time filter applied: 2026-03-15T08:36:10 -> 2026-03-16T08:36:10
Retrieved log context:
[...]
Explanation:
[HIGH] wireplumber: failed to get status for PID - check if the process is still running
[HIGH] systemd: unit NetworkManager-wait-online.service timed out - review network config
No other high severity issues found.
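The time-window pre-filter itself is small. A sketch, assuming the metadata field is called `ts` and using ChromaDB's documented `$and`/`$gte`/`$lte` filter operators:

```python
# Sketch: turning "the last N hours" into a ChromaDB metadata pre-filter.
from datetime import datetime, timedelta

def time_window(hours):
    """Return (start, end) as Unix epoch ints for 'the last N hours'."""
    end = datetime.now()
    start = end - timedelta(hours=hours)
    return int(start.timestamp()), int(end.timestamp())

def build_where(start_epoch, end_epoch):
    """ChromaDB pre-filter; $gte/$lte require numeric values."""
    return {"$and": [{"ts": {"$gte": start_epoch}},
                     {"ts": {"$lte": end_epoch}}]}

# Usage (assumed collection name and n_results):
# collection.query(query_texts=[question], n_results=10,
#                  where=build_where(*time_window(12)))
```

Only chunks whose stored epoch falls inside the window are considered by the semantic search, which is why the answers stay on-topic for time-scoped questions.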
A Few Things That Needed Fixing Along the Way
It was not entirely smooth sailing.
The most important architectural fix was the incremental indexing. The original design deleted and recreated the ChromaDB collection on every run, which defeated the whole purpose of tracking the last timestamp. The fix was a simple get-or-create pattern: reuse the existing collection and just append the new chunks.
ChromaDB’s comparison operators ($gte, $lte) only accept numeric values, not strings. So the timestamps had to be stored and compared as Unix epoch integers, not ISO format strings. This one I did not expect.
journalctl occasionally injects marker lines starting with -- between real log entries, like -- Logs begin at .... These have no timestamp and crashed the embedding step. Filtering them out at read time was the obvious fix, but first I had to figure out what was happening, which took some head-scratching.
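The filter is a one-liner once you know the markers are there. A sketch, using the -- Reboot -- marker as the sample (the helper name is mine):

```python
# Sketch: drop journalctl marker lines before chunking and embedding.
def is_real_entry(line: str) -> bool:
    """Reject blank lines and journalctl markers, which carry no timestamp."""
    stripped = line.strip()
    return bool(stripped) and not stripped.startswith("--")

raw_lines = [
    "2026-03-16T07:30:39+02:00 host systemd[1]: Started Foo.",
    "-- Reboot --",
    "",
]
lines = [l for l in raw_lines if is_real_entry(l)]
```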
The Result
The final tool is a single script with two subcommands. Questions with time references get automatically filtered. Questions without them search the full index.
Some questions that work well:
- "What serious errors happened in the last 2 hours?"
- "Did any services fail to start this morning?"
- "My system felt slow yesterday afternoon, what was happening?"
- "Were there any authentication failures today?"
The code is here: github.com/bzoltan1/logai
What About Making It an MCP Server?
At some point during development the idea came up: would it make sense to expose this as an MCP server? The indexer and the query tool map cleanly to two MCP tools, and in theory it would allow calling them from any MCP-compatible client.
After thinking it through, the answer was no, at least not for this use case. MCP makes the most sense when multiple different clients need to share the same tools, or when you want to expose functionality to an AI assistant. Here the only client is a human sitting at a terminal. Wrapping two simple scripts in a server process, a protocol layer, and a client would add real complexity without adding any real value.
The better move was actually the opposite: instead of expanding outward into a server architecture, collapse inward and merge the two scripts into a single CLI tool with subcommands. Same result, less infrastructure, simpler code.
Not every tool needs to become a service. I have to remind myself this sometimes.
Conclusion
This turned out to be the most immediately useful of the three RAG experiments. Bugzilla and Redmine data is historical and relatively static. System logs are live, local, and personally relevant. The incremental indexing means the tool stays current with minimal overhead, and the time-aware querying makes the answers actually relevant instead of pulling in old unrelated entries.
The fundamental limitation is the same as always: the quality of the answers depends on what is in the logs. If a service fails silently without logging anything useful, the tool cannot help. But for the noisy and verbose output that systemd and most Linux services produce, it works surprisingly well.
The bigger point of this exercise is that building a working RAG system is not hard, and it does not require serious hardware. No GPU, no cloud subscription, no expensive infrastructure. An ordinary laptop or a home NAS server is perfectly adequate. It is slow, yes. Indexing thousands of log lines takes time and queries are not instant. But it works, and it runs entirely under your own control.
If you are a Linux engineer or a developer reading this, I would encourage you to look around a little bit. You probably have more interesting data sitting on your systems than you think. Application logs, bug trackers, ticket systems, configuration histories, build logs - any of these can be ingested, embedded, and made queryable in plain English with the same simple stack described here. The barrier to entry is much lower than most people assume. The hardest part is usually just deciding to start.