ANTiRAG: Because Not Every Knowledge Base Needs a Vector Database
A simpler approach to project knowledge bases — and why simplicity is a design choice, not a compromise
We have a habit in software. When a hard problem shows up, we build infrastructure around it. A new layer. A new service. A new pipeline. It works. It becomes the standard. And then, much later, someone asks — did we actually need all of that?
I think RAG is at that point now. At least for a specific class of problem — not web-scale search, not enterprise data lakes, but something much more common: the project knowledge base. A few hundred Jira tickets. A product’s worth of documentation. A few gigabytes at most. The kind of knowledge base every team has, and nobody has quite figured out how to make an LLM reason over effectively.
That’s the problem ANTiRAG is trying to solve.
Why RAG Exists
RAG solved a real problem. LLMs are trained on data up to a certain date. They don’t know what’s in your internal wiki. They haven’t read your team’s Jira tickets. They weren’t in last week’s architecture meeting.
RAG was the fix: at query time, retrieve relevant documents and pass them to the model as context. It works. It’s widely used. No complaints there.
But over time, RAG became an ecosystem. Vector databases. Embedding models. Chunking pipelines. Sync jobs to keep the index fresh. Each piece was added to solve a real problem. And each piece added weight.
At some point it’s worth asking — are we solving the problem, or are we solving the infrastructure we built to solve the problem?
The Consultant and the Detective
Think about two different ways a professional can walk into a problem they don't yet fully understand.
A consultant comes prepared. Before the engagement, they’ve studied your industry, read about similar clients, built up a body of knowledge. When you ask a question, they pull from what they already processed. Fast. Usually good. But the answer is shaped by how they organised their knowledge beforehand.
A detective doesn’t work that way. They arrive, look around, search for clues, follow one thread to the next, and build their understanding from what they actually find. The answer comes from the search itself.
RAG is the consultant. Your documents are pre-processed into embeddings, clustered in vector space, ready to be retrieved before you’ve even asked your question.
ANTiRAG is the detective. The documents stay as plain text files. When you ask something, an agentic LLM searches through them directly — reads, follows links, builds context — and answers from what it finds.
The Chunking Problem
RAG doesn’t store full documents. It splits them into chunks — paragraphs, sentences, fixed token windows — and embeds each chunk separately. The assumption is that a chunk is the unit of meaning.
But sometimes meaning doesn’t live in a chunk. It lives in the connection between things.
Think about Bahubali. If you’ve only seen the second film, you know that Kattappa killed Bahubali. But you can’t explain the full weight of why — because that context is in the first film. The answer spans both.
Knowledge bases work the same way. A Jira ticket describing a bug fix only makes complete sense alongside the architectural decision from three tickets earlier. When everything is chunked and embedded independently, you can get the right fragment and still miss the point.
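The fixed-window splitting that typical RAG pipelines apply can be sketched in a few lines. This is a simplified illustration, using character windows instead of tokens, and the ticket name is made up:

```python
def chunk_fixed(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping fixed-size windows.

    Each window is embedded and retrieved on its own, so a chunk that says
    "per the decision in PROJ-101" carries none of that ticket's content.
    """
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "Fix applied per the architecture decision recorded in PROJ-101. " * 10
chunks = chunk_fixed(doc)
# Adjacent chunks share their overlap region, but no chunk knows
# what PROJ-101 actually decided. That context lives in another document.
```

The retrieval step can hand back exactly the right window and still leave the model guessing about the decision the window refers to.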
ANTiRAG doesn’t chunk anything. A document is a document. The agent reads what it finds, and if one file points to another, it follows that link — the same way you’d trace an import in a codebase, or follow a Wikipedia link deeper into a topic.
Grep Is Not a Step Backwards
When a developer lands in an unfamiliar codebase, they don’t embed it into a vector database. They grep. They search for a function name, find its definition, trace where it’s called, read the surrounding code. Understanding builds through traversal.
That’s not primitive. It’s well-matched to how structured text works.
The ANTiRAG knowledge base is plain markdown files. One file per Jira ticket. YAML frontmatter for metadata. Auto-extracted concept backlinks connecting related files. When you ask a question, the LLM searches the files, reads what’s relevant, follows the links, and builds an answer from the actual content.
No embeddings at ingestion. No vector store. No sync pipeline. Just text, structure, and an agent that knows how to read.
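One ticket file in this layout might look like the following. The field names, ticket keys, and wiki-link syntax here are illustrative, not the prototype's exact schema; the point is that metadata, content, and backlinks all live in one greppable file:

```markdown
---
key: PROJ-142
title: "Token refresh fails after session timeout"
status: Done
labels: [auth, bug]
---

Refresh requests return 401 once the session store evicts the record.
Fixed by re-reading the session TTL from config, per the decision in
[[PROJ-101]].

## Related concepts
- [[session-store]]
- [[token-refresh]]
```

When the agent lands on this file, the links at the bottom are its next leads, the same way a detective follows one clue to the next.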
See It in Action
I built a working prototype — an importer that turns Jira tickets into markdown files, a concept graph that shows the backlinks between them, keyword search, and a chat interface where the agent traverses the knowledge base at query time.
No vector database. No embedding pipeline. It runs on localhost and it works.
The Point
ANTiRAG isn’t an argument against RAG. It’s an argument for asking the question before reaching for the full stack.
For large corpora, RAG makes sense. But for a project knowledge base — a few hundred documents, a few gigabytes, a team that just wants their LLM to actually understand their tickets — the full pipeline may be more complexity than the problem needs.
Simplicity is not a fallback. It’s a design choice. And like any good design choice, it starts with a simple question — what does this problem actually need?
Sometimes the answer is a vector database. Sometimes it’s just well-organised text and an agent that searches like a developer would.