Let Your AI Build a "Living" Knowledge Base

You know that feeling: you come across a great article online and bookmark it;

you read a valuable tweet and screenshot it;

you encounter an important concept and jot it down in your notes.

And — nothing ever comes of it.

Fragmented information keeps piling up, but very little of it ever becomes truly useful. What we lack isn't more information; it's a system that lets information grow on its own.

What Traditional Note-Taking Can't Do

Most people take notes in one of these ways:

  • Folder sorting: Stuff notes into folders. The problem is, a single note often spans multiple topics, so which folder does it go in?
  • Bookmark graveyard: 80% of browser bookmarks are never opened again.
  • RAG (Retrieval-Augmented Generation) Q&A: RAG simply means splitting long text into chunks, storing them in a vector database, and having the model retrieve the closest chunks to answer questions. With the rise of AI, people just feed files to the AI and ask questions. But nothing accumulates: every time you ask the same question, the AI has to re-read everything from scratch.
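To make the RAG bullet above concrete, here is a minimal sketch of the pattern it describes: split text into chunks, index them, and retrieve the closest chunks for a question. A real system would use embeddings and a vector database; this sketch substitutes simple word overlap as the similarity measure, and all names are illustrative.

```python
# Minimal RAG-style sketch: chunk, index, retrieve-by-similarity.
# Word overlap stands in for real vector similarity.

def chunk(text: str, size: int = 10) -> list[str]:
    """Split text into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks sharing the most words with the question."""
    q = set(question.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q & set(c.lower().split())),
                    reverse=True)
    return scored[:k]

article = ("An agent is a program that uses an LLM to plan and act. "
           "Tools let the agent call external services.")
chunks = chunk(article, size=10)
best = retrieve("what is an agent", chunks, k=1)
```

Note that nothing is saved between questions: the next query repeats the whole chunk-and-retrieve cycle, which is exactly the "no accumulation" problem described above.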

Put simply, these approaches are all about storing, not building. Knowledge needs to be compiled, linked, and continuously maintained before it becomes a real knowledge base.

The Core Idea of llm-wiki: Digest Once, Accumulate Forever

The system's logic is straightforward: You provide a source (a link, an article, a file), the AI digests it into structured knowledge and stores it in an interconnected wiki. When new knowledge comes in next time, it doesn't start from a blank slate; it merges into the existing knowledge network.

Think of it as a knowledge version of LEGO: each new article is a new box of bricks. The AI doesn't just put the whole box on the shelf; it opens it up and snaps the useful pieces onto the structures already standing.

The overall flow is simple:

Source → LLM Extracts Key Concepts → Create/Update Pages → Link Related Concepts → Store in Wiki

The next time you ask about the same topic, the AI doesn't re-read the original articles; it directly queries the compiled wiki pages. That's "compile once, reuse forever."
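The flow above can be sketched as a small driver loop. Every name here is hypothetical, a stand-in for an LLM call or file write, not the actual llm-wiki API:

```python
# Hypothetical sketch of the digest pipeline: extract concepts from a
# source, then merge each one into an existing wiki (a name -> page map).

def extract_concepts(text: str) -> list[str]:
    """Stand-in for an LLM extraction step: treat capitalized words as entities."""
    return sorted({w.strip(".,") for w in text.split() if w[0].isupper()})

def digest(source_text: str, wiki: dict) -> dict:
    """Merge a new source into the wiki: append to existing pages, create new ones."""
    for name in extract_concepts(source_text):
        page = wiki.get(name, "")  # existing page, or blank for a new concept
        wiki[name] = page + f"\nNote from new source about {name}."
    return wiki

wiki = {}
wiki = digest("RAG and Agents both rely on an LLM.", wiki)
```

The important property is that `wiki` persists between calls, so each new source builds on the last one instead of starting over.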

How to Use It?

Step 1: Initialize

Create a blank knowledge base with a topic, direction, and an empty homepage.

Step 2: Add Sources

Add sources to your knowledge base — send text, images, PDF files, etc., to the LLM. The model parses them directly or calls vision tools to process files.

Step 3: Digest

The full digestion process consists of two steps:

1. Structural Analysis: After reading the source, the AI extracts:

  • What this article is about (one-sentence summary)
  • The core concepts involved (entities)
  • Which topic it belongs to
  • Relationships between concepts
  • Any contradictory information
  • What the source explicitly states (EXTRACTED), what is inferred (INFERRED), and what the AI supplements from background knowledge (UNVERIFIED)
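The output of that analysis step can be pictured as structured data like the following. The field names and values here are illustrative, not llm-wiki's actual schema; only the EXTRACTED / INFERRED / UNVERIFIED labels come from the article.

```python
# Illustrative shape of a structural-analysis result.
analysis = {
    "summary": "Explains what an AI agent is and how it uses tools.",
    "topic": "AI Agents",
    "entities": ["AI-Agent", "LLM", "Tool-Use"],
    "relations": [("AI-Agent", "uses", "LLM")],
    "contradictions": [],  # conflicts with already-stored knowledge, if any
    "claims": [
        # Each claim carries a confidence label per the scheme above.
        {"text": "An agent plans and acts via an LLM.", "confidence": "EXTRACTED"},
        {"text": "Agents likely need a feedback loop.", "confidence": "INFERRED"},
        {"text": "Tool use became common around 2023.", "confidence": "UNVERIFIED"},
    ],
}
```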

2. Page Generation: Based on the analysis, the AI updates the wiki:

  • Create a source summary page (e.g., wiki/sources/2026-05-05-what-is-an-Agent.md)
  • Create or update entity pages for each core concept (e.g., wiki/entities/AI-Agent.md)
  • Create or update topic pages
  • Append new entries to the index.md home page and log.md journal
  • Use [[bidirectional links]] to cross-reference related pages; for instance, what-is-an-Agent.md links to AI-Agent.md

The key point: if a concept already has a page, the new content appends to it rather than overwriting it. This way, perspectives on the same concept from different sources naturally converge.
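The append-not-overwrite rule can be sketched like this. The paths follow the wiki layout shown above; the function name is mine, not part of llm-wiki:

```python
# Create an entity page if it doesn't exist, otherwise append a new
# section, so perspectives from different sources accumulate on one page.
import tempfile
from pathlib import Path

def upsert_entity_page(wiki_root: Path, entity: str, section: str) -> Path:
    """Create wiki/entities/<entity>.md if missing, else append `section`."""
    page = wiki_root / "entities" / f"{entity}.md"
    page.parent.mkdir(parents=True, exist_ok=True)
    if page.exists():
        page.write_text(page.read_text() + "\n\n" + section)  # append, never overwrite
    else:
        page.write_text(f"# {entity}\n\n{section}")
    return page

root = Path(tempfile.mkdtemp())
upsert_entity_page(root, "AI-Agent", "From source A: agents plan.")
upsert_entity_page(root, "AI-Agent", "From source B: agents use tools.")
```

After two sources, the AI-Agent page holds both notes, which is the "naturally converge" behavior described above.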

What Makes It Smarter Than RAG?

RAG (Retrieval-Augmented Generation) is what many AI Q&A tools use today: you ask a question, the AI searches the original text, reads it on the spot, and answers. The upside is convenience; the downside is that it reads from scratch every single time, with no memory.

llm-wiki works more like a diligent reader taking careful notes:

|                          | RAG                  | llm-wiki                |
|--------------------------|----------------------|-------------------------|
| Reading                  | Re-reads every time  | Digests once            |
| Notes                    | None                 | Structured wiki         |
| Knowledge growth         | None                 | Continuous accumulation |
| Cross-source connections | Via prompt           | Explicit links          |
| Speed                    | Gets slower with use | Gets faster with use    |

The key is "compile once" — like writing code. Source code that gets interpreted every time (RAG) is naturally slower; compiled into binaries (wiki), lookups become lightning fast.

A Few Neat Details

Cache Mechanism

You don't feed the same article in twice. llm-wiki has a cache that checks before processing: have I handled this file before? Has the content changed?

  • Unchanged → skip it, no wasted AI compute
  • Content changed → re-process and update the linked pages
  • New file → process it fresh
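One common way to implement such a cache is a content hash: store a digest of each file's last-processed content and compare before doing any work. This is a sketch under that assumption, not necessarily how llm-wiki stores its cache:

```python
# Content-hash cache check: skip files whose content hasn't changed.
import hashlib

def needs_processing(path_key: str, content: str, cache: dict) -> bool:
    """Return True if the file is new or its content changed since last run.

    `cache` maps a file key to the SHA-256 hex digest of the content that
    was last processed; a matching digest means we can skip the file.
    """
    digest = hashlib.sha256(content.encode()).hexdigest()
    if cache.get(path_key) == digest:
        return False          # unchanged: skip, no wasted compute
    cache[path_key] = digest  # new or changed: record and re-process
    return True

cache = {}
first = needs_processing("notes/agent.md", "version 1", cache)    # new file
again = needs_processing("notes/agent.md", "version 1", cache)    # unchanged
changed = needs_processing("notes/agent.md", "version 2", cache)  # edited
```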

Health Check (Lint)

Over time, a knowledge base can develop issues:

  • Orphan pages: An entity page that no other page links to (an island)
  • Broken links: [[some concept]] pointing to a page that doesn't exist
  • Contradictory information: Source A says one thing, source B says another on the same topic
  • Confidence anomalies: Content marked EXTRACTED (explicitly stated) that can't actually be found in the original source

The AI can run periodic health checks, listing issues with suggestions for fixes.
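The first two checks, orphan pages and broken links, are mechanical enough to sketch in a few lines. This treats the wiki as a page-name-to-text map and parses the [[wikilink]] syntax used above; it is an illustration, not llm-wiki's lint implementation:

```python
# Lint sketch: find orphan pages and broken [[links]] in a wiki
# represented as a {page_name: page_text} dictionary.
import re

def lint(wiki: dict[str, str]) -> dict[str, list[str]]:
    """Report pages nothing links to, and link targets that don't exist."""
    links = {page: re.findall(r"\[\[([^\]]+)\]\]", text)
             for page, text in wiki.items()}
    linked_to = {t for targets in links.values() for t in targets}
    return {
        "orphans": sorted(p for p in wiki if p not in linked_to),
        "broken": sorted({t for targets in links.values()
                          for t in targets if t not in wiki}),
    }

wiki = {
    "index": "See [[AI-Agent]] and [[RAG]].",
    "AI-Agent": "Uses an [[LLM]].",
    "Lonely": "No one links here.",
}
report = lint(wiki)
```

A real check would whitelist the home page (index.md is always an "orphan" by this definition) and add the contradiction and confidence checks, which need an LLM rather than a regex.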

Who Is This For?

  • Researchers: Reading papers, organizing references, mapping out a field
  • Writers: Gathering material, building conceptual frameworks
  • Product managers: Competitive analysis, organizing user feedback
  • Lifelong learners: Anyone who wants to actually use what they've read

A Final Thought

We live in an age of information overload. If all we do with the endless stream of content is bookmark, forward, and screenshot it without thinking, it will never become knowledge.

What llm-wiki does isn't complicated — give the AI a container you control, and let it assemble the fragments into a map. Don't measure yourself by the number of notes you have. Measure yourself by whether your knowledge base is growing and connecting.

The best part: all files are local markdown, plain text, ready to open with Obsidian, VS Code, or even Notepad. Your knowledge truly belongs to you, with no dependence on any cloud service.

And of course, this system itself is open source (MIT license). You can find the source code here: https://clawhub.ai

Build your own knowledge base with existing agent tools.