Knowledge bases that get smarter while you sleep
Every RAG system has the same problem: the LLM rediscovers knowledge from scratch on every question.
You upload documents. The system chunks them, embeds them, retrieves the top-K fragments. The LLM stitches together an answer. Then you ask a follow-up, and it does the entire thing again — no memory of the synthesis it just performed, no accumulation of understanding across queries.
Ask something that requires connecting three different documents, and the LLM has to find and reassemble the fragments every time. The cross-references aren’t there. The contradictions haven’t been flagged. The synthesis doesn’t persist. Nothing compounds.
Andrej Karpathy put it well in his LLM Knowledge Bases gist: the problem isn’t retrieval — it’s that there’s no persistent artifact being built. RAG is retrieval without accumulation.
The persistent wiki
The fix is straightforward: instead of retrieving from raw documents at query time, have the LLM build and maintain a wiki — a structured, interlinked collection of markdown pages that sits between you and your sources.
When you add a new source, the LLM doesn’t just index it for later retrieval. It reads the source, extracts key information, and integrates it into the existing wiki. It updates entity pages, revises concept summaries, notes contradictions, and strengthens the evolving synthesis.
Three layers:
- Raw sources — your indexed data. Repos, docs, PDFs, Slack, Notion, iMessage, Google Drive. Immutable. The LLM reads from these but never modifies them.
- The wiki — LLM-generated markdown pages. Concepts, entities, cross-references, timelines. The LLM owns this layer. It creates pages, updates them when sources change, and maintains the index.
- The schema — a `schema.md` file that defines the wiki’s conventions. You and the LLM co-evolve it over time.
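The post doesn’t show a schema file, but based on the conventions described elsewhere in it (compiled truth above the separator, typed wikilinks), a minimal `schema.md` might look something like this. The section names and exact wording here are illustrative guesses, not Vault’s actual defaults:

```markdown
# Vault Schema

## Page types
- entity — a person, company, or tool; one page per entity
- concept — a synthesized topic page: compiled truth + timeline

## Link types
- works_at, uses, extends, contradicts

## Conventions
- Compiled truth lives above the `---` separator; the timeline below it is append-only
- Timeline entries: `- **YYYY-MM-DD** | summary [Source: name]`
```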
This is Nia Vault. You point it at any combination of your indexed sources, and it compiles a personal wiki that compounds knowledge instead of rediscovering it.
Compiled truth + timeline
Garry Tan (president and CEO of Y Combinator) built GBrain, a personal knowledge brain with an architecture detail we adopted: every wiki page has two sections.
Compiled truth (above the --- separator): your current best understanding. Written as present-tense, authoritative prose. This section gets rewritten when new evidence changes the picture.
Timeline (below the ---): an append-only evidence trail. Never edited, only added to.
```markdown
# WebSocket Authentication

WebSocket auth uses JWT tokens validated during the handshake
upgrade. The token is verified by middleware before the connection
is promoted. Tokens expire after 24h but connections persist via
a refresh mechanism.

---

## Timeline

- **2026-04-08** | JWT validation found in ws_middleware.py [Source: nia-app repo]
- **2026-04-05** | Team discussed refresh tokens in Slack [Source: Slack #eng-backend]
- **2026-04-01** | Initial implementation committed [Source: nia-app, commit abc123]

## Sources

- nia-app repository
- Slack #eng-backend
```
The compiled truth is the answer. The timeline is the proof. When a source gets re-indexed, sync rewrites the compiled truth to reflect the latest state while preserving every historical entry in the timeline. You can see how knowledge evolved.
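The two-section split makes the update rules mechanical: rewrite above the separator, append below it. A minimal sketch of those two operations (the function names and page shape here are my assumptions, not Vault’s internals):

```python
from datetime import date

SEPARATOR = "\n---\n"  # divides compiled truth (above) from timeline (below)

def append_evidence(page: str, entry: str, source: str) -> str:
    """Add a timeline entry below the separator; the compiled
    truth above it is left untouched (append-only trail)."""
    truth, _, timeline = page.partition(SEPARATOR)
    line = f"- **{date.today().isoformat()}** | {entry} [Source: {source}]"
    return truth + SEPARATOR + timeline.rstrip() + "\n" + line + "\n"

def rewrite_truth(page: str, new_truth: str) -> str:
    """Replace the compiled truth; every historical timeline
    entry survives verbatim."""
    _, _, timeline = page.partition(SEPARATOR)
    return new_truth.rstrip() + "\n" + SEPARATOR + timeline
```

The asymmetry is the point: `rewrite_truth` is destructive by design, `append_evidence` never is.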
Typed wikilinks
Standard [[wikilinks]] connect pages. But not all connections are equal. A person working at a company is a different relationship than a library that extends a protocol. Vault supports typed wikilinks:
```
[[Stripe|works_at]]
[[React|uses]]
[[OAuth 2.0|extends]]
[[Old Claim|contradicts]]
```
The graph visualization color-codes edges by relationship type. The same data powers queries like “show me everything that uses React” or “find all contradicts relationships in the vault.”
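Parsing typed wikilinks into an edge list is a one-regex job. A sketch, assuming the `[[Target|type]]` syntax above and a fallback relation name of my own invention (`related`) for untyped links:

```python
import re

# [[Target]] or [[Target|relation]]
LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")

def extract_edges(page_name: str, text: str):
    """Yield (source_page, target_page, relation) triples."""
    for m in LINK.finditer(text):
        yield page_name, m.group(1), m.group(2) or "related"

def query(edges, relation: str):
    """All edges of one relationship type, e.g. 'uses'."""
    return [(s, t) for s, t, r in edges if r == relation]
```

The same triples drive both the color-coded graph and queries like “everything that uses React.”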
The dream cycle
This is where it gets interesting. GBrain’s core idea: the brain should get smarter while you sleep.
Standard vault workflows are reactive. Ingest processes new sources. Sync regenerates stale pages. Lint finds broken links. These respond to changes.
The dream cycle is proactive. It runs on a schedule (weekly by default) and asks: what’s the vault missing?
Step 1: Entity extraction. Claude reads all existing pages and identifies people, companies, tools, papers, and concepts that are mentioned across multiple pages but don’t have their own page yet.
Step 2: Entity enrichment. For the top candidates, Claude synthesizes a new page by gathering context from every page that mentions the entity. The new page is fully cross-linked.
Step 3: Connection discovery. Claude looks for non-obvious connections between pages from different sources — patterns, shared concepts, contradictions that emerge when you view the full wiki at once.
Step 4: Contradiction detection. Related pages are grouped and checked for factual conflicts.
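Step 1 is the most mechanical of the four. Before any LLM call, the candidate list can be narrowed with a simple count: which wikilink targets appear in several pages but have no page of their own? A sketch under that assumption (thresholds and data shapes are mine, not Vault’s):

```python
import re
from collections import Counter

LINK = re.compile(r"\[\[([^\]|]+)")  # target of [[X]] or [[X|type]]

def missing_entities(pages: dict[str, str], min_mentions: int = 2):
    """Wikilink targets mentioned in >= min_mentions distinct pages
    that don't yet have a page, most-mentioned first. Claude then
    synthesizes a page for each from the pages that mention it."""
    mentions = Counter()
    for _, text in pages.items():
        for target in set(LINK.findall(text)):  # count pages, not repeats
            mentions[target] += 1
    return sorted(
        (t for t, n in mentions.items() if n >= min_mentions and t not in pages),
        key=lambda t: -mentions[t],
    )
```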
The output is dream-report.md:
```markdown
## Entities Discovered (14)

- **Cursor** (entity) — [created] Mentioned in 3 pages, no dedicated page
- **Y Combinator** (entity) — [created] Key context for 5+ pages
- **Zod** (entity) — [created] Referenced across codebase and docs

## Connections (3)

- **Agent Auth × Collison Installation** — both describe zero-friction
  onboarding but from different angles (security vs growth)

## Contradictions (1)

- **billing-portal.md** vs **billing-cycle-management.md** — conflicting
  claims about trial period handling
```
We ran the first dream cycle on a vault with 443 pages across 26 sources (repos, Slack, Notion, iMessage, X posts, Google Drive). It discovered 14 missing entities, created 10 new pages, found 3 cross-source connections, and flagged 1 contradiction. The vault was meaningfully smarter the next morning.
What it ingests
Vault isn’t limited to code and docs. It indexes everything Nia indexes:
| Source | What’s indexed |
|---|---|
| GitHub repos | Code, READMEs, issues, PRs |
| Documentation sites | Full page content, OpenAPI specs |
| Slack | Messages, threads, channels |
| Google Drive | Docs, sheets, slides |
| Notion | Pages, databases |
| iMessage | Conversations (E2E encrypted) |
| Apple Notes | Notes with folder structure |
| PDFs | Papers, contracts, research |
| 40+ cloud connectors | Linear, Salesforce, HubSpot, etc. |
The power is in the combination. A dream cycle that can connect a Slack conversation about auth changes with the actual code change in a GitHub repo with the design doc in Notion — that’s the synthesis RAG can’t do.
Five workflows
```bash
nia vault ingest <id>    # Process new sources into wiki pages
nia vault sync <id>      # Regenerate stale pages (skips your edits)
nia vault lint <id>      # Find orphans, broken links, contradictions
nia vault dream <id>     # Self-improving cycle: entities + connections
nia vault refresh <id>   # Ingest + sync in one pass (daily cron)
```
Ingest and sync are cheap by default — they skip sources that already have pages and only regenerate pages where the underlying source actually changed. The daily auto-refresh cron runs refresh mode; you never think about it.
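The cheapness comes from a timestamp comparison, not anything clever. A sketch of the skip logic, assuming each page records which source it came from and when it was built (these shapes are hypothetical, not Vault’s actual storage):

```python
from datetime import datetime

def stale_pages(pages: dict, sources: dict) -> list[str]:
    """Pages whose source was re-indexed after the page was built.
    pages:   name -> (source_id, built_at)
    sources: source_id -> last_indexed_at
    Only these pages get regenerated; everything else is skipped."""
    return [
        name
        for name, (src, built_at) in pages.items()
        if sources.get(src, datetime.min) > built_at
    ]
```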
Dream is the expensive one. It reads the whole vault and calls Claude multiple times. That’s why it’s opt-in and weekly, not daily.
Try it
```bash
# 1. Set up Nia (creates account, API key, CLI skills)
npx nia-wizard@latest

# 2. Connect sources at https://app.trynia.ai/settings/integrations

# 3. Create a vault
nia vault init "My Life" --from-source <source-id-1>,<source-id-2>

# 4. Run the dream cycle
nia vault dream <vault-id>

# 5. Browse the result
nia vault open <vault-id>
```
Or tell your agent: “Index my entire life — connect all my Slack, Drive, Notion, and create a vault with everything.” It understands the commands.
The vault is also browsable in the web UI at app.trynia.ai/vaults — page tree, force-directed graph, search palette, rich editor.
Docs: docs.trynia.ai/vault
Arlan Rakhmetzhanov, CEO @ Nozomio Labs · @arlanr