Inventory: Memory Plane
Memory Plane Inventory — AI Factory Audit
Date: 2026-05-09
Auditor: Chip Huyen (AgentForge)
Scope: Read-only probe. No mutations.
Task: Plan Task 1.1 — Memory Plane Inventory
1. Per-Store Table
| Store | Endpoint / Path | Schema / Collections | Live Count | Write Path | Read Path | Owner Daemon | Status |
|---|---|---|---|---|---|---|---|
| mem0 / Qdrant | http://localhost:9000 (mem0 API) / http://localhost:6333 (Qdrant gRPC+HTTP) | 5 collections: mem0migrations (0 pts), sessions (929 pts), hivemind (60,442 pts), mem0_john (865 pts), knowledge (31,274 pts) | 93,510 total vectors | No caller found. mem0 API (POST /add) is NEVER called by any hook, tool, or daemon in ~/system/tools/ or ~/.claude/hooks/. hivemind.js dual-writes to Qdrant hivemind collection directly via internal HTTP (port 6333). | No tool reads localhost:9000 for queries. hivemind.js semantic search reads Qdrant hivemind collection directly via qdrant-client. discover.js does NOT query mem0. | com.alai.mem0-server (LaunchAgent, KeepAlive=true, PID 65706 alive, last exit was SIGTERM -15) | HEALTHY (server alive, but ORPHANED: no producer writes to mem0_john or knowledge via the mem0 API) |
| Chroma | ~/.claude-mem/chroma/chroma.sqlite3 | 1 collection: cm__claude-mem | 6,584 embeddings | Unknown: no daemon or hook references claude-mem path in scanned tools. Likely written by a claude-mem MCP server or CLI tool directly. | Unknown: no caller found in ~/system/tools/ or ~/.claude/hooks/. | None identified | PARTIAL (data exists, producer and consumer both untraced) |
| LightRAG | http://localhost:9621 | Neo4J graph + NanoVectorDB + JsonKV storage; workspace /app/data | 999 processed docs, 1 failed (pipeline_busy=true, 120 async locks pending; actively ingesting) | ~/.claude/hooks/lightrag-auto-ingest.sh (PostToolUse: Write/Edit): fires on writes to `~/.claude/projects/-Users-makinja/memory/*.md`, `~/system/specs/*.md`, and `/tmp/*-bookstack-*.md`. Also com.alai.lightrag-outbox-ingest.plist daemon. | discover.js: primary read path. Queries https://lightrag.alai.no/query (external hostname, not localhost). Fallback: if local hits < 3, LightRAG fallback fires. | com.alai.lightrag-watchdog.plist, com.alai.lightrag-keepwarm.plist, com.alai.lightrag-backup.plist, com.john.lightrag-monitor.plist, com.alai.lightrag-migrate-pump.plist | HEALTHY (serving, ingesting) |
| HiveDB (SQLite) | ~/system/agents/hivemind/hivemind.db | 7 tables: agents (139 rows), memos (100 rows), intel (17,551 rows), subscriptions (6 rows), _litestream_seq, _litestream_lock, sqlite_sequence | 17,551 intel rows (NOTE: context memo said 64,889; live probe shows 17,551. Delta likely from live deletions, or the memo was stale) | `hivemind.js post <agent> <type> <message>`: agents call this CLI to write intel. Also dual-writes embeddings to Qdrant hivemind collection (best-effort, fire-and-forget). | hivemind.js read/query/search: text search + semantic search (cosine sim against local embeddings or Qdrant). discover.js does NOT query HiveDB directly. | hivemind.js (stateless CLI, no daemon; called ad-hoc by agents) | HEALTHY |
| .md auto-memory | ~/.claude/projects/-Users-makinja/memory/ | 123 .md files (MEMORY.md index + per-topic files + feedback memos + _archive/) | 123 files | Claude Code's built-in auto-memory system (native Claude Code feature: writes .md files after conversations automatically, not via any explicit hook or daemon). lightrag-auto-ingest.sh PostToolUse hook then ingests these into LightRAG when they are written/edited. | CLAUDE.md "Context Loading" section instructs John to Read specific files directly. `discover.js memory "<topic>"` is documented as LightRAG-backed (reads LightRAG, not the .md files directly). | Built-in Claude Code (no external daemon) | HEALTHY (write path functional; read path partially bypassed: LightRAG index holds only 999 docs, and not all 123 .md files are confirmed ingested) |
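The Qdrant live counts in the first row can be re-probed without mutating anything. The following is a minimal sketch, assuming Qdrant's REST API answers on port 6333 and that `GET /collections/{name}` returns the count under `result.points_count`. The helper names (`points_count`, `fetch_collection`, `drift`) are hypothetical, and the recorded figures are copied from the table above.

```python
import json
import urllib.request

QDRANT_URL = "http://localhost:6333"  # endpoint from the per-store table

# Counts recorded during this audit (row "mem0 / Qdrant").
RECORDED = {
    "mem0migrations": 0,
    "sessions": 929,
    "hivemind": 60_442,
    "mem0_john": 865,
    "knowledge": 31_274,
}


def points_count(collection_info: dict) -> int:
    """Extract the point count from a GET /collections/{name} response body."""
    return int(collection_info["result"]["points_count"])


def fetch_collection(name: str, base_url: str = QDRANT_URL) -> dict:
    """Read-only GET against Qdrant; performs no writes."""
    with urllib.request.urlopen(f"{base_url}/collections/{name}", timeout=5) as resp:
        return json.load(resp)


def drift(live: dict, recorded: dict = RECORDED) -> dict:
    """Return live-minus-recorded deltas for collections whose count changed."""
    return {
        name: live.get(name, 0) - count
        for name, count in recorded.items()
        if live.get(name, 0) != count
    }
```

A probe would build `live = {n: points_count(fetch_collection(n)) for n in RECORDED}` and pass it to `drift`. An empty result means the table is still current, and the recorded values sum to the 93,510 total shown above.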
2. Producer → Consumer Matrix
| Producer | Store Written | Consumer | Notes |
|---|---|---|---|
| Claude Code built-in auto-memory | `~/.claude/projects/-Users-makinja/memory/*.md` (123 files) | lightrag-auto-ingest.sh hook (secondary producer → LightRAG) | Auto-memory is Claude Code native. The .md write triggers the hook. |
| lightrag-auto-ingest.sh (PostToolUse hook) | LightRAG http://localhost:9621 | discover.js (primary RAG consumer) | Only fires on Write/Edit tool calls to in-scope paths. Does NOT write to mem0. |
| com.alai.lightrag-outbox-ingest.plist daemon | LightRAG | discover.js | Batch ingest pipeline for outbox staging |
| hivemind.js post (called by agent tools) | HiveDB SQLite hivemind.db + Qdrant hivemind collection (dual-write) | hivemind.js read/query/search (CLI) | Qdrant hivemind = 60,442 vectors; SQLite intel = 17,551 rows. The divergence suggests Qdrant has historical vectors beyond current SQLite rows (possibly from bulk migration) |
| NOBODY | mem0 API (localhost:9000/add): mem0_john collection (865 pts), knowledge collection (31,274 pts) | NOBODY reads via mem0 API either | WIRE BREAK: mem0_john has 865 facts that were presumably written at some point (possibly during initial mem0 setup / manual population), but no current tool, hook, daemon, or agent calls POST localhost:9000. The mem0 API is a running server with no active clients. |
| NOBODY identified | Chroma ~/.claude-mem/chroma/ (6,584 embeddings) | NOBODY identified | Chroma has data (6,584 embeddings in cm__claude-mem) but producer and consumer are both untraced in current tooling. Likely written by a claude-mem MCP tool in a previous iteration. |
| com.john.session-archiver.plist | Likely sessions Qdrant collection (929 pts) | discover.js --sessions (reads sessions SQLite, not Qdrant) | Sessions exist in Qdrant but discover.js reads from a local SQLite sessions table, not via mem0 or Qdrant API |
| rag-router.js learn | ~/system/databases/flywheel.db (SQLite: interactions + rag_cache) | rag-router.js query (cache-hit path) | Sixth store: flywheel SQLite, not listed in original inventory. Routes: cache → local Ollama → external. Does not touch mem0. |
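The matrix lends itself to a mechanical orphan check. The sketch below transcribes the rows by hand into a small data structure and lists every store that lacks either a producer or a consumer. The `Flow` type and the `orphans` helper are hypothetical names for illustration, not part of any audited tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Flow:
    producer: Optional[str]  # None = no producer identified
    store: str
    consumer: Optional[str]  # None = no consumer identified


# Rows transcribed from the Producer → Consumer matrix.
FLOWS = [
    Flow("Claude Code auto-memory", ".md memory files", "lightrag-auto-ingest.sh"),
    Flow("lightrag-auto-ingest.sh", "LightRAG", "discover.js"),
    Flow("lightrag-outbox-ingest daemon", "LightRAG", "discover.js"),
    Flow("hivemind.js post", "HiveDB + Qdrant hivemind", "hivemind.js read/query/search"),
    Flow(None, "mem0 API (mem0_john, knowledge)", None),
    Flow(None, "Chroma cm__claude-mem", None),
    # Producer is only a candidate; discover.js reads SQLite, not this collection.
    Flow("com.john.session-archiver.plist (unconfirmed)", "Qdrant sessions", None),
    Flow("rag-router.js learn", "flywheel.db", "rag-router.js query"),
]


def orphans(flows):
    """Stores for which no flow has both a producer and a consumer."""
    wired = {f.store for f in flows if f.producer and f.consumer}
    return sorted({f.store for f in flows} - wired)


print(orphans(FLOWS))
# ['Chroma cm__claude-mem', 'Qdrant sessions', 'mem0 API (mem0_john, knowledge)']
```

Keeping the matrix as data means a later audit can diff two snapshots rather than re-reading the table.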
3. SoR Gap Analysis — Duplicated Fact Classes
| Fact Class | Stores Containing It | Designated SoR | Derivative / Shadow | Gap / Conflict |
|---|---|---|---|---|
| Agent intel / decisions | HiveDB intel table (17,551 rows) + Qdrant hivemind collection (60,442 vectors) | HiveDB SQLite (primary; hivemind.js writes here first) | Qdrant hivemind (dual-write, best-effort) | 60,442 Qdrant vectors vs 17,551 SQLite rows = 3.4x divergence. Qdrant likely contains orphaned vectors from deleted/purged SQLite rows, or a bulk historical migration that wasn't reflected in SQLite. No reconciliation daemon exists. |
| Session summaries / history | Qdrant sessions (929 pts) + likely local session SQLite (referenced by discover.js) + .md memory files (MEMORY.md index) | Undefined (no explicit SoR designation) | All three are partial | discover.js --sessions reads SQLite, not Qdrant sessions. Who writes Qdrant sessions? Untraced. |
| John's personal facts / preferences | mem0 mem0_john collection (865 vectors) + .md auto-memory files (123 files) + LightRAG (999 docs, subset overlapping .md files) | Intended SoR: mem0 (mem0_john), but NO active writer. Actual SoR: .md files (Claude Code writes here). | LightRAG is downstream derivative of .md files via lightrag-auto-ingest.sh | Critical SoR conflict: 865 facts in mem0 are STALE (last written at setup, no ongoing writes). 123 .md files are current. LightRAG is a partial index of .md files. Three stores claim the same fact class with no reconciliation. |
| Knowledge base / operational docs | mem0 knowledge collection (31,274 vectors) + LightRAG (999 docs, BookStack exports) + Chroma (6,584 embeddings) | Undefined | All three parallel | knowledge collection in mem0 has 31,274 vectors (the largest in mem0), but again no active writer via mem0 API. Origin unknown. Chroma cm__claude-mem (6,584) is also an orphan with no identified current writer or reader. |
| HiveMind broadcast intel | HiveDB hivemind Qdrant collection (60,442) + HiveDB SQLite intel (17,551) | HiveDB SQLite is the write authority | Qdrant hivemind is derivative (dual-write from hivemind.js) | No hivemind HTTP API exists (confirmed: port 3001 is Drop API). Qdrant hivemind is only queryable via hivemind.js semantic search CLI, not accessible to other tools. |
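One way to size the 60,442 versus 17,551 divergence in the agent-intel row is a set comparison of identifiers. This is a sketch only: it assumes each Qdrant point carries the SQLite row id (as its point ID or in its payload), which this audit has not verified, and `reconcile` and `divergence_ratio` are hypothetical helpers.

```python
def reconcile(sqlite_ids, qdrant_ids):
    """Classify IDs into shared, SQLite-only, and Qdrant-only sets."""
    sqlite_ids, qdrant_ids = set(sqlite_ids), set(qdrant_ids)
    return {
        "shared": sqlite_ids & qdrant_ids,
        "sqlite_only": sqlite_ids - qdrant_ids,  # rows whose best-effort dual-write never landed
        "qdrant_only": qdrant_ids - sqlite_ids,  # vectors with no backing SQLite row
    }


def divergence_ratio(qdrant_count: int, sqlite_count: int) -> float:
    """How many Qdrant vectors exist per SQLite row."""
    return qdrant_count / sqlite_count


# Counts from the live probe.
ratio = divergence_ratio(60_442, 17_551)
print(f"{ratio:.1f}x, surplus {60_442 - 17_551}")  # 3.4x, surplus 42891
```

If `qdrant_only` accounts for most of the surplus, the orphaned-vector explanation holds. A large `sqlite_only` set would instead point at failed fire-and-forget writes.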
4. Critical: The .md vs mem0 Wire Break
What was supposed to happen
The architecture assumes mem0 (http://localhost:9000) is the structured personal memory SoR for John. The mem0_john collection exists with 865 facts. The sessions collection has 929 entries. The server is alive and healthy.
What actually happens
Step 1 — .md files are written by Claude Code natively.
Claude Code has a built-in auto-memory feature that writes conversation summaries and facts as .md files into ~/.claude/projects/-Users-makinja/memory/. This is NOT a hook or daemon — it is a built-in Claude Code behavior. No line of code in ~/system/ controls this write.
Step 2 — lightrag-auto-ingest.sh hooks into the .md write.
File: ~/.claude/hooks/lightrag-auto-ingest.sh (PostToolUse on Write/Edit).
This hook detects when a .md file is written to ~/.claude/projects/-Users-makinja/memory/*.md and fires a background curl POST to LightRAG (http://localhost:9621/documents/text). This is the ONLY downstream pipeline from .md files.
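The hook's decision logic can be illustrated in a few lines. The sketch below is not the hook itself (which is a shell script that was not reproduced in this audit). It assumes the three watched globs listed in the per-store table, treats `*` as matching a single path segment, and assumes a JSON body of the form `{"text": ...}` for `/documents/text`. All function names are hypothetical, and the request is built but never sent.

```python
import json
import urllib.request
from pathlib import PurePosixPath

LIGHTRAG_TEXT_URL = "http://localhost:9621/documents/text"  # endpoint named above


def scope_patterns(home: str) -> list:
    """In-scope globs from the per-store table, with ~ expanded to `home`."""
    return [
        f"{home}/.claude/projects/-Users-makinja/memory/*.md",
        f"{home}/system/specs/*.md",
        "/tmp/*-bookstack-*.md",
    ]


def in_scope(path: str, home: str) -> bool:
    """True if the written file matches one of the watched globs."""
    p = PurePosixPath(path)
    return any(p.match(pattern) for pattern in scope_patterns(home))


def build_ingest_request(text: str) -> urllib.request.Request:
    """Build (but do not send) the POST; the body shape is an assumption."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        LIGHTRAG_TEXT_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Under these assumptions, a write to a nested path such as `memory/_archive/x.md` would not trigger ingestion. Whether the real hook behaves the same way depends on how its shell patterns are written.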
Step 3 — mem0 API is never called.
Grep across all of:
- `~/system/tools/*.js`: 0 files call `localhost:9000`
- `~/.claude/hooks/*.sh`: 0 files call `localhost:9000`
- `~/system/daemons/`: not scanned exhaustively, but the mem0-server plist confirms it's only a server, not a writer
- `pi-orchestrator.js`: the one hit for `localhost:9000` is SonarQube (port 9000 collision), not mem0
The exact wire break: There is no POST http://localhost:9000/add call anywhere in the scanned tools and hooks (~/system/daemons/ was not scanned exhaustively). The mem0 server was built and populated (865 facts in mem0_john, 31,274 in knowledge) at some point, likely during initial setup or a one-time migration, but the "auto-write to mem0" integration was never wired into the live pipeline. The lightrag-auto-ingest.sh hook was written instead, routing .md → LightRAG and leaving mem0 as an idle relic: its data is stale, and nothing writes to it or reads from it.
CEO complaint root cause confirmed: "implementation is not ideal — memory writes to .md files instead of mem0" is accurate. The intended SoR (mem0) has no active producer. The actual write path is: Claude Code → .md files → lightrag-auto-ingest.sh → LightRAG. mem0 is running, healthy, and populated with 865+31,274 stale vectors that nobody reads.
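The grep from Step 3 can be repeated as a read-only scan, which also makes it easy to extend to `~/system/daemons/`. This is a minimal sketch. `find_references` is a hypothetical helper, and the default file patterns mirror the two globs that were scanned.

```python
import re
from pathlib import Path

# Matches localhost:9000 but not longer ports such as localhost:90001.
MEM0_REF = re.compile(r"localhost:9000\b")


def find_references(roots, patterns=("*.js", "*.sh"), needle=MEM0_REF):
    """Return (path, line_no, line) for every matching line; never writes."""
    hits = []
    for root in roots:
        root = Path(root).expanduser()
        if not root.is_dir():
            continue  # skip roots that do not exist on this machine
        for pattern in patterns:
            for path in sorted(root.rglob(pattern)):
                try:
                    text = path.read_text(errors="replace")
                except OSError:
                    continue  # unreadable entry or directory named like a file
                for line_no, line in enumerate(text.splitlines(), 1):
                    if needle.search(line):
                        hits.append((str(path), line_no, line.strip()))
    return hits
```

A call such as `find_references(["~/system/tools", "~/.claude/hooks", "~/system/daemons"])` returns candidate lines only. Each hit still needs manual triage, since port 9000 is shared with SonarQube, as the `pi-orchestrator.js` case shows.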
HiveDB relationship
HiveDB (hivemind.db) is a SEPARATE concern from personal memory. It is the agent broadcast / intel bus, not John's fact store. However, the Qdrant hivemind collection (60,442 vectors) lives in the same Qdrant instance as mem0_john, creating the appearance of a unified store when it is actually two separate logical systems sharing infrastructure.
5. Store Status Summary
| Store | Healthy? | Active Producer? | Active Consumer? | Data Fresh? |
|---|---|---|---|---|
| mem0 / Qdrant mem0_john | Yes | NO | NO | NO (865 facts, stale) |
| mem0 / Qdrant knowledge | Yes | NO | NO | NO (31,274 vectors, stale) |
| mem0 / Qdrant sessions | Yes | Unknown | NO | Unknown |
| mem0 / Qdrant hivemind | Yes | Yes (hivemind.js dual-write) | Yes (hivemind.js semantic search) | YES |
| HiveDB SQLite | Yes | Yes (hivemind.js CLI) | Yes (hivemind.js CLI) | YES (17,551 rows) |
| LightRAG | Yes | Yes (lightrag-auto-ingest.sh hook + outbox daemon) | Yes (discover.js) | YES (999 docs, pipeline busy) |
| Chroma | Yes (file exists) | UNKNOWN | UNKNOWN | Unknown origin |
| .md auto-memory | Yes | Yes (Claude Code native) | Partial (direct Read + LightRAG index) | YES (123 files) |
| Flywheel SQLite | Presumed yes | Yes (rag-router.js learn) | Yes (rag-router.js query) | Unknown |
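The "Presumed yes" and "Unknown" entries for the SQLite stores can be settled with a probe that stays within the read-only scope of this audit. The sketch below opens a database through SQLite's `mode=ro` URI so that no write can occur. `table_counts` is a hypothetical helper, and the table names passed to it must come from the inventory.

```python
import sqlite3
from pathlib import Path


def table_counts(db_path, tables):
    """Open the database read-only and return {table: row_count}."""
    uri = Path(db_path).expanduser().resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        counts = {}
        for table in tables:
            # Table names cannot be bound as parameters, so quote them as identifiers.
            quoted = '"' + table.replace('"', '""') + '"'
            counts[table] = conn.execute(f"SELECT count(*) FROM {quoted}").fetchone()[0]
        return counts
    finally:
        conn.close()
```

For the flywheel store this would be `table_counts("~/system/databases/flywheel.db", ["interactions", "rag_cache"])`. The same call against `hivemind.db` with `["intel"]` would re-check the 17,551 figure.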
Open Questions
- Chroma write/read path: Who wrote 6,584 embeddings to the `cm__claude-mem` collection under `~/.claude-mem/chroma/`? Which tool or MCP server reads from it? The `claude-mem` MCP is referenced in settings but not found in scanned tool code. Needs: `grep -r "claude-mem\|chroma" ~/.claude/settings.json` and MCP server registry audit.
- Qdrant `sessions` writer: Who writes 929 session vectors to the `sessions` Qdrant collection? `com.john.session-archiver.plist` is a candidate but the script path was not read. Needs: `cat ~/Library/LaunchAgents/com.john.session-archiver.plist` + script inspection.
- Qdrant `knowledge` origin: 31,274 vectors in `knowledge`: when were they written and from what source? No active writer found. Possible: one-time BookStack bulk ingest or a migration. Check `~/system/mem0/server.py` for any bulk-load routines at startup.
- HiveDB vector divergence: 60,442 Qdrant vectors vs 17,551 SQLite intel rows. Are the extra ~43K vectors orphaned (deleted SQLite rows without Qdrant cleanup), or does Qdrant have independent content? Needs: sample Qdrant payload IDs vs SQLite `id` column cross-check.
- LightRAG external hostname: `discover.js` queries `https://lightrag.alai.no/query` (external URL from config), not `http://localhost:9621`. Is there a Caddy/Cloudflare proxy routing `lightrag.alai.no` → `localhost:9621`? If that proxy is down, `discover.js` would silently fail to read from LightRAG despite the local container being healthy.
- mem0_john 865 facts provenance: When were these written? Is there a one-time ingestion script (e.g., `~/system/mem0/populate.py` or similar)? If the facts are high-quality (personal preferences, CEO directives), they are the most actionable store to re-wire as the active SoR.
- `rag-router.js` flywheel.db size and health: Not probed live. Needs `sqlite3 ~/system/databases/flywheel.db "SELECT count(*) FROM interactions; SELECT count(*) FROM rag_cache;"`.
- mem0 `server.py`: does it expose `/add` or `/search` routes? The health endpoint is confirmed working. Need to verify the actual API surface to confirm whether a PostToolUse hook calling `POST localhost:9000/add` would work as-is without code changes to mem0.
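The external-hostname question above can be turned into a probe that makes a proxy outage visible instead of silent. The sketch below checks both base URLs and reports which one answers. It assumes a `/health` route exists on the LightRAG server, which this audit did not confirm, and `http_ok`, `probe`, and `diagnose` are hypothetical helpers.

```python
import urllib.error
import urllib.request

ENDPOINTS = [
    ("external", "https://lightrag.alai.no"),  # hostname discover.js uses
    ("local", "http://localhost:9621"),        # container probed in this audit
]


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """True if a GET returns HTTP 200; any network error counts as down."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        return False


def probe(endpoints=ENDPOINTS, path="/health", check=http_ok):
    """Return {label: reachable} for each base URL."""
    return {label: check(base + path) for label, base in endpoints}


def diagnose(status: dict) -> str:
    """Summarise a probe result as a single finding."""
    if status.get("external"):
        return "ok"
    if status.get("local"):
        return "proxy down: local LightRAG healthy but external hostname unreachable"
    return "LightRAG down"
```

Passing a stub as `check` allows the logic to be exercised offline; `diagnose(probe())` gives the live answer.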