AI Factory Audit 2026-05-09

Complete audit deliverables

Executive Summary (SENTINEL Final)
Connectivity Diagram
Inventory: Memory Plane
Inventory: Tools Shed
Inventory: Agent Fleet
Inventory: Daemon Fleet
Verifier Autonomy Audit
BUILD-BLUEPRINT Discipline
Health Matrix
Petter Synthesis
Devils Advocate
Fix Backlog
Validation Reports

Executive Summary (SENTINEL Final)

SENTINEL AUDIT — Final Consolidated Report

Date: 2026-05-09
Lead Validator: Sentinel Validator (consolidating P1–P5 findings)
Destination: CEO (Alem Basic)

FINAL VERDICT

REWORK-MINOR

The audit is fundamentally sound. The fix backlog is correctly prioritized. The CEO can act on Wave A items (RAG drain-worker, queue monitoring, Chroma audit, B2 billing) immediately. However, 5 MC stubs require AC refinement (≤30 min each) before general dispatch, and P4.1 carries 3 low-severity annotation corrections. None of these are blockers to CEO decision-making or Wave A execution.

Headline (translated from Bosnian)

The factory has been dead since March: 62.5% of obligations are not working. Pi-orchestrator has dispatched nothing. John is the manual dispatcher. Three fixes unlock everything else: fix the RAG Vaultwarden credential, define the canonical dispatch path, and wire verify-fix-loop.

Top-5 Actionable Findings (Post-Corrections)

1. RAG ingest pipeline blocked — 3,150+ items queued (not the stale 454)

Finding: rag-drain-worker crashed on Vaultwarden CF Access timeout. The metric file is 16 days stale (shows 454). Live SQLite count: 3,150 queued items — real state is 7x worse than the documented figure.
Evidence: P3.1 H1 (health matrix), P5.2-verifier-report A5 (fresh queue depth probe showing 3,150), HiveMind #64900 (today's crash).
Action priority: CRITICAL — Fix immediately (MC-STUB-01, Wave A, ~2h effort). Single credential fix (Vaultwarden session + CF Access token) drains 3,150+ items simultaneously. This single fix unblocks 3 downstream adapters.

2. pi-orchestrator not dispatching — HTTP port 8401 dead since March

Finding: Process PID 75750 is alive. HTTP control plane is offline. No dispatch logs post-2026-03-19 (50+ days idle). The durable-runner bridge (port 3052) is structurally alive, but it is unclear whether it is processing tasks. The framing "mock mode" is inaccurate (P4.2 rebuttal); the real issue is startup gating.
Evidence: P3.1 C1/C2 (live probes), P4.2 Gap #2 rebuttal (no mock config found; config shows offlineMode: false), P5.1 probe #3 (PID confirmed unchanged 5+ days).
Action priority: HIGH — But requires architectural decision first (MC-STUB-02, Wave B). Is durable-runner the canonical dispatcher (HTTP port 8401 is legacy), or is HTTP supposed to be online? The fix depends on the answer. Do not attempt MC-STUB-08 (pi-orch restore) until this decision is made.
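The "process alive, port dead" distinction in this finding can be checked with a plain TCP probe. The following is a minimal sketch, not the probe used in P3.1 or P5.1: the function name and the choice of 127.0.0.1 as host are assumptions, while ports 8401 and 3052 come from the finding above.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports are taken from the finding; the host is an assumption.
    probes = (("pi-orchestrator HTTP", 8401), ("durable-runner bridge", 3052))
    for name, port in probes:
        state = "listening" if port_is_listening("127.0.0.1", port) else "not listening"
        print(f"{name} (port {port}): {state}")
```

A listening port only shows that the socket is open. It does not show that durable-runner is processing tasks, which is the open question behind CEO Decision #1.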

3. Verifier loop capability exists but zero auto-invocation

Finding: verify-fix-loop skill is fully built, tested, and working. Accepts manual invocation. However, no daemon, hook, or pi-orchestrator code ever calls it. Important caveat (P4.2 rebuttal): This is NOT a structural gap. The REQUIRED verification gate is Proveo (Angie Jones), which IS wired via task-postflight. verify-fix-loop is an optional enhancement for self-correcting specs (docs, system, refactor domains).
Evidence: P2.2 §2, P3.1 D1 (skill exists, manual-only), P4.2 Gap #3 (Proveo is the designed gate), CLAUDE.md Hard Constraint #4 (specifies Proveo, not verify-fix-loop).
Action priority: MEDIUM — Feature enhancement, not blocker. Demoted to Wave C (MC-STUB-12) with L priority. Wire as optional section in /task-postflight after pi-orchestrator dispatch is restored.

4. Agent routing table incomplete — validator and distiller unmapped (44 references, 21 references, 0 routing entries)

Finding: validator and distiller agents are cited 65 times across skill files but have zero entries in specialist-mapping.json. Important distinction (P4.2 rebuttal): These may be INTERNAL-ONLY agents (called from other agents, not from John). If internal-only, they should NOT be in the routing table. If routable by John, they must be added. This requires a routing policy decision first.
Evidence: P1.3 (agent-fleet inventory shows 66 agents, mapping covers only 29), P4.2 Gap #5 rebuttal (may be internal-only), P4.3 MC-STUB-06 (design decision gates this fix).
Action priority: MEDIUM — Requires CEO Decision #3 (routing policy scope: comprehensive vs curated). Once decided, implementation is ≤8h (MC-STUB-06, Wave B).

5. Four phantom companies unroutable (Axiom, Datavera, Resolver, Lexicon)

Finding: All four have complete persona directories (CLAUDE.md, agents, company.json) and ZERO entries in specialist-mapping.json. Correction (P5.2-verifier A10): the P4.2 rebuttal claimed Lexicon IS routable, but grep confirms 0 matches in specialist-mapping.json; P4.2 hallucinated the mapping entry. The correct count therefore remains 4 phantom companies (Axiom, Datavera, Resolver, Lexicon), not 3.
Evidence: P1.3 (inventory shows all 4 have full infrastructure), P4.2 Gap #7 (rebuttal claims Lexicon is mapped; REFUTED by P5.2-verifier), P4.3 MC-STUB-07 (scope must list all 4 companies).
Action priority: LOW. Inventory work + routing decision. Demoted to Wave B after routing policy (MC-STUB-06) is decided. MC-STUB-07 implements the fix (~4h effort estimated for a 3-company scope, M priority); its scope must be updated to cover all 4 companies.

Wave A — Ship Now (No CEO Decisions Needed)

These four MCs can be dispatched immediately. Combined effort: ~6h.

Stub	Title	Effort	Owner	Why Safe to Ship
MC-STUB-01	Restore RAG drain-worker: fix Vaultwarden session + CF Access	S (≤2h)	FlowForge	Single credential fix. Machine-checkable ACs. Proveo-validated PASS (5.1 §2). Unblocks 3 adapters.
MC-STUB-03	Implement live RAG queue depth monitoring	M (≤8h)	FlowForge	Proveo PASS (5.1 §2). Depends on MC-STUB-01 (documented). No CEO decision required.
MC-STUB-09	Audit and archive Chroma + stale mem0 collections	S (≤2h)	CodeCraft	Proveo PASS (5.1 §2). Pure read-probe + cleanup. No blocking dependencies.
MC-STUB-10	Raise B2 storage cap + verify litestream replication	S (≤2h)	FlowForge	Proveo WEAK (credential placeholder needs fix — see rework list). But the task itself is low-risk (billing action). Fix AC before dispatch (≤5 min).

Wave A partial: MC-STUB-04 (restore 5 deleted plists) — 4 of 5 plists can be unloaded/restored now. The 5th (pi-orch-health.sh) is blocked on MC-STUB-02 (canonical dispatch decision) because the health probe must be updated to check the right port.

Wave B — Needs CEO Architectural Decisions First

These fixes depend on 4 CEO decisions. Once decided, they are unblocked.

CEO Decision #1 (CRITICAL): Canonical dispatch path

The question: Is durable-runner (port 3052, 20d uptime) the canonical dispatcher — with pi-orchestrator HTTP (port 8401, dead) being a legacy control plane? OR is pi-orchestrator HTTP supposed to be online?

Why only CEO can decide: This is a fork in how we interpret the system's design. No engineer can unilaterally choose which dead component to revive.

Options:

A. durable-runner is canonical. HTTP port 8401 is legacy. Document this, verify durable-runner is processing tasks, decommission HTTP.
B. pi-orch HTTP is canonical. Diagnose startup gating (likely Ollama hang), restore it. durable-runner is subordinate.
C. Both should be operational. Requires specifying the interaction model.

Unblocks:

MC-STUB-02 (design decision itself)
MC-STUB-04 remainder (pi-orch-health.sh restoration)
MC-STUB-08 (pi-orchestrator restore — actual kernel fix)

CEO Decision #2 (MEDIUM): Blueprint score gate floor

The question: What is the enforced minimum score for dispatch via Mehanik gate?

Context: Observed practice allows dispatch at score 65 (WARN range). Original spec says 90. The code treats WARN as pass-through. Choose one and hardcode it.

Options:

A. Lower floor to 60 — match observed practice; WARN is acceptable.
B. Floor stays at 90 — WARN becomes BLOCK; blueprints must score higher.
C. Tiered: 60 for L tasks, 75 for M, 90 for H+.
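The three options differ only in which floor applies to which task priority, so the gate can be pictured as a lookup followed by one comparison. The sketch below illustrates Option C under stated assumptions: the function name, the verdict strings, and the use of the priority labels as dictionary keys are hypothetical and do not describe the existing Mehanik code.

```python
# Floors from Option C; using the priority labels as dict keys is an assumption.
TIERED_FLOORS = {"L": 60, "M": 75, "H": 90}


def gate_verdict(score: int, priority: str,
                 floors: dict[str, int] = TIERED_FLOORS) -> str:
    """Return 'PASS' or 'BLOCK' for a blueprint score at a given task priority."""
    # Unknown priorities fall back to the strictest floor (assumed to cover "H+").
    floor = floors.get(priority, max(floors.values()))
    return "PASS" if score >= floor else "BLOCK"
```

Options A and B are the same function with a flat table, for example `{"L": 60, "M": 60, "H": 60}` for Option A or 90 throughout for Option B. Under Option C, the observed score of 65 would pass for L tasks and block for M and H tasks.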

Unblocks: MC-STUB-05 (enforce gate at the chosen floor)

CEO Decision #3 (MEDIUM): specialist-mapping.json scope policy

The question: Should the routing table be comprehensive (all 66 agents) or curated (only John-dispatchable agents)?

Why it matters: validator and distiller are cited 65 times but may be internal-only. If internal, they must NOT be in the routing table. If John-routable, they must be added.

Options:

A. Curated — only John-dispatchable agents enter the mapping. Internal agents documented separately.
B. Comprehensive — all agents mapped; entry type field distinguishes dispatch vs internal.

Unblocks:

MC-STUB-06 (routing policy design + specialist-mapping update)
MC-STUB-07 (register the 4 phantom companies or mark as experimental)

CEO Decision #4 (LOW): mem0 future role

The question: What is mem0's long-term status?

Context: 865 stale facts. Zero active writers. .md + LightRAG is the working pipeline. mem0 server running and consuming resources.

Options:

A. Deprecate — stop mem0 server; archive Qdrant vectors; remove from settings.json.
B. Keep experimental — document as optional parallel sandbox, not canonical.
C. Promote — wire PostToolUse hook to write every .md update to mem0 simultaneously (high effort, not recommended).

Recommendation (Petter): Option A (deprecate). The .md pipeline works. mem0 is cognitive overhead.

Unblocks: MC-STUB-09 + MC-STUB-11 (memory-plane documentation)
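The "Unblocks" lines of the four decisions form a small dependency table. The sketch below transcribes them so that the stubs released by any set of answered decisions can be listed mechanically. The function and variable names are hypothetical; the stub IDs are copied from the decisions above.

```python
# Transcribed from the "Unblocks" lines of CEO Decisions #1 to #4.
UNBLOCKS = {
    1: ["MC-STUB-02", "MC-STUB-04 (remainder)", "MC-STUB-08"],
    2: ["MC-STUB-05"],
    3: ["MC-STUB-06", "MC-STUB-07"],
    4: ["MC-STUB-09", "MC-STUB-11"],
}


def unblocked_stubs(decided: set[int]) -> list[str]:
    """List the stubs released by the decision numbers already answered."""
    return [stub for n in sorted(decided) for stub in UNBLOCKS.get(n, [])]


if __name__ == "__main__":
    # Example: only the canonical dispatch path has been decided.
    print(unblocked_stubs({1}))
```

The table records only what each decision releases. Stub-to-stub dependencies, such as MC-STUB-03 depending on MC-STUB-01, are not modelled here.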

Surfaced Contradictions Resolved

Contradiction 1: RAG queue depth — 454 vs 3,150

P4.1 synthesis stated: Queue depth 454 (from stale metric).
P5.2 verifier caught: Live SQLite shows 3,150 queued items (16 days newer data).

Resolution: Both figures are correct — the metric file is 16 days stale. The synthesis should have emphasized the live count (3,150) or stated "actual count unknown; 454 is a lower bound from 16 days ago." This is a severity understatement, not a factual error. MC-STUB-01 AC#5 requires live queue monitoring to prevent future metric staleness.
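The gap between the two figures is the gap between a cached metric and a live count. A monitoring check can refuse to trust the cached number once the metric file is old or disagrees with the database. The sketch below is illustrative only: the table name `ingest_queue`, the `status` column, and the one-day freshness limit are assumptions, since the real schema of ingest-queue.sqlite is not given in this report.

```python
import os
import sqlite3
import time
from typing import Optional


def live_queue_depth(conn: sqlite3.Connection) -> int:
    """Count queued rows. Table and column names are hypothetical."""
    row = conn.execute(
        "SELECT COUNT(*) FROM ingest_queue WHERE status = 'queued'"
    ).fetchone()
    return row[0]


def metric_age_days(path: str, now: Optional[float] = None) -> float:
    """Age of the metric file in days, based on its modification time."""
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) / 86400


def metric_is_trustworthy(reported: int, live: int, age_days: float,
                          max_age_days: float = 1.0) -> bool:
    """Trust a reported figure only if it is fresh and matches the live count."""
    return age_days <= max_age_days and reported == live
```

With the numbers from this contradiction (reported 454, live 3,150, age 16 days) the check returns False, which is the signal a synthesis step would need before quoting the metric file.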

Contradiction 2: pi-orchestrator "mock mode" vs actual config

P2.1 connectivity diagram stated: pi-orch in MOCK MODE, alai-config-mock.json loaded.
P4.2 devils-advocate rebutted: No mock config found. Config shows offlineMode: false, enabled: true.
P3.1 verified: Zero grep matches for "mock" in pi-orchestrator.js.

Resolution: The "mock mode" framing is inaccurate. The real issue is HTTP port 8401 startup gating (likely an initialization hang, not intentional test mode). P4.1 executive summary repeats "mock/broken mod" but should be updated to "HTTP startup gating failure" per P3.1/P4.2 evidence.

Contradiction 3: Chain runner existence

P4.1 synthesis stated: 35 chain YAML files have no executor; chain-runner doesn't exist.
P5.2 verifier caught: chain-runner.js (31KB, fully functional) and chain-runner.sh (Pillar #5) both exist.

Resolution: Chain runners DO exist. They are not broken in the sense of missing — they are broken/unused because:

(a) No active skill invokes them (skills call agents inline),
(b) Three chain-related daemons exit 1 due to downstream failures,
(c) The runners are un-integrated, not absent.

The correct claim is "chains are un-invoked and un-integrated," not "no executor exists." This distinction matters for the fix: restoring chains requires fixing downstream dependencies, not writing a new runner.

Contradiction 4: Lexicon company phantom status

P4.1 Gap #7 stated: 4 phantom companies (Axiom, Datavera, Resolver, Lexicon).
P4.2 devils-advocate claimed rebuttal: Lexicon IS in specialist-mapping.json.
P5.2 verifier caught: grep "Lexicon" ~/system/agents/specialist-mapping.json → 0 matches. Lexicon is NOT routable.

Resolution: P4.2 hallucinated the Lexicon entry (ZAKON NULA breach). The correct count is 4 phantom companies, not 3. P4.3 MC-STUB-07 lists the full 4 in some passages but may have been partially rewritten, so its scope is inconsistent. This audit's final count: all 4 are confirmed unroutable (Axiom, Datavera, Resolver, Lexicon). Update MC-STUB-07 scope to list all 4.

Contradiction 5: mem0 SoR intent

P4.1 synthesis stated: mem0 is the intended System of Record; it's broken.
P4.2 devils-advocate rebutted: mem0 was never designated as SoR in CLAUDE.md or any spec.

Resolution: The gap is correctly dismissed. .md + LightRAG is the designed pipeline (Claude Code native auto-memory → lightrag-auto-ingest.sh hook → LightRAG). mem0 was a prototype that never achieved SoR status. The correct fix is documentation (MC-STUB-11), not re-wiring mem0. That documentation closes out the dismissed gap.

Contradiction 6: HiveMind read API

P1.1 implied: HiveMind has no read API.
P3.1 found: hivemind.js read/query/semantic_query all functional. API exists.

Resolution: P1.1 overstated the gap. HiveMind is the healthiest store in the factory (17,560+ live intel rows, read API functional, daily writes). No contradiction to resolve — P3.1 corrected the inventory claim.

Open Questions for CEO

Canonical dispatch path: durable-runner or pi-orchestrator HTTP? (CEO Decision #1)
Blueprint score gate: Enforce at 60, 75, or 90? (CEO Decision #2)
specialist-mapping.json scope: Comprehensive or curated? (CEO Decision #3)
mem0 future role: Deprecate or keep as experimental? (CEO Decision #4)
Anything else surfaced: Any findings in this audit that require clarification before we proceed with Wave A?

Recommendation

John should dispatch Wave A immediately (RAG drain-worker, queue monitoring, Chroma audit, B2 cap raise; ~6h total). These are unblocked and low-risk. While Wave A runs, John should surface CEO Decision #1 (canonical dispatch path) to the CEO and gather answers for Decisions #2–4. Once Decision #1 is resolved, John can schedule MC-STUB-02 (design decision) and its downstream fixes (pi-orch-health.sh, pi-orchestrator restore). The routing-policy work (MC-STUB-06, MC-STUB-07) additionally waits on Decision #3. The audit is sound. The backlog is prioritized. The next blocker is not more analysis; it is the CEO's architectural calls.

Rework Required Before General Dispatch

Category A — AC refinement (5 stubs, ≤30 min each):

MC-STUB-04: Split OR-condition into per-plist ACs; replace 24h window with point-in-time exit-code check.
MC-STUB-06: Rewrite discover.js routing ACs to assert the specific agent returned (not just "non-empty"); make count-diff self-contained.
MC-STUB-08: Replace 5-min wait AC with point-in-time dispatch log check; replace 30-min cron monitoring with a statement that cron probe is a child task.
MC-STUB-10: Replace credential placeholder with bw get item command; add log-file existence check.
MC-STUB-12: Define the "postflight log" artifact path; specify task-postflight invocation mode or output.

Category B — P4.1 annotations (≤15 min):

Replace "mock/broken mod" in executive summary with "HTTP startup gating failure."
Update Gap #7 to note that the P4.2 rebuttal revised the count to 3, and that P5.2-verifier refutes that rebuttal: the final count is 4 phantom companies, not 3.
Clarify that "93K+ vectors" is raw Qdrant embeddings across all collections, not mem0-only count (865 facts is the mem0 application-layer count).

Audit Status: COMPLETE
Validator: Sentinel Validator (consolidation)
Evidence directory: /tmp/ai-factory-audit-2026-05-09/
Prior phases: P1 (inventory), P2 (connectivity), P3 (health matrix), P4 (synthesis + rebuttal + backlog), P5 (validation + verification + final consolidation)

Report produced by Sentinel Validator, 2026-05-09. Consolidated from 11 audit reports + 3 rebuttal layers + live probe verification.

Connectivity Diagram

2.1 — AI Factory Connectivity Diagram

Date: 2026-05-09
Auditor: sentinel-architect
Phase: 2 (synthesis from P1 inventory reports 1.1, 1.2, 1.3, 1.4 and P2 reports 2.2, 2.3)
Mode: READ-ONLY. No mutations.

Section A — Control Plane Diagram

The diagram below shows the advertised flow from CEO input to task closure. Solid arrows are flows that actually work. Dotted red arrows are advertised edges that are broken or absent. Labels show the transport mechanism.

flowchart TD
    CEO([CEO / Alem])
    JOHN([John — Orchestrator\nClaude Code CLI session])
    MH["/mehanik gate\n~/.claude/agents/mehanik.md\n113 cleared tokens in /tmp"]
    PF["/prompt-forge\n~/.claude/skills/prompt-forge/"]
    PIO["pi-orchestrator\n~/system/kernel/pi-orchestrator.js\nPID 75750 — MOCK MODE"]
    SPEC["Specialist Agent\ne.g. petter-graff, angie-jones\n~/.claude/agents/*.md"]
    TOOL["Tool\n~/system/tools/ (250 live)"]
    ART["Artifact\n(code / doc / spec / evidence file)"]
    VERIFIER["Verifier / verify-fix-loop\n~/.claude/agents/verifier.md\n~/.claude/skills/verify-fix-loop/"]
    TPF["/task-postflight\n~/.claude/skills/task-postflight/"]
    MCD["mc.js done\n~/system/tools/mc.js"]
    PROVEO["Proveo / Angie Jones\n~/.claude/agents/angie-jones.md"]
    HOOK["Hook Layer\n~/.claude/hooks/ (12 active)"]

    CEO -- "CLI conversation" --> JOHN
    JOHN -- "CLI / Task dispatch" --> MH
    MH -- "cleared token written to /tmp/mehanik-cleared-N\nBlueprint read enforced (PARTIALLY — WARN scores pass)" --> PF
    PF -- "forged prompt → Task dispatch" --> PIO

    PIO -. "Task dispatch — mc.js write\nBROKEN: MOCK MODE\nalai-config-mock.json loaded\nPlanka localhost:3100 not listening\n'No eligible tasks' every 30s" .-> SPEC

    SPEC -- "Tool calls\n(Read / Edit / Bash / Grep)" --> TOOL
    TOOL -- "Write / Edit" --> ART
    ART -- "mc.js ready write" --> HOOK
    HOOK -- "PreToolUse / PostToolUse\nexits 0 = pass, exits 2 = block" --> MCD

    MCD -. "ADVERTISED: auto-invokes verifier\nACTUAL: ABSENT\n0 hooks, daemons, or pi-orch code\ncalls verify-fix-loop\n(source: 2.2)" .-> VERIFIER

    VERIFIER -. "ADVERTISED: auto-loop to fix-builder\nACTUAL: manual invocation only\nno programmatic trigger" .-> SPEC

    MCD -- "mc.js ready → /task-postflight\n(manual invocation only for H tasks)" --> TPF
    TPF -- "Task dispatch — CLI" --> PROVEO
    PROVEO -- "AC checklist → verdict" --> TPF
    TPF -- "mc.js done (with evidence)" --> MCD

    MCD -. "ADVERTISED: pi-orchestrator consumes\n'done' events for next task\nACTUAL: MOCK MODE — consuming nothing" .-> PIO

    style PIO fill:#ffcccc,stroke:#cc0000
    style VERIFIER fill:#ffcccc,stroke:#cc0000
    style MH fill:#ffffcc,stroke:#cccc00

Annotation notes:

CEO → John: works. Standard CLI session.
John → Mehanik: works. 113 cleared tokens confirm Mehanik runs regularly.
Mehanik → prompt-forge → pi-orchestrator: the dispatch chain exists structurally. pi-orchestrator is alive (PID 75750) but in MOCK MODE — it reads mock config and never consumes real MC tasks.
pi-orchestrator → Specialist: BROKEN because mock mode means pi-orchestrator never fires a Task dispatch to a real specialist.
Specialist → Tool → Artifact: works when agents are dispatched by John manually (not via pi-orchestrator).
Artifact → mc.js done (via hooks): works. The hook layer (12 active hooks) enforces gates on mc.js writes.
mc.js done → verifier: ABSENT. No automated trigger. CEO is the de-facto verifier (source: 2.2).
mc.js done → pi-orchestrator: BROKEN. Mock mode means pi-orchestrator does not react to task completions.

Section B — Data Plane Diagram

Shows all memory stores with their actual write paths (solid = live, dotted red = dead, dotted orange = partial/degraded).

flowchart LR
    CC["Claude Code\n(built-in auto-memory)"]
    MDFILES[".md auto-memory files\n~/.claude/projects/-Users-makinja/memory/\n123 files — LIVE"]
    HOOK_LR["lightrag-auto-ingest.sh\nPostToolUse hook\nfires on Write/Edit to in-scope paths"]
    LR["LightRAG\nlocalhost:9621\n999 docs indexed\npipeline_busy=true\nHEALTHY but DEGRADED"]
    DISCOVER["discover.js\nhttps://lightrag.alai.no/query\n(external hostname — Caddy proxy)"]

    IQ["ingest-queue.sqlite\n~/system/state/\n946 items FROZEN"]
    RDW["rag-drain-worker\nPID 3640\nETIMEDOUT on Vaultwarden"]
    RBA["rag-bookstack-adapter\nevery 5min — exit 256\nblocked by backpressure"]
    RMCA["rag-mc-adapter\nevery 5min — exit 256\nblocked by backpressure"]
    RFSEA["rag-fsevents-adapter\nWatchPaths — exit 1\nblocked by backpressure"]
    BKS["BookStack\ndocs.alai.no"]
    MCLOG["mc-task-outcomes.jsonl\n~/system/logs/"]

    MEM0["mem0 API\nlocalhost:9000\nHEALTHY — 0 active writers"]
    QDR["Qdrant\nlocalhost:6333\n5 collections\n93,510 total vectors"]
    MEM0J["mem0_john collection\n865 vectors — STALE"]
    KNOW["knowledge collection\n31,274 vectors — STALE\nunknown origin"]
    SESS["sessions collection\n929 vectors — unknown writer"]
    HIVE_Q["hivemind collection\n60,442 vectors — LIVE"]
    HIVEJS["hivemind.js CLI\ndual-write on post"]
    HIVEDB["HiveDB SQLite\nhivemind.db\n17,551 intel rows — LIVE"]

    CHROMA["Chroma\n~/.claude-mem/chroma/\n6,584 embeddings\nno active writer or reader"]

    FLYWHEEL["flywheel.db SQLite\n~/system/databases/\nLIVE — rag-router.js cache"]
    RAG_ROUTER["rag-router.js\ncache → Ollama → external"]

    CC -- "native write" --> MDFILES
    MDFILES -- "PostToolUse trigger" --> HOOK_LR
    HOOK_LR -- "curl POST localhost:9621" --> LR
    LR -- "serves queries" --> DISCOVER

    BKS -- "poll every 5min" --> RBA
    MCLOG -- "tail" --> RMCA
    RBA -- "enqueue" --> IQ
    RMCA -- "enqueue" --> IQ
    RFSEA -- "enqueue" --> IQ
    IQ -- "drain attempt" --> RDW
    RDW -. "DEADLOCKED\nVaultwarden ETIMEDOUT\nCF Access creds missing\n946 items queued, 0 drained" .-> LR

    HIVEJS -- "write" --> HIVEDB
    HIVEJS -- "dual-write best-effort" --> HIVE_Q
    HIVE_Q --> QDR
    MEM0 --> QDR
    QDR --> MEM0J
    QDR --> KNOW
    QDR --> SESS
    QDR --> HIVE_Q

    CC -. "INTENDED: POST localhost:9000/add\nACTUAL: ABSENT\n0 callers in hooks/tools/daemons" .-> MEM0
    DISCOVER -. "INTENDED: query mem0 for personal facts\nACTUAL: ABSENT\ndiscover.js does not call localhost:9000" .-> MEM0

    CHROMA -. "writer UNKNOWN\nreader UNKNOWN\n6584 embeddings orphaned" .-> CHROMA

    RAG_ROUTER -- "learn" --> FLYWHEEL
    RAG_ROUTER -- "query cache-hit" --> FLYWHEEL

    style RDW fill:#ffcccc,stroke:#cc0000
    style IQ fill:#ffcccc,stroke:#cc0000
    style MEM0 fill:#fff0cc,stroke:#cc8800
    style MEM0J fill:#ffcccc,stroke:#cc0000
    style KNOW fill:#ffcccc,stroke:#cc0000
    style CHROMA fill:#ffcccc,stroke:#cc0000
    style SESS fill:#fff0cc,stroke:#cc8800

Key findings:

The LightRAG local write path (Claude Code → .md → hook → LightRAG) works, but the queue-drain path (946 items from BookStack, MC logs, fsevents) is completely deadlocked because rag-drain-worker cannot authenticate through Cloudflare Access (Vaultwarden ETIMEDOUT).
mem0 is a ghost: server alive, 93K+ vectors in Qdrant, zero active writers, zero active readers through the API.
Chroma is a full orphan: 6,584 embeddings from an unknown writer, no identified reader.
The Qdrant hivemind collection (60K+ vectors) is live because hivemind.js writes to it directly, bypassing the mem0 API entirely — this is the only healthy Qdrant write path.

Section C — Agent / Persona / Chain Plane

flowchart TD
    SMJ["specialist-mapping.json\n~/system/agents/specialist-mapping.json\n29 mapped agents\n9 registered companies\nSOURCE OF TRUTH (incomplete)"]

    CLAUDE_AGENTS["~/.claude/agents/\n66 .md files\nRUNTIME STORE\n(what Claude Code can dispatch)"]
    DEFINITIONS["~/system/agents/definitions/\nBACKUP STORE\n48 synced + 8 definitions-only"]
    SYNC["~/bin/agent-definitions-sync.sh\nMANUAL — not scheduled"]
    PERSONAS["~/system/agents/personas/\n12 persona dirs"]

    P_REAL["8 Routable Companies\nAgentForge, CodeCraft, Finverge\nFlowForge, Proveo, Securion\nSkybound, Vizu\n(partial mapping only)"]
    P_PHANTOM["4 Phantom Companies\nAxiom, Datavera, Resolver, Lexicon\nFull persona dirs, CLAUDE.md, agents/\n0 entries in specialist-mapping.json\nDispatch path = NONE via John routing"]

    CHAINS["~/system/agents/chains/\n35 .yaml files\nNO chain runner exists\nall DEAD as executable automation"]

    MAPPED_OK["24 mapped agents\nreachable on disk\nCAN be dispatched"]
    MAPPED_MISSING["7 mapped agents\nIN specialist-mapping.json\nMISSING from ~/.claude/agents/\ndispatches SILENTLY FAIL\n(dorota-huizinga, hadi-hariri\njames-bach, lee-robinson\nlisa-crispin, minion\nanthropicchief-architect=fully phantom)"]

    UNMAPPED_CRITICAL["Critical unmapped agents\nIN ~/.claude/agents/\nNOT in specialist-mapping.json:\n- validator (44 skill refs)\n- distiller (21 chain refs)\n- mehanik (7 skill refs)\n- evidence-verifier\n- baseline-comparator\n- dzevad-jahic (Lexicon)\n- planner (phantom — in chains only)"]

    UNMAPPED_ORPHAN["11 Orphan agents\nno chain/skill/daemon refs:\n0.md, dr-sarah-chen, Explore\nhelixsupport, indy-dandev\nmaria-santos, meta-agent\nPlan, rag-builder\nredzo-reviewer, thaer-sabri"]

    SMJ --> CLAUDE_AGENTS
    SMJ -. "7 mapped agents\nnot on disk = UNREACHABLE" .-> MAPPED_MISSING
    CLAUDE_AGENTS --> MAPPED_OK
    CLAUDE_AGENTS --> UNMAPPED_CRITICAL
    CLAUDE_AGENTS --> UNMAPPED_ORPHAN
    DEFINITIONS -- "manual sync\n(agent-definitions-sync.sh)" --> CLAUDE_AGENTS
    SYNC -. "not scheduled\ndrift pressure continuous" .-> DEFINITIONS
    PERSONAS --> P_REAL
    PERSONAS --> P_PHANTOM
    P_PHANTOM -. "no routing entry\ndirect session name-drop only\nundocumented and unreliable" .-> CLAUDE_AGENTS
    P_REAL --> SMJ
    CHAINS -. "NO EXECUTOR\n35 YAML files are docs only\nSkills call agents inline\nnot via chain runner" .-> CLAUDE_AGENTS

    style MAPPED_MISSING fill:#ffcccc,stroke:#cc0000
    style P_PHANTOM fill:#fff0cc,stroke:#cc8800
    style CHAINS fill:#ffcccc,stroke:#cc0000
    style UNMAPPED_CRITICAL fill:#fff0cc,stroke:#cc8800

Key findings:

specialist-mapping.json covers only 29 of 66 agents (44%). The two highest-usage agents system-wide — validator (44 skill file refs) and distiller (21 chain refs) — are completely absent from the routing table.
7 agents are mapped (John thinks he can dispatch them) but physically missing from ~/.claude/agents/. Any dispatch attempt silently fails.
35 chain YAML files have no executor. They exist as documentation only — skills invoke agents inline and ignore chain files entirely.
4 phantom companies (Axiom, Datavera, Resolver, Lexicon) have full organizational infrastructure on disk but are completely invisible to John's routing system.
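The first two findings are set differences between the routing table and the runtime agent directory. The sketch below shows that comparison under stated assumptions: how agent names are extracted from specialist-mapping.json is not described in this report, so the caller supplies the mapped names as a set, and the function names are hypothetical.

```python
from pathlib import Path


def agents_on_disk(agents_dir: Path) -> set[str]:
    """Agent names derived from the .md files in the runtime agents directory."""
    return {p.stem for p in agents_dir.glob("*.md")}


def coverage_report(mapped: set[str], on_disk: set[str]) -> dict[str, list[str]]:
    """Split agents into dispatchable, mapped-but-missing, and unmapped groups."""
    return {
        "dispatchable": sorted(mapped & on_disk),
        "mapped_missing": sorted(mapped - on_disk),  # dispatch would silently fail
        "unmapped": sorted(on_disk - mapped),        # invisible to routing
    }
```

Run against ~/.claude/agents/ and the mapped names, `mapped_missing` would correspond to the 7 agents that fail silently, and `unmapped` would contain validator and distiller. Whether an unmapped agent is a defect or an intentional internal-only agent is a policy question the check cannot answer.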

Section D — The True Picture (CEO-readable, 60 seconds)

Plan vs. Reality

The architecture diagram on paper shows: CEO gives task → John gates it through Mehanik → pi-orchestrator dispatches specialists → work gets done → verifier autonomously checks it → mc.js closes the loop.

The actual flow is: CEO gives task → John manually dispatches a specialist in the current conversation → specialist builds → John manually verifies (or CEO does) → John manually calls mc.js done.

Every automatic layer between "task received" and "task closed" is either in mock mode, deadlocked, or simply absent.

The 3 Fattest Dead Edges

Dead Edge 1 — pi-orchestrator in MOCK MODE. The orchestration kernel (PID 75750) is alive and cycling every 30 seconds. It reads alai-config-mock.json. Planka/MC API at localhost:3100 is not listening. The kernel prints "No eligible tasks" and does nothing. Every task that should flow automatically through the factory instead requires John to manually dispatch via conversation. This is the single edge whose repair would convert the factory from "manual assembly" to "automated pipeline."

Dead Edge 2 — RAG drain-worker deadlocked (946 items queued, 0 drained). Three adapters (BookStack, MC logs, filesystem events) successfully enqueue documents into ingest-queue.sqlite. The drain-worker (PID 3640) picks them up and tries to POST to LightRAG through Cloudflare Access — but Vaultwarden times out, so CF credentials cannot be fetched. The entire 946-item queue has been frozen. Meanwhile, the fsevents adapter is watching for filesystem changes and trying to enqueue lightrag-monitor health files — creating a feedback loop where the monitoring system feeds into the broken pipeline it is monitoring. One credential fix (valid /tmp/bw-session + reachable Vaultwarden) unblocks all three adapters simultaneously.

Dead Edge 3 — Verifier auto-invocation ABSENT. The verify-fix-loop skill and its verifier + fix-builder agents are fully specified and internally correct. There is zero wiring to any automated trigger. No hook, no daemon, no pi-orchestrator code calls them. When mc.js ready fires, no verification agent is invoked. CEO is the de-facto quality gate for the entire factory. One wiring point in /task-postflight SKILL.md (Section 2b) would give autonomous verification for non-high-stakes tasks immediately, without new infrastructure.
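Dead Edge 2 combines two mechanisms: a backpressure gate that stops adapters from enqueueing once the queue is too deep (the edge inventory quotes 946 > 500), and a feedback loop in which monitor output is itself enqueued. The sketch below illustrates both. The limit of 500 comes from the edge inventory; the function names and the idea of filtering paths by a `lightrag-monitor` marker are assumptions, not the adapters' actual logic.

```python
BACKPRESSURE_LIMIT = 500  # threshold quoted in the edge inventory (946 > 500)


def may_enqueue(queue_depth: int, limit: int = BACKPRESSURE_LIMIT) -> bool:
    """Adapters enqueue only while the queue depth does not exceed the limit."""
    return queue_depth <= limit


def is_monitoring_artifact(path: str,
                           markers: tuple[str, ...] = ("lightrag-monitor",)) -> bool:
    """True if the path looks like monitor output that should not be re-ingested.

    The marker list is hypothetical; it illustrates one way to break the loop.
    """
    return any(marker in path for marker in markers)
```

With a depth of 946, `may_enqueue` returns False for every adapter, which matches the cascade described above: the drain-worker stalls, the queue stays above the limit, and all three adapters exit non-zero until the queue drains.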

The 3 Highest-Leverage Wire Fixes

Fix 1 — Restore pi-orchestrator real config (L fix, maximum leverage). Determine why alai-config-mock.json loads instead of real config. If Planka is intentionally offline, restore it or point the orchestrator at the real MC API endpoint. This single fix converts the factory from "John as human dispatcher" to "automated task routing." Impact: every other automation layer (specialist dispatch, postflight, cost tracking) becomes meaningful instead of idle.

Fix 2 — Fix rag-drain-worker CF credentials (S fix, unblocks 946-item queue). Ensure Vaultwarden is reachable and /tmp/bw-session is valid for the service token that holds the LightRAG CF Access credentials. This is estimated as a 30-minute fix (refresh session token + verify vault connectivity). Impact: 946 queued RAG items drain, BookStack sync resumes, MC outcome logging resumes, the circular monitoring feedback loop breaks.

Fix 3 — Wire verify-fix-loop into /task-postflight (M fix, eliminates CEO-as-verifier bottleneck). Add a Section 2b to ~/.claude/skills/task-postflight/SKILL.md: after Proveo passes AC checklist, dispatch /verify-fix-loop for docs / system / refactor / polish domain tasks (MAX_LOOPS=3, $5 cap already defined in the skill). This requires no new infrastructure — the skill conversation context already supports Task dispatch. Impact: CEO is removed from the quality loop for the majority of non-high-stakes tasks.
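Fix 3 describes a bounded loop: verify, fix on failure, and stop at a loop limit or a cost cap. The sketch below shows one way such a loop could be structured. It is not the verify-fix-loop skill itself: the callback signatures, the verdict strings, and the exact meaning of MAX_LOOPS (here, the maximum number of fix attempts) are assumptions. The values 3 and $5 and the four domains come from the text above.

```python
from typing import Callable

# Domains named in Fix 3 as eligible for automatic verification.
ELIGIBLE_DOMAINS = {"docs", "system", "refactor", "polish"}


def should_auto_verify(domain: str) -> bool:
    """True if a task in this domain would be routed to the loop."""
    return domain in ELIGIBLE_DOMAINS


def verify_fix_loop(verify: Callable[[], bool], fix: Callable[[], float],
                    max_loops: int = 3,
                    cost_cap_usd: float = 5.0) -> tuple[str, int, float]:
    """Verify, then fix and retry on failure, bounded by loop count and cost.

    `fix` returns the cost in USD of one attempt.
    Returns (verdict, fix attempts made, total spent).
    """
    spent = 0.0
    for attempt in range(max_loops):
        if verify():
            return ("PASS", attempt, spent)
        if spent >= cost_cap_usd:
            # The cap is checked before each fix, so one fix may overshoot it.
            return ("BUDGET_EXHAUSTED", attempt, spent)
        spent += fix()
    # One final check after the last permitted fix.
    return ("PASS" if verify() else "FAIL", max_loops, spent)
```

A FAIL or BUDGET_EXHAUSTED verdict would be the point at which the task escalates to a human, so the CEO is consulted only when the bounded loop cannot converge.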

Section E — Edge Inventory Table

#	From	To	Transport	Status	Evidence	Fix Size
1	CEO	John (orchestrator)	CLI conversation	LIVE	Observed every session	—
2	John	/mehanik gate	Task dispatch / CLI	LIVE	113 cleared tokens in /tmp	—
3	/mehanik gate	Blueprint read	Read tool call	PARTIAL	CB#2 enforced; WARN scores (65/80) pass; missing-MC-ID bypasses gate entirely (2.3)	S
4	/mehanik gate	/prompt-forge	CLI / Task dispatch	LIVE	Observed in token chain	—
5	/prompt-forge	pi-orchestrator	mc.js write / Task	PARTIAL	pi-orch alive but MOCK MODE (1.4)	L
6	pi-orchestrator	Specialist agent	Task dispatch	DEAD	MOCK MODE — "No eligible tasks" every 30s; Planka localhost:3100 not listening (1.4)	L
7	John (manual)	Specialist agent	Task dispatch (CLI)	LIVE	Observed — this is the actual dispatch path	—
8	Specialist agent	Tools (Read/Edit/Bash)	Tool API calls	LIVE	250 live tools verified (1.2)	—
9	Tools	Artifact (file/code)	Write / Edit	LIVE	Standard Claude Code behavior	—
10	Artifact	mc.js ready	mc.js write + hook	LIVE	mc-ready-gate.sh fires; 12 active hooks (2.2)	—
11	mc.js ready	verifier / verify-fix-loop	(absent)	DEAD	0 hooks, 0 daemons, 0 pi-orch code calls verify-fix-loop (2.2)	M
12	mc.js ready	/task-postflight	Manual CLI invocation	PARTIAL	H-tasks only; manual trigger; no auto-invocation (2.2)	M
13	/task-postflight	Proveo / Angie Jones	Task dispatch	LIVE	Skill dispatches angie-jones.md; present on disk (2.2)	—
14	Proveo	mc.js done	mc.js write	LIVE	AC checklist → done path works	—
15	mc.js done	pi-orchestrator (next task)	mc.js event / API	DEAD	MOCK MODE — pi-orch does not react to done events (1.4)	L
16	Claude Code built-in	.md memory files	Native write	LIVE	123 files, auto-written by Claude Code (1.1)	—
17	.md memory files	lightrag-auto-ingest.sh	PostToolUse hook trigger	LIVE	Hook fires on Write/Edit to in-scope paths (1.1)	—
18	lightrag-auto-ingest.sh	LightRAG localhost:9621	curl POST	LIVE	999 docs indexed; pipeline_busy=true (1.1)	—
19	discover.js	LightRAG (external)	HTTPS GET to lightrag.alai.no	LIVE	External hostname via Caddy proxy (1.1)	—
20	rag-bookstack-adapter	ingest-queue.sqlite	SQLite write	DEAD	Exit 256 — backpressure gate (946 > 500) from frozen drain-worker (1.4)	S
21	rag-mc-adapter	ingest-queue.sqlite	SQLite write	DEAD	Exit 256 — same backpressure cascade (1.4)	S
22	rag-fsevents-adapter	ingest-queue.sqlite	SQLite write / WatchPaths	DEAD	Exit 1 — blocked by backpressure; also feeding monitoring artifacts into queue (1.4)	S
23	rag-drain-worker	LightRAG (via CF Access)	HTTPS POST (authenticated)	DEAD	Vaultwarden ETIMEDOUT — CF credentials unavailable; 946 items queued, 0 drained (1.4)	S
24	Any tool/hook/daemon	mem0 API localhost:9000	HTTP POST	DEAD	0 callers found in ~/system/tools or ~/.claude/hooks; ~/system/daemons not scanned exhaustively (1.1)	M
25	discover.js	mem0 API	HTTP GET	DEAD	discover.js does not query localhost:9000 (1.1)	M
26	mem0 API	Qdrant mem0_john collection	gRPC / HTTP	PARTIAL	Server healthy; mem0_john has 865 stale vectors; no active writer to keep them fresh (1.1)	M
27	hivemind.js	HiveDB SQLite	SQLite write	LIVE	17,551 intel rows; write path active (1.1)	—
28	hivemind.js	Qdrant hivemind collection	HTTP (qdrant-client)	LIVE	60,442 vectors; dual-write best-effort (1.1)	—
29	Chroma store	Any consumer	(unknown)	DEAD	6,584 embeddings, no traced writer or reader (1.1)	M
30	agent-definitions-sync.sh	~/.claude/agents/	file copy	PARTIAL	48 files synced; 8 definitions-only agents unreachable at runtime; sync not scheduled (1.3)	S
31	specialist-mapping.json	Dispatch routing	JSON lookup	PARTIAL	29/66 agents mapped; validator (44 refs) and distiller (21 refs) absent; 7 mapped agents missing from disk (1.3)	M
32	35 chain YAML files	chain runner / executor	(absent)	DEAD	No chain runner exists; skills call agents inline; chains are documentation only (1.3)	L
33	John routing	Axiom/Datavera/Resolver/Lexicon	discover.js lookup	DEAD	4 companies absent from specialist-mapping.json; routing impossible via normal path (1.3)	M
34	pi-orch-health monitor	pi-orchestrator health signal	shell script	DEAD	pi-orch-health.sh deleted; last verdict 2026-05-06 CRITICAL; dark since (1.4)	S
35	cost-daily-report daemon	daily cost visibility	shell script	DEAD	cost-daily-report.sh deleted; cost reporting dark since 2026-04-29 — 10 days (1.4)	S
36	mc-ready-gate.sh	Blueprint score enforcement	blueprint-check.js	PARTIAL	Check runs; WARN scores (65, 80) allow dispatch; threshold 90 is advisory only (2.3)	S
37	Mehanik	Session binding validation	token mehanik_session_id	DEAD	All 113 inspected tokens show mehanik_session_id: unknown; cross-session reuse possible (2.3)	S
38	b2-offsite-backup	B2 cloud storage	B2 API	DEAD	403 storage_cap_exceeded; nightly snapshots not landing (1.4)	S
39	litestream	B2 replication stream	B2 API	PARTIAL	Litestream PID alive; separate nightly job fails; live replication status uncertain (1.4)	S
40	slack-bot	Slack WebSocket	Socket Mode	PARTIAL	PID 18046 alive; last crash exit 1; 300min silent at audit time; reconnects on timeout (1.4)	S

Status key:

LIVE — flow confirmed working by tool-verified evidence
DEAD — flow confirmed broken or absent by tool-verified evidence
PARTIAL — flow structurally exists but has gaps, bypass paths, or degraded throughput

Fix size:

S — Small: under 4 hours, single-file or credential change
M — Medium: 1–2 days, new wiring or multi-file coordination
L — Large: 3+ days, architectural change or multi-system coordination
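Edges 20–23 in the table above fail on a backpressure gate (946 queued items against a limit of 500). A minimal sketch of such a gate follows, assuming the queue is a SQLite table; the table name `ingest_queue` and the `status` column are hypothetical and are not taken from the audited adapters.

```python
import sqlite3

MAX_QUEUE_DEPTH = 500  # threshold quoted in the evidence for edge 20


def queue_depth(conn):
    """Count queued rows; `ingest_queue` and `status` are hypothetical names."""
    row = conn.execute(
        "SELECT COUNT(*) FROM ingest_queue WHERE status = 'queued'"
    ).fetchone()
    return row[0]


def try_enqueue(conn, payload, max_depth=MAX_QUEUE_DEPTH):
    """Insert one item unless the queue is above the backpressure threshold."""
    if queue_depth(conn) > max_depth:
        return False  # a real adapter would exit non-zero at this point
    conn.execute(
        "INSERT INTO ingest_queue (payload, status) VALUES (?, 'queued')",
        (payload,),
    )
    conn.commit()
    return True


# Demo: reproduce the reported depth of 946 in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ingest_queue (id INTEGER PRIMARY KEY, payload TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO ingest_queue (payload, status) VALUES (?, 'queued')",
    [(f"doc-{i}",) for i in range(946)],
)
accepted = try_enqueue(conn, "new-doc")
print(accepted, queue_depth(conn))  # False 946
```

With the drain-worker frozen, every producer hits the same check, which is why one stalled consumer shows up as several DEAD edges.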

Summary Statistics

Category	Count
Total edges inventoried	40
LIVE	15
DEAD	16
PARTIAL	9
Edges repairable with S fix	10
Edges repairable with M fix	8
Edges repairable with L fix	3

The factory has a 37.5% live edge rate. The remaining 62.5% of advertised flows are either dead or degraded. The 3 L-fixes (pi-orchestrator mock mode, chain runner, verifier auto-invocation architecture) unblock the most downstream flows if resolved. The 10 S-fixes are individually cheap and collectively close significant operational blind spots (cost reporting, RAG drain, blueprint score enforcement, monitoring, B2 backup).
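The live edge rate is a plain share of the inventoried edges. As a sketch of the arithmetic, assuming the statuses are available as a flat list (the `NOT_LIVE` label is only an illustrative grouping of DEAD and PARTIAL):

```python
from collections import Counter


def edge_rates(statuses):
    """Return the share of edges per status as a percentage of the total."""
    counts = Counter(statuses)
    total = sum(counts.values())
    return {status: 100.0 * n / total for status, n in counts.items()}


# 15 LIVE edges out of 40; everything else grouped as NOT_LIVE for illustration.
statuses = ["LIVE"] * 15 + ["NOT_LIVE"] * 25
rates = edge_rates(statuses)
print(rates["LIVE"], rates["NOT_LIVE"])  # 37.5 62.5
```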

Inventory: Memory Plane

Memory Plane Inventory — AI Factory Audit

Date: 2026-05-09
Auditor: Chip Huyen (AgentForge)
Scope: Read-only probe. No mutations.
Task: Plan Task 1.1 — Memory Plane Inventory

1. Per-Store Table

Store	Endpoint / Path	Schema / Collections	Live Count	Write Path	Read Path	Owner Daemon	Status
mem0 / Qdrant	`http://localhost:9000` (mem0 API) / `http://localhost:6333` (Qdrant HTTP API; gRPC uses a separate port, 6334 by default)	5 collections: `mem0migrations` (0 pts), `sessions` (929 pts), `hivemind` (60,442 pts), `mem0_john` (865 pts), `knowledge` (31,274 pts)	93,510 total vectors	No caller found. mem0 API (`POST /add`) is NEVER called by any hook, tool, or daemon in `~/system/tools/` or `~/.claude/hooks/`. `hivemind.js` dual-writes to Qdrant `hivemind` collection directly via internal HTTP (port 6333).	No tool reads `localhost:9000` for queries. `hivemind.js semantic search` reads Qdrant `hivemind` collection directly via `qdrant-client`. `discover.js` does NOT query mem0.	`com.alai.mem0-server` (LaunchAgent, KeepAlive=true, PID 65706 alive, last exit was SIGTERM -15)	HEALTHY (server alive, but ORPHANED — no producer writes to `mem0_john` or `knowledge` via the mem0 API)
Chroma	`~/.claude-mem/chroma/chroma.sqlite3`	1 collection: `cm__claude-mem`	6,584 embeddings	Unknown — no daemon or hook references `claude-mem` path in scanned tools. Likely written by a `claude-mem` MCP server or CLI tool directly.	Unknown — no caller found in `~/system/tools/` or `~/.claude/hooks/`.	None identified	PARTIAL (data exists, producer and consumer both untraced)
LightRAG	`http://localhost:9621`	Neo4j graph + NanoVectorDB + JsonKV storage; workspace `/app/data`	999 processed docs, 1 failed (pipeline_busy=true, 120 async locks pending — actively ingesting)	`~/.claude/hooks/lightrag-auto-ingest.sh` (PostToolUse: Write/Edit) — fires on writes to `~/.claude/projects/-Users-makinja/memory/*.md`, `~/system/specs/*.md`, and `/tmp/*-bookstack-*.md`. Also `com.alai.lightrag-outbox-ingest.plist` daemon.	`discover.js` — primary read path. Queries `https://lightrag.alai.no/query` (external hostname, not localhost). Fallback: if local hits < 3, LightRAG fallback fires.	`com.alai.lightrag-watchdog.plist`, `com.alai.lightrag-keepwarm.plist`, `com.alai.lightrag-backup.plist`, `com.john.lightrag-monitor.plist`, `com.alai.lightrag-migrate-pump.plist`	HEALTHY (serving, ingesting)
HiveDB (SQLite)	`~/system/agents/hivemind/hivemind.db`	7 tables: `agents` (139 rows), `memos` (100 rows), `intel` (17,551 rows), `subscriptions` (6 rows), `_litestream_seq`, `_litestream_lock`, `sqlite_sequence`	17,551 intel rows (NOTE: context memo said 64,889 — live probe shows 17,551; delta likely from live deletions or memo was stale)	`hivemind.js post <agent> <type> <message>` — agents call this CLI to write intel. Also dual-writes embeddings to Qdrant `hivemind` collection (best-effort, fire-and-forget).	`hivemind.js read/query/search` — text search + semantic search (cosine sim against local embeddings or Qdrant). `discover.js` does NOT query HiveDB directly.	`hivemind.js` (stateless CLI, no daemon; called ad-hoc by agents)	HEALTHY
.md auto-memory	`~/.claude/projects/-Users-makinja/memory/`	123 `.md` files (MEMORY.md index + per-topic files + feedback memos + _archive/)	123 files	Claude Code's built-in auto-memory system (native Claude Code feature — writes `.md` files after conversations automatically, not via any explicit hook or daemon). `lightrag-auto-ingest.sh` PostToolUse hook then ingests these into LightRAG when they are written/edited.	CLAUDE.md "Context Loading" section instructs John to `Read` specific files directly. `discover.js memory "<topic>"` is documented as LightRAG-backed (reads LightRAG, not the .md files directly).	Built-in Claude Code (no external daemon)	HEALTHY (write path functional; read path partially bypassed — LightRAG index only 999 docs, not all 123 .md files confirmed ingested)

2. Producer → Consumer Matrix

Producer	Store Written	Consumer	Notes
Claude Code built-in auto-memory	`~/.claude/projects/-Users-makinja/memory/*.md` (123 files)	`lightrag-auto-ingest.sh` hook (secondary producer → LightRAG)	Auto-memory is Claude Code native. The .md write triggers the hook.
`lightrag-auto-ingest.sh` (PostToolUse hook)	LightRAG `http://localhost:9621`	`discover.js` (primary RAG consumer)	Only fires on Write/Edit tool calls to in-scope paths. Does NOT write to mem0.
`com.alai.lightrag-outbox-ingest.plist` daemon	LightRAG	`discover.js`	Batch ingest pipeline for outbox staging
`hivemind.js post` (called by agent tools)	HiveDB SQLite `hivemind.db` + Qdrant `hivemind` collection (dual-write)	`hivemind.js read/query/search` (CLI)	Qdrant `hivemind` = 60,442 vectors; SQLite intel = 17,551 rows — divergence suggests Qdrant has historical vectors beyond current SQLite rows (possibly from bulk migration)
NOBODY	mem0 API (`localhost:9000/add`) — `mem0_john` collection (865 pts), `knowledge` collection (31,274 pts)	NOBODY reads via mem0 API either	WIRE BREAK: mem0_john has 865 facts that were presumably written at some point (possibly during initial mem0 setup / manual population), but no current tool, hook, daemon, or agent calls `POST localhost:9000`. The mem0 API is a running server with no active clients.
NOBODY identified	Chroma `~/.claude-mem/chroma/` (6,584 embeddings)	NOBODY identified	Chroma has data (6,584 embeddings in `cm__claude-mem`) but producer and consumer are both untraced in current tooling. Likely written by a `claude-mem` MCP tool in a previous iteration.
`com.john.session-archiver.plist`	Likely `sessions` Qdrant collection (929 pts)	`discover.js --sessions` (reads sessions SQLite, not Qdrant)	Sessions exist in Qdrant but `discover.js` reads from a local SQLite `sessions` table, not via mem0 or Qdrant API
`rag-router.js learn`	`~/system/databases/flywheel.db` (SQLite: interactions + rag_cache)	`rag-router.js query` (cache-hit path)	Sixth store — flywheel SQLite, not listed in original inventory. Routes: cache → local Ollama → external. Does not touch mem0.
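The routing order noted for the flywheel store (cache → local Ollama → external) is a tiered fallback. The sketch below shows that pattern only; the three backends are hypothetical stand-ins, and writing answers back to the cache is an assumption about what the `learn` step does, not a description of `rag-router.js`.

```python
def route_query(query, cache, local_model, external_model):
    """Try the cache, then a local model, then an external model.

    Returns (answer, tier). All three backends are hypothetical stand-ins.
    """
    if query in cache:
        return cache[query], "cache"
    answer = local_model(query)
    if answer is not None:
        cache[query] = answer  # assumed write-back, so repeats hit the cache
        return answer, "local"
    answer = external_model(query)
    cache[query] = answer
    return answer, "external"


cache = {}
local = lambda q: "local answer" if "simple" in q else None
external = lambda q: "external answer"

first = route_query("hard question", cache, local, external)
second = route_query("hard question", cache, local, external)
print(first, second)
# ('external answer', 'external') ('external answer', 'cache')
```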

3. SoR Gap Analysis — Duplicated Fact Classes

Fact Class	Stores Containing It	Designated SoR	Derivative / Shadow	Gap / Conflict
Agent intel / decisions	HiveDB `intel` table (17,551 rows) + Qdrant `hivemind` collection (60,442 vectors)	HiveDB SQLite (primary; `hivemind.js` writes here first)	Qdrant `hivemind` (dual-write, best-effort)	60,442 Qdrant vectors vs 17,551 SQLite rows = 3.4x divergence. Qdrant likely contains orphaned vectors from deleted/purged SQLite rows, or a bulk historical migration that wasn't reflected in SQLite. No reconciliation daemon exists.
Session summaries / history	Qdrant `sessions` (929 pts) + likely local session SQLite (referenced by `discover.js`) + `.md` memory files (MEMORY.md index)	Undefined — no explicit SoR designation	All three are partial	`discover.js --sessions` reads SQLite, not Qdrant `sessions`. Who writes Qdrant `sessions`? Untraced.
John's personal facts / preferences	mem0 `mem0_john` collection (865 vectors) + `.md` auto-memory files (123 files) + LightRAG (999 docs, subset overlapping .md files)	Intended SoR: mem0 (`mem0_john`) — but NO active writer. Actual SoR: `.md` files (Claude Code writes here).	LightRAG is downstream derivative of .md files via `lightrag-auto-ingest.sh`	Critical SoR conflict: 865 facts in mem0 are STALE (last written at setup, no ongoing writes). 123 .md files are current. LightRAG is a partial index of .md files. Three stores claim the same fact class with no reconciliation.
Knowledge base / operational docs	mem0 `knowledge` collection (31,274 vectors) + LightRAG (999 docs, BookStack exports) + Chroma (6,584 embeddings)	Undefined	All three parallel	`knowledge` collection in mem0 has 31,274 vectors — largest in mem0, but again no active writer via mem0 API. Origin unknown. Chroma `cm__claude-mem` (6,584) is also an orphan with no identified current writer or reader.
HiveMind broadcast intel	Qdrant `hivemind` collection (60,442) + HiveDB SQLite `intel` (17,551)	HiveDB SQLite is the write authority	Qdrant `hivemind` is derivative (dual-write from `hivemind.js`)	No `hivemind` HTTP API exists (confirmed: port 3001 is Drop API). Qdrant `hivemind` is only queryable via `hivemind.js` semantic search CLI, not accessible to other tools.
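Since no reconciliation daemon exists, the Qdrant-versus-SQLite divergence can only be sized by comparing IDs. A minimal sketch of that comparison follows, assuming each vector payload carries the `id` of its `intel` row; that linkage is an assumption, and the vector IDs here are a hypothetical sample rather than a real Qdrant export.

```python
import sqlite3


def find_orphans(conn, vector_ids):
    """Split IDs into vectors without a row and rows without a vector."""
    sqlite_ids = {row[0] for row in conn.execute("SELECT id FROM intel")}
    vector_ids = set(vector_ids)
    return {
        "orphaned_vectors": sorted(vector_ids - sqlite_ids),
        "unembedded_rows": sorted(sqlite_ids - vector_ids),
    }


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE intel (id INTEGER PRIMARY KEY, message TEXT)")
conn.executemany(
    "INSERT INTO intel (id, message) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (4, "d")],
)
# Hypothetical payload IDs as they might be sampled from the vector store.
report = find_orphans(conn, [1, 2, 3, 5])
print(report)
# {'orphaned_vectors': [3, 5], 'unembedded_rows': [4]}
```

A large `orphaned_vectors` set would point to deleted rows without vector cleanup; a large `unembedded_rows` set would point to failed best-effort writes.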

4. Critical: The .md vs mem0 Wire Break

What was supposed to happen

The architecture assumes mem0 (http://localhost:9000) is the structured personal memory SoR for John. The mem0_john collection exists with 865 facts. The sessions collection has 929 entries. The server is alive and healthy.

What actually happens

Step 1 — .md files are written by Claude Code natively.
Claude Code has a built-in auto-memory feature that writes conversation summaries and facts as .md files into ~/.claude/projects/-Users-makinja/memory/. This is NOT a hook or daemon — it is a built-in Claude Code behavior. No line of code in ~/system/ controls this write.

Step 2 — lightrag-auto-ingest.sh hooks into the .md write.
File: ~/.claude/hooks/lightrag-auto-ingest.sh (PostToolUse on Write/Edit).
This hook detects when a .md file is written to ~/.claude/projects/-Users-makinja/memory/*.md and fires a background curl POST to LightRAG (http://localhost:9621/documents/text). This is the ONLY downstream pipeline from .md files.

Step 3 — mem0 API is never called.
Grep across all of:

~/system/tools/*.js — 0 files call localhost:9000
~/.claude/hooks/*.sh — 0 files call localhost:9000
~/system/daemons/ — not scanned exhaustively but mem0-server plist confirms it's only a server, not a writer
pi-orchestrator.js — the one hit for localhost:9000 is SonarQube (port 9000 collision), not mem0

The exact wire break: no POST http://localhost:9000/add call was found anywhere in the scanned paths. The mem0 server was built and populated (865 facts in mem0_john, 31,274 in knowledge) at some point, likely during initial setup or a one-time migration, but the "auto-write to mem0" integration was never wired into the live pipeline. The lightrag-auto-ingest.sh hook was written instead, routing .md → LightRAG and leaving mem0 as an orphaned store with stale data.
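The caller search behind Step 3 can be sketched as a recursive scan for the mem0 address. This is an illustration of the technique, not the command the auditor ran. Each hit still needs manual review, because port 9000 is shared with SonarQube, as the pi-orchestrator.js hit shows.

```python
import re
import tempfile
from pathlib import Path

PATTERN = re.compile(r"localhost:9000")


def find_callers(roots, suffixes=(".js", ".sh")):
    """Return (path, line_number, line) for every line matching PATTERN."""
    hits = []
    for root in roots:
        for path in sorted(Path(root).rglob("*")):
            if not path.is_file() or path.suffix not in suffixes:
                continue
            text = path.read_text(errors="ignore")
            for number, line in enumerate(text.splitlines(), start=1):
                if PATTERN.search(line):
                    hits.append((str(path), number, line.strip()))
    return hits


# Demo on a throwaway directory rather than the real tool tree.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.js").write_text("fetch('http://localhost:9621/query')\n")
    Path(tmp, "b.js").write_text("// sonar\nconst SONAR = 'http://localhost:9000';\n")
    hits = find_callers([tmp])
    print(len(hits), hits[0][1])  # 1 2
```

Passing the daemons directory as an extra root would close the gap left by the non-exhaustive scan.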

CEO complaint root cause confirmed: "implementation is not ideal — memory writes to .md files instead of mem0" is accurate. The intended SoR (mem0) has no active producer. The actual write path is: Claude Code → .md files → lightrag-auto-ingest.sh → LightRAG. mem0 is running, healthy, and populated with 865+31,274 stale vectors that nobody reads.

HiveDB relationship

HiveDB (hivemind.db) is a SEPARATE concern from personal memory. It is the agent broadcast / intel bus, not John's fact store. However, the Qdrant hivemind collection (60,442 vectors) lives in the same Qdrant instance as mem0_john, creating the appearance of a unified store when it is actually two separate logical systems sharing infrastructure.

5. Store Status Summary

Store	Healthy?	Active Producer?	Active Consumer?	Data Fresh?
mem0 / Qdrant `mem0_john`	Yes	NO	NO	NO — 865 facts, stale
mem0 / Qdrant `knowledge`	Yes	NO	NO	NO — 31,274 vectors, stale
mem0 / Qdrant `sessions`	Yes	Unknown	NO	Unknown
mem0 / Qdrant `hivemind`	Yes	Yes (hivemind.js dual-write)	Yes (hivemind.js semantic search)	YES
HiveDB SQLite	Yes	Yes (hivemind.js CLI)	Yes (hivemind.js CLI)	YES — 17,551 rows
LightRAG	Yes	Yes (lightrag-auto-ingest.sh hook + outbox daemon)	Yes (discover.js)	YES — 999 docs, pipeline busy
Chroma	Yes (file exists)	UNKNOWN	UNKNOWN	Unknown origin
.md auto-memory	Yes	Yes (Claude Code native)	Partial (direct Read + LightRAG index)	YES — 123 files
Flywheel SQLite	Presumed yes	Yes (rag-router.js learn)	Yes (rag-router.js query)	Unknown

Open Questions

Chroma write/read path: Who wrote 6,584 embeddings to ~/.claude-mem/chroma/cm__claude-mem? Which tool or MCP server reads from it? The claude-mem MCP is referenced in settings but not found in scanned tool code. Needs: grep -r "claude-mem\|chroma" ~/.claude/settings.json and MCP server registry audit.
Qdrant sessions writer: Who writes 929 session vectors to the sessions Qdrant collection? com.john.session-archiver.plist is a candidate but the script path was not read. Needs: cat ~/Library/LaunchAgents/com.john.session-archiver.plist + script inspection.
Qdrant knowledge origin: 31,274 vectors in knowledge — when were they written and from what source? No active writer found. Possible: one-time BookStack bulk ingest or a migration. Check ~/system/mem0/server.py for any bulk-load routines at startup.
HiveDB vector divergence: 60,442 Qdrant vectors vs 17,551 SQLite intel rows. Are the extra ~43K vectors orphaned (deleted SQLite rows without Qdrant cleanup), or does Qdrant have independent content? Needs: sample Qdrant payload IDs vs SQLite id column cross-check.
LightRAG external hostname: discover.js queries https://lightrag.alai.no/query (external URL from config), not http://localhost:9621. Is there a Caddy/Cloudflare proxy routing lightrag.alai.no → localhost:9621? If that proxy is down, discover.js would silently fail to read from LightRAG despite the local container being healthy.
mem0_john 865 facts provenance: When were these written? Is there a one-time ingestion script (e.g., ~/system/mem0/populate.py or similar)? If the facts are high-quality (personal preferences, CEO directives), they are the most actionable store to re-wire as the active SoR.
rag-router.js flywheel.db size and health: Not probed live. Needs sqlite3 ~/system/databases/flywheel.db "SELECT count(*) FROM interactions; SELECT count(*) FROM rag_cache;".
mem0 server.py — does it expose /add or /search routes?: Confirmed health endpoint works. Need to verify actual API surface to confirm if a PostToolUse hook calling POST localhost:9000/add would work as-is without code changes to mem0.

Inventory: Tools Shed

Tools Shed Audit — 2026-05-09

Audit Scope: ~/system/tools/ (443 files on disk)
Manifest Version: ~/system/tools/manifest-index.md (282 rows, last update 2026-04)
Audit Date: 2026-05-09
Auditor: John (Explore Agent, read-only)

Summary

Classification	Count	Pct
LIVE (referenced in daemons/agents/skills/chains)	~250	56.4%
.BAK / .pre- / .deployed*	50	11.3%
JUNK (malformed name, 0-byte, JSON-as-filename)	3	0.7%
DEAD-CODE (no caller, not in manifest LIVE list)	~100	22.6%
UNCLASSIFIED (catalog gaps, unclear status)	~40	9.0%

Total Disk Space: 502 MB (dominated by .venv/ + subdirectory trees)

1. Total Counts by Classification

Live Tools (ACTIVE status in manifest or active daemon references)

Count: ~250 tools
Source: manifest-index.md lists 201 ACTIVE entries (pre-2026-04), plus ~49 tools in daemons/ that were added post-manifest update.

Top-tier LIVE tools (by size):

mc.js (250 KB) — Mission Control CLI, last modified 2026-05-08 ✓ CURRENT
mc-dashboard.js (170 KB) — dashboard, last modified 2026-04-06
manifest.md (94 KB) — full manifest (separate from manifest-index.md)
auto-report.js (51 KB) — daily/weekly report generator
slack-bot.js (49 KB) — Slack daemon
invoice-generator.js (48 KB) — invoice CRUD
event-handlers.js (46 KB) — event dispatch
mail-native.js (40 KB) — IMAP/SMTP fallback

Backup Files (.bak, .pre-, .deployed)

Count: 50 files
Location Clusters:

_archive/2026-04/ — 20 files (manifest.md, mc.js, qa-19.js, event-handlers.js, comms-responder.js variants, kimi-*, youtube-learning, slack-bot.js variants, rag-context-for-builder.js, resource-governor.js)
Root level — 30 files (autocoder.js.pre-azure-cutover-20260419, lightrag*.pre-azure-cutover, mc.js.bak-* variants, comms-*, council-*, mini-da, ollama-*, prompt-tester, rag-*, retrieval-orchestrator.pre-*, system-regression.pre-*, transcript-*, vector-*)

Age Analysis (sample):

Mar 07–14, 2026 (56–63 days old) — oldest: resource-governor.js.bak, kimi-server.sh.bak, kimi-monitor.js.bak
Apr 02, 2026 (37 days old) — mc.js.bak-aaos-20260402
Apr 10–20, 2026 (19–29 days old) — most common, pre-azure-cutover-* batch (highest density)
Apr 30, 2026 (9 days old) — bulk-dated backup cluster (appears to be organized archive pass)

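All .bak files are > 14 days old by original backup date (the Apr 30 timestamps on the archive cluster appear to reflect the archive pass, not the age of the backups). Safe for archival per planning assumptions.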
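The ages in this analysis follow from the date stamps embedded in the backup filenames. A small sketch of that calculation is below; the regular expression and the helper name are illustrative, and the audit date is taken from this report.

```python
import re
from datetime import date

AUDIT_DATE = date(2026, 5, 9)
STAMP = re.compile(r"(20\d{2})(\d{2})(\d{2})")


def backup_age_days(filename, today=AUDIT_DATE):
    """Age in days from the first YYYYMMDD stamp in the name, or None."""
    match = STAMP.search(filename)
    if match is None:
        return None  # no stamp in the name; fall back to file mtime
    year, month, day = (int(part) for part in match.groups())
    return (today - date(year, month, day)).days


for name in (
    "mc.js.bak-aaos-20260402",
    "resource-governor.js.bak-20260310-184907",
    "health-monitor-anvil.js.bak",
):
    print(name, backup_age_days(name))
# mc.js.bak-aaos-20260402 37
# resource-governor.js.bak-20260310-184907 60
# health-monitor-anvil.js.bak None
```

Filename stamps and filesystem timestamps can disagree after a bulk move, so it helps to state which one an age figure is based on.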

Junk Findings

3 malformed/suspect filenames identified:

Credential-bearing JSON-as-filename artifact (0 bytes)
- Created: 2026-02-24 06:39
- Issue: LITERAL JSON object with test credentials embedded as filename
- SECURITY RISK: Credentials (passwords, tokens, keys) encoded in filesystem path
- Source: Appears to be tool output-capture error (shell process writing object serialization instead of text)
- Recommendation: DELETE immediately + audit all tools for output-capture leaks + add alai-hooks gate
.alai/context-index.db-wal (inside tools/)
- Zero-byte WAL journal file
- Not a proper tool — appears to be SQLite write-ahead log (orphaned)
- Recommendation: DELETE
alai-hooks/.gradle/ subdirectories
- Gradle cache files (0-byte metadata: gc.properties, REQUESTED markers)
- Inside alai-hooks/ (Java/Kotlin project)
- Not tools — system detritus
- Recommendation: move out of tools/ into archive/, keep only alai-hooks source

Zero-byte files: Multiple .REQUESTED, .lock, gc.properties inside Python venv — expected (pip metadata). Not tools.
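A scan for this class of junk can be sketched as a walk that flags object-like names and empty files. The character set used to detect serialized objects is an assumption chosen for illustration, and zero-byte hits inside virtual environments would still need to be filtered out as expected pip metadata.

```python
import tempfile
from pathlib import Path

SUSPECT_CHARS = set('{}":')  # characters typical of a serialized object


def suspicious_files(root):
    """Yield (path, reason) for object-like names and zero-byte files."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if SUSPECT_CHARS & set(path.name):
            yield path, "object-like filename"
        elif path.stat().st_size == 0:
            yield path, "zero-byte file"


# Demo on a throwaway directory with one normal tool and two suspect files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "{token}").write_text("")
    Path(tmp, "empty.lock").write_text("")
    Path(tmp, "mc.js").write_text("// tool\n")
    findings = [(p.name, reason) for p, reason in suspicious_files(tmp)]
    print(findings)
```

The sketch reports only the name and a reason, which keeps any credential-bearing filename out of further processing.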

2. Manifest Drift Analysis

Manifest Entries Scanned: 282 rows (manifest-index.md)

Cross-reference results:

Status	Count	Notes
Exists on disk	~250	All LIVE/ACTIVE referenced tools present
DELETED in manifest, absent from disk	31	Expected (deleted per manifest Sprint 2/3, 2026-02-26)
Referenced in manifest but ARCHIVED	6	docuseal-monitor.js, docuseal-webhook.js, blueprint-runner.js, blueprint-compose.js, etc. — moved to ~/system/archive/replaced-by-n8n-2026-02/
Manifest lists as ACTIVE but STALE (>30d)	~8	intel-briefing.js (Apr 6), council-briefing.js (pre-extract), ollama-workers/* (last mod Mar–Apr)
Subdirectory tools NOT in manifest	~40–60	`comms-agent/`, `browser-use-explorer/`, `alai-hooks/` internal tools (Kotlin, TypeScript, Python) — not catalogued
MANIFEST MISSING entries	15–20	Post-2026-04 additions (tier-router, skill-router, claim-detector, mini-da, drift-detector, tool-sync-audit, tool-dedup-report, multi-client routing, agent-metrics-api, agent-timeout-monitor)

Drift Conclusion: Manifest is ~6 weeks stale. 201 ACTIVE tools documented; ~250–300 actually running (50–100 undocumented, mostly post-Feb architectural shifts + sub-agent frameworks).
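The drift figures come down to two set differences between the manifest and the disk. A minimal sketch is below, with illustrative inputs; real inputs would be parsed from manifest-index.md and a directory listing, and that parsing is not shown.

```python
def manifest_drift(manifest_active, on_disk):
    """Compare tool names listed as ACTIVE with names found on disk."""
    manifest_active, on_disk = set(manifest_active), set(on_disk)
    return {
        "undocumented": sorted(on_disk - manifest_active),
        "missing_from_disk": sorted(manifest_active - on_disk),
    }


# Illustrative names only, picked from tools mentioned in this audit.
drift = manifest_drift(
    manifest_active={"mc.js", "auto-report.js", "docuseal-monitor.js"},
    on_disk={"mc.js", "auto-report.js", "tier-router.js", "skill-router.js"},
)
print(drift)
# {'undocumented': ['skill-router.js', 'tier-router.js'],
#  'missing_from_disk': ['docuseal-monitor.js']}
```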

3. Un-owned LIVE Tools

Tools referenced in daemons or .md but NOT explicitly claimed in manifest ACTIVE list:

Tool	Caller	Owner (inferred)	Status
tier-router.js	agent-runner.js, task-router.js	(unassigned)	LIVE, no owner
skill-router.js	mc.js, plan-enforcer	(unassigned)	LIVE, no owner
claim-detector.js	cove.js, drift-detector	(unassigned)	LIVE, no owner
claim-verifier.js	cove.js, qa-19.js	(unassigned)	LIVE, no owner
drift-detector.js	daemon (daily 23:55)	(unassigned)	LIVE, daemon-run
tool-sync-audit.js	daemon (daily 03:00)	(unassigned)	LIVE, daemon-run
tool-dedup-report.js	daemon (Monday 06:00)	(unassigned)	LIVE, daemon-run
agent-metrics-api.js	agent-orchestrator.js	(unassigned)	LIVE, endpoint
agent-timeout-monitor.js	agent-runner.js	(unassigned)	LIVE, daemon-enforcer
ollama-workers/* (4 tools)	automation (referenced in session-archiver)	(unassigned)	LIVE, utilities
forge-status.js	studio-health.js, emergency-repl	(unassigned)	LIVE
studio-health.js	ops-watchdog, ollama-engine	(unassigned)	LIVE

Implication: 12+ mission-critical tools lack explicit owner/status in manifest. Creates risk of accidental deprecation/orphaning.

4. Stale .bak Files (>14 days old)

All 50 .bak/* files are > 14 days old by original backup date and safe for archival:

Oldest Batch (54–60 days by filename date; safe to archive):

resource-governor.js.bak-20260310-184907 (Mar 10)
kimi-server.sh.bak-20260313-181327 (Mar 13)
kimi-monitor.js.bak-20260313-181327 (Mar 13)
youtube-learning.js.bak-20260316-084904 (Mar 16)
event-handlers.js.bak.20260314-043322 (Mar 14)
ollama-tool-agent.js.bak-20260316-234508 (Mar 16)
qa-19.js.bak.20260314-043322 (Mar 14)
mc.js.bak.20260314-043322 (Mar 14)
mc.js.bak.20260310-184105 (Mar 10)

Mid-range (33–39 days):

mc.js.bak-aaos-20260402 (Apr 2)
mc.js.bak-before-7082-7085 (Apr 2)
health-monitor-anvil.js.bak (Apr 6)
intel-briefing.js.bak (Mar 31)

Recent Batch (9 days; organized archive pass, Apr 30):

_archive/2026-04/* (20 files, all Apr 30 11:25:48)

Recommendation: Move all .bak/* to dated subdirectory (e.g., _archive/2026-05/pre-may/), ZIP for offsite backup.

5. Additional Junk & Quality Findings

Missing Expected Files

Files referenced in manifest but NOT found on disk:

(None critical; all listed DELETED files were already absent per manifest notes)

Suspicious Dead Code

Tool	Symptom	Recommendation
`element-test.js` (114 KB)	No daemon/agent caller, appears test-only	Verify if part of active testing suite or orphaned
`durable-executor.js` (59 KB)	Shadowed by durable-runner.js; unclear distinction	Check if both needed or consolidate
`youtube-learning.js.bak` (backup preserved)	Original .bak exists; unknown if active service	Verify if YouTube integration still used
`resource-governor.js.bak` (backup preserved)	Resource control tool; backed up mid-March	Check if resource-governor.js ever went live

Subdirectories with Nested Tools (Not in Manifest)

~/system/tools/comms-agent/              (TypeScript/Node monorepo)
  src/, dist/          (telegram-handler.ts, index.js with .bak variants)
  package.json, tsconfig.json
  Status: ??? (unclear if actively deployed vs. dev artifact)

~/system/tools/browser-use-explorer/     (Python + Node, 320 MB)
  .venv/lib/python3.12/site-packages/   (pip deps only, not code)
  src/, package.json
  Status: ??? (research tool? dev sandbox?)

~/system/tools/alai-hooks/               (Kotlin/Java, binary CLI)
  gradle/, src/        (Kotlin security enforcement, codesigned binary)
  Status: ACTIVE (referenced in mc.js, alai-hooks command used in hooks)
  Note: Gradle .gradle/ cache should be archived

Finding: 3 subdirectories (about 377 MB combined per the size table in section 6) are not documented in manifest. Unclear which are active, which are dev/research.

6. Top-10 Largest Tools

Rank	Tool	Size	Last Modified	Status
1	browser-use-explorer/	320 MB	Apr 28	??? (venv=280MB)
2	comms-agent/	45 MB	Apr 1	??? (node_modules=40MB)
3	alai-hooks/	12 MB	May 6	ACTIVE (Kotlin binary)
4	mc.js	250 KB	May 8	LIVE
5	mc-dashboard.js	170 KB	Apr 6	LIVE
6	manifest.md	94 KB	Apr 14	Reference doc
7	pipeline-controller.js	58 KB	Feb 26	LIVE
8	auto-report.js	51 KB	Apr 24	LIVE
9	slack-bot.js	49 KB	Apr 6	LIVE
10	invoice-generator.js	48 KB	Feb 17	LIVE

Observation: A single Python project with its .venv (browser-use-explorer) consumes roughly 64% of ~/system/tools/ disk (320 MB of 502 MB).

If research/PoC only: move to ~/projects/ or ~/backups/
If production: document in manifest + verify active daemon

7. Live References — Tool Coverage

Tool consumer analysis (sample grep):

Consumer	Count	Examples
~/system/daemons/	42 scripts	mc-session-worker.sh, email-agent.js, ops-watchdog.js, flywheel-cycle.sh, auto-* (8), daemon-* (5), etc.
~/.claude/agents/*.md	28 files	builder.md, validator.md, resolver.md, linter.md, etc. — each requires 5–10 tools
~/.claude/skills/	80+ skills	Each skill loads ~2–5 tools on demand (via skill-runner.js)
~/system/agents/chains/*.yaml	23 chains	Each chain references 1–3 tools for orchestration
~/.claude/hooks/*.sh	12 hooks	alai-hooks gating, process enforcement, mc claims

Live tool hit count: ~250–280 tools have explicit caller references.

Open Questions

browser-use-explorer/: Is this an active production tool or a research sandbox? If research, should live in ~/projects/. 320 MB allocation is significant.
comms-agent/ subdirectory: Is this a stable deployed service or in-flight TypeScript migration? .bak variants suggest evolution.
alai-hooks/ binary codesigned: Latest mod 2026-05-06; clearly active. Should .gradle/ cache be cleaned or preserved?
50 .bak files: Do we need all 50, or is a rotating keep-last-3-per-tool strategy viable?
Manifest staleness: Should manifest-index.md be auto-refreshed daily (e.g., daemon that re-scans daemons/ + agents/ + chains/) to stay in sync?
12 un-owned tools: Should each be assigned explicit owner + manifest entry, or grouped under "Deterministic Enforcement" or "Agent Infrastructure"?
JSON-as-filename security: When created? Which tool? Did credentials leak to logs? Recommend grep of all logs for exposed secrets.

Recommendations (Audit-Level Only)

CRITICAL

Delete malformed filename immediately: Filename contains embedded credentials. Audit tools/, daemons/, and agents/* for output-capture leaks. Add alai-hooks gate to prevent future output-as-filename incidents.
Security review of JSON filename artifact:
- When was it created? (2026-02-24)
- Which tool created it? (Bash tool capture?)
- Did credentials leak to logs? (Grep logs for exposed patterns)
- Add validation layer to prevent credentials-in-paths
Document or relocate browser-use-explorer/:
- If active: add to manifest, assign owner, set LaunchAgent
- If research: move to ~/projects/ or archive, free 320 MB

HIGH

Refresh manifest-index.md:
- Add 50–60 undocumented post-Feb tools (tier-router, skill-router, claim-*, drift-detector, tool-sync-audit, agent-metrics-api, agent-timeout-monitor, ollama-workers/*, forge-status, studio-health)
- Assign ownership: which persona (CodeCraft, FlowForge, Proveo, Securion)?
- Set explicit LIVE vs. ARCHIVED vs. DEPRECATED status
Archive all .bak files:
- Create ~/system/archive/2026-05-09-bak-sweep/ (ZIP friendly)
- Move 50 .bak* files
- Update manifest with archive location + retention policy
Clarify comms-agent/ status:
- If deployed: verify daemon + manifest entry
- If migration: set deadline for TypeScript cutover or rollback

MEDIUM

Define tool ownership:
- Create manifest section: "Infrastructure Owner Assignments"
- Assign: tier-router, skill-router, claim-*, drift-detector, tool-*, agent-metrics-api, agent-timeout-monitor → explicit team
Automate manifest refresh:
- Create daemon: ~/system/daemons/manifest-refresh.js
- Daily 04:00: scan daemons/, agents/, chains/ → auto-update manifest-index.md
- Hook into mc.js add-tool proposal flow
Standardize .bak naming:
- Policy: max 3 backups per tool, naming = <tool>.<date>.<hash>.bak
- Daemon: daily cleanup of excess backups
Consolidate durable-executor vs. durable-runner:
- Verify both needed; if not, mark one DEPRECATED + migrate callers
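The proposed retention policy (at most 3 backups per tool) can be sketched as a grouping step followed by a cut-off. The input format here, a tuple of tool, YYYYMMDD stamp and filename, is an assumption made for the example; the filenames are ones listed earlier in this audit.

```python
from collections import defaultdict


def excess_backups(backups, keep=3):
    """Return filenames beyond the newest `keep` backups of each tool.

    `backups` is an iterable of (tool, yyyymmdd, filename) tuples.
    """
    by_tool = defaultdict(list)
    for tool, stamp, filename in backups:
        by_tool[tool].append((stamp, filename))
    excess = []
    for entries in by_tool.values():
        entries.sort(reverse=True)  # YYYYMMDD strings sort chronologically
        excess.extend(filename for _, filename in entries[keep:])
    return sorted(excess)


backups = [
    ("mc.js", "20260310", "mc.js.bak.20260310-184105"),
    ("mc.js", "20260314", "mc.js.bak.20260314-043322"),
    ("mc.js", "20260402", "mc.js.bak-aaos-20260402"),
    ("mc.js", "20260402", "mc.js.bak-before-7082-7085"),
    ("qa-19.js", "20260314", "qa-19.js.bak.20260314-043322"),
]
to_remove = excess_backups(backups)
print(to_remove)  # ['mc.js.bak.20260310-184105']
```

A cleanup daemon would archive or delete only the returned names, leaving the newest three per tool untouched.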

Audit Confidence

Area	Confidence	Notes
Backup file count + age	HIGH	All 50 .bak files enumerated, dates verified
Junk file identification	HIGH	JSON-as-filename caught, 0-byte files confirmed
LIVE tool hit count	MEDIUM	Sampled grep coverage; not exhaustive scan of all 443 files
Manifest drift	HIGH	Manifest explicitly marked "2026-02-26" audit; 6+ weeks stale confirmed
Subdirectory status	LOW	comms-agent/ and browser-use-explorer/ require interactive verification
Un-owned tools	MEDIUM	12 inferred from daemon/skill references; could miss some

Audit completed: 2026-05-09 21:15 UTC
Auditor: John (Explore Agent)
Next step: Escalate critical findings (malformed filename, manifest refresh) to CEO/Mehanik.

Inventory: Agent Fleet

Agent Fleet Inventory — SENTINEL Audit 2026-05-09

Auditor: sentinel-architect
Scope: ~/.claude/agents/ vs specialist-mapping.json vs persona dirs vs chains vs definitions dual-store
Status: READ-ONLY. No files modified.

1. 66 vs 29 vs 12 Reconciliation

Raw counts (tool-verified)

Store	Count	Notes
`~/.claude/agents/*.md`	66	Includes 0.md, Explore.md, Plan.md as named agents
`specialist-mapping.json` mappings	29	Key: `mappings` object
`specialist-mapping.json` companies	9	ALAI, AgentForge, CodeCraft, Finverge, FlowForge, Proveo, Securion, Skybound, Vizu
Persona dirs in `~/system/agents/personas/`	12	AgentForge, Axiom, CodeCraft, Datavera, Finverge, FlowForge, Lexicon, Proveo, Resolver, Securion, Skybound, Vizu

Critical gap: 4 of the 12 persona companies are completely absent from specialist-mapping.json:

Axiom: not in company_summary, zero agents mapped
Datavera: not in company_summary, zero agents mapped
Resolver: not in company_summary, zero agents mapped
Lexicon: not in company_summary, zero agents mapped (persona dir exists; skillforge.md maps to "Skillforge", not Lexicon)

That is one third of the persona companies with no presence in specialist-mapping.json.
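
The company gap can be re-derived as a plain set difference. The sketch below hard-codes the two name lists exactly as they appear in the raw-counts table; in a real check they would be read from the persona directory listing and from specialist-mapping.json, whose exact layout is not shown in this audit.

```python
# Names copied from the raw-counts table above.
persona_dirs = {
    "AgentForge", "Axiom", "CodeCraft", "Datavera", "Finverge", "FlowForge",
    "Lexicon", "Proveo", "Resolver", "Securion", "Skybound", "Vizu",
}
mapped_companies = {
    "ALAI", "AgentForge", "CodeCraft", "Finverge", "FlowForge",
    "Proveo", "Securion", "Skybound", "Vizu",
}

# Persona companies with no entry in the mapping.
unmapped_personas = sorted(persona_dirs - mapped_companies)

# Mapped companies that have no persona dir.
mapped_without_persona = sorted(mapped_companies - persona_dirs)

print(unmapped_personas)       # ['Axiom', 'Datavera', 'Lexicon', 'Resolver']
print(mapped_without_persona)  # ['ALAI']
```

The reverse difference is worth reporting too: ALAI is mapped but has no persona dir in the list above.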

Mapped agents (29 in specialist-mapping.json)

Agent file	Company	On disk (~/.claude/agents/)?
alem-clone.md	ALAI	MISSING
angie-jones.md	Proveo	YES
anthropic-chief-architect.md	AgentForge	MISSING
brad-frost.md	Vizu	YES
bruce-momjian.md	CodeCraft	YES
builder.md	CodeCraft	YES
chip-huyen.md	AgentForge	YES
claude-code-guide.md	AgentForge	YES
codecraft.md	CodeCraft	YES
dorota-huizinga.md	Proveo	MISSING
georgi-gerganov.md	AgentForge	YES
hadi-hariri.md	CodeCraft	MISSING
james-bach.md	Proveo	MISSING
kelsey-hightower.md	FlowForge	YES
lea-verou.md	Vizu	YES
lee-robinson.md	CodeCraft	MISSING
lisa-crispin.md	Proveo	MISSING
markos-zachariadis.md	Finverge	YES
martin-kleppmann.md	CodeCraft	YES
parisa-tabriz.md	Securion	YES
paul-hudson.md	Skybound	YES
petter-graff.md	CodeCraft	YES
proveo.md	Proveo	YES
sentinel-architect.md	Securion	YES
sentinel-ba.md	Skybound	YES
sentinel-developer.md	CodeCraft	YES
sentinel-tester.md	Proveo	YES
sentinel-validator.md	Proveo	YES
skillforge.md	Skillforge	YES

7 agents mapped in specialist-mapping.json but MISSING from ~/.claude/agents/:

alem-clone.md — exists in definitions/, not synced to ~/.claude/agents/
anthropic-chief-architect.md — NOT in definitions/ either; completely phantom
dorota-huizinga.md — exists in definitions/, not synced
hadi-hariri.md — exists in definitions/, not synced
james-bach.md — exists in definitions/, not synced
lee-robinson.md — exists in definitions/, not synced
lisa-crispin.md — exists in definitions/, not synced

anthropic-chief-architect.md is the worst case: mapped in specialist-mapping.json, NOT in definitions/, NOT in ~/.claude/agents/ — fully phantom, cannot be dispatched.

42 unmapped agents (in ~/.claude/agents/ but NOT in specialist-mapping.json)

Classification: ORPHAN = nowhere used | DUPLICATE = covered by mapped peer | NEEDS-MAPPING = used in chains/skills but unmapped

Agent	Classification	Reasoning
`0.md`	ORPHAN	No name, no description, artifact
`agentforge.md`	NEEDS-MAPPING	Company persona file; Axiom/Datavera/Resolver equivalents all exist — AgentForge has a persona dir but no company-level mapping entry
`backend-builder.md`	DUPLICATE	Covered by builder.md (CodeCraft, mapped)
`backend-dev.md`	DUPLICATE	Covered by codecraft.md + builder.md
`baseline-comparator.md`	NEEDS-MAPPING	Active agent (Veritas baseline, MLX-backed); used in verify-fix-loop skill; no mapping
`code-reviewer.md`	DUPLICATE	Covered by petter-graff.md / sentinel-developer.md
`code-simplifier.md`	DUPLICATE	Covered by sentinel-developer.md
`database-dev.md`	DUPLICATE	Covered by bruce-momjian.md
`datavera.md`	NEEDS-MAPPING	Company persona file for Datavera (persona dir exists, 0 mapped agents)
`design-builder.md`	DUPLICATE	Covered by brad-frost.md / lea-verou.md
`devils-advocate.md`	NEEDS-MAPPING	Pre-action blocker used in 0 chain yamls but referenced in mehanik flow; unregistered
`devops-dev.md`	DUPLICATE	Covered by kelsey-hightower.md
`distiller.md`	NEEDS-MAPPING	Used in 21 chain yaml steps (highest after builder/validator); no mapping. CRITICAL gap.
`dr-sarah-chen.md`	ORPHAN	No description parsed; no chain/skill references found
`dzevad-jahic.md`	NEEDS-MAPPING	Bosnian linguistic QA (Lexicon company, per CLAUDE.md); not in specialist-mapping.json despite CLAUDE.md routing directive
`evidence-verifier.md`	NEEDS-MAPPING	Active Veritas agent (gemma-4-26b @ FORGE); triggers on mc.js done for H tasks; no mapping
`Explore.md`	ORPHAN	Capital E; appears to be a stub
`finverge.md`	NEEDS-MAPPING	Company persona file for Finverge; persona dir mapped but no company-level agent entry
`fix-builder.md`	NEEDS-MAPPING	Write-only counterpart to verifier; used in verify-fix-loop skill; no mapping
`flowforge.md`	NEEDS-MAPPING	Company persona file for FlowForge; only kelsey-hightower.md individual is mapped
`frontend-builder.md`	DUPLICATE	Covered by lea-verou.md / lee-robinson.md
`frontend-dev.md`	DUPLICATE	Covered by lea-verou.md
`fullstack-dev.md`	DUPLICATE	Covered by codecraft.md
`helixsupport.md`	ORPHAN	Role=coordinator; 0 skill/chain references found
`indy-dandev.md`	ORPHAN	AI research agent (Indian AI + Dan Abramov persona); no chain/skill references; not used in current system
`integration-dev.md`	DUPLICATE	Covered by codecraft.md
`jake-wharton.md`	NEEDS-MAPPING	Android/Kotlin expert (Jake Wharton persona); no AgentForge/Skybound mapping entry
`lexicon.md`	NEEDS-MAPPING	Company persona file for Lexicon (documentation company per CLAUDE.md); 0 agents in specialist-mapping.json
`maria-santos.md`	ORPHAN	No description parsed; no chain/skill references found
`mehanik.md`	NEEDS-MAPPING	Core orchestration gate; referenced in 7 skill files; CLAUDE.md cites /mehanik command as mandatory pre-dispatch gate; completely absent from specialist-mapping.json
`meta-agent.md`	ORPHAN	No chain/skill references found
`Plan.md`	ORPHAN	Capital P; appears to be a stub
`proxima.md`	NEEDS-MAPPING	Marketing/content agent; referenced in 10 skill files; no company assignment
`rag-builder.md`	ORPHAN	No chain/skill references; likely superseded by AgentForge rag-tuning-agent.yaml
`redzo-reviewer.md`	ORPHAN	No chain/skill references found
`resolver.md`	NEEDS-MAPPING	Company persona for Resolver (persona dir exists, 8 internal agents; 0 in specialist-mapping.json)
`securion.md`	NEEDS-MAPPING	Company persona for Securion; parisa-tabriz.md + sentinel-architect.md individually mapped, but no company-level dispatcher
`skybound.md`	NEEDS-MAPPING	Company persona for Skybound; individual members mapped but no company dispatcher
`thaer-sabri.md`	ORPHAN	No description parsed; no chain/skill references found
`validator.md`	NEEDS-MAPPING	Used in 44 skill files and 22 chain yaml steps; one of the most-used agents in the entire system; NOT in specialist-mapping.json. CRITICAL gap.
`verifier.md`	NEEDS-MAPPING	2 skill file references; verify-fix-loop skill; not mapped
`vizu.md`	NEEDS-MAPPING	Company persona for Vizu; brad-frost.md + lea-verou.md individually mapped, no company dispatcher

Summary of 42 unmapped:

ORPHAN: 11 (0.md, dr-sarah-chen.md, Explore.md, helixsupport.md, indy-dandev.md, maria-santos.md, meta-agent.md, Plan.md, rag-builder.md, redzo-reviewer.md, thaer-sabri.md)
DUPLICATE: 11 (backend-builder.md, backend-dev.md, code-reviewer.md, code-simplifier.md, database-dev.md, design-builder.md, devops-dev.md, frontend-builder.md, frontend-dev.md, fullstack-dev.md, integration-dev.md)
NEEDS-MAPPING: 20 (agentforge, baseline-comparator, datavera, devils-advocate, distiller, dzevad-jahic, evidence-verifier, finverge, fix-builder, flowforge, jake-wharton, lexicon, mehanik, proxima, resolver, securion, skybound, validator, verifier, vizu)

Note: 11 + 11 + 20 = 42. The original "37 unmapped" figure understates the gap by 5: it excludes alem-clone.md (mapped but missing from disk) and counts mapped agents as present on disk when they are actually absent.

2. Persona Dirs Deep Dive

All 12 persona dirs have a consistent structure: agents/, blueprints/, brand/, CLAUDE.md, company.json, config.json, legal/, ops/, README.md, skills/, state/, tools/.

Persona	Has README	Has CLAUDE.md	Has company.json	Agents inside (count)	Owner in company.json	In specialist-mapping.json
AgentForge	YES	YES	YES (domain: AI)	8	N/A	Partial (3 individuals mapped, no company dispatcher)
Axiom	YES	YES	YES (domain: ARCHITECTURE)	5	N/A	NO — completely absent
CodeCraft	YES	YES	YES (domain: DEVELOPMENT)	8	N/A	Partial (6 individuals mapped)
Datavera	YES	YES	YES (domain: DATA)	8	N/A	NO — completely absent
Finverge	YES	YES	YES (domain: FINANCE)	9	N/A	Partial (1 individual mapped)
FlowForge	YES	YES	YES (domain: DEVOPS)	10	N/A	Partial (1 individual mapped)
Lexicon	YES	YES	YES (domain: DOCUMENTATION)	9	N/A	NO — skillforge.md maps to "Skillforge" not Lexicon
Proveo	YES	YES	YES (domain: QA)	8	N/A	Partial (6 individuals mapped)
Resolver	YES	YES	YES (domain: SYSTEMIC)	8	N/A	NO — completely absent
Securion	YES	YES	YES (domain: SECURITY)	8	N/A	Partial (2 individuals mapped)
Skybound	YES	YES	YES (domain: PRODUCT)	7	N/A	Partial (2 individuals mapped)
Vizu	YES	YES	YES (domain: DESIGN)	7	N/A	Partial (2 individuals mapped)

Structural finding: All company.json files report owner: N/A. No human/agent owner is recorded for any virtual company. This means there is no machine-readable way to route escalation or accountability.
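
A check for this gap could scan every company.json and report the companies whose owner is empty or N/A. This is a sketch under assumptions: the `owner` key name is taken from the finding above, the directory layout (one company.json per persona dir) from the structure listed in this section, and `companies_without_owner` is a hypothetical helper.

```python
import json
from pathlib import Path


def companies_without_owner(personas_root):
    """Return persona dir names whose company.json has no usable owner.

    A missing key, an empty string, "N/A", or an unreadable file all
    count as "no owner".
    """
    missing = []
    for company_file in sorted(Path(personas_root).glob("*/company.json")):
        try:
            data = json.loads(company_file.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            missing.append(company_file.parent.name)
            continue
        owner = data.get("owner", "") if isinstance(data, dict) else ""
        if str(owner).strip() in ("", "N/A"):
            missing.append(company_file.parent.name)
    return missing
```

Run against ~/system/agents/personas/, the finding above implies it would return all 12 company names.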

Persona vs mapping mismatch:

The per-company counts in the table above sum to 95 agents inside persona dirs; the audit's original tally of 87 does not match that table and should be re-verified. None of these internal PI agents (builder.yaml, lead.yaml, reviewer.yaml, etc.) appear in specialist-mapping.json. specialist-mapping.json only tracks the "celebrity" individual agents, not the PI agent swarms inside each company.

3. Chain Coverage

Agents referenced in chains

Agent	Times referenced in chains	In specialist-mapping.json?	Disk present?
builder	25	YES	YES
validator	22	NO	YES
distiller	21	NO	YES
sentinel-validator	9	YES	YES
minion	5	NO	NOT in ~/.claude/agents/ (in definitions/ only)
planner	4	NO	NOT in ~/.claude/agents/ at all

Critical: minion and planner are referenced in chains but have NO corresponding .md in ~/.claude/agents/.

minion.md exists in ~/system/agents/definitions/ but was never synced forward
planner does not exist in definitions/ or ~/.claude/agents/ — it is a phantom agent referenced in 3 chains (plan-build.yaml, plan-build-review.yaml, plan-review-plan.yaml)
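
The three states in the table above (present on disk, present only in definitions/, present nowhere) can be told apart mechanically. The sketch below uses only the facts stated in this section as input data; `classify_chain_agents` and the state labels are hypothetical names, and extracting the referenced agent names from the chain YAML files is left out.

```python
# Reference counts and locations copied from the chain-coverage table.
CHAIN_REFS = {
    "builder": 25, "validator": 22, "distiller": 21,
    "sentinel-validator": 9, "minion": 5, "planner": 4,
}
ON_DISK = {"builder", "validator", "distiller", "sentinel-validator"}
# Only the entry this section states explicitly; the real store is larger.
IN_DEFINITIONS = {"minion"}


def classify_chain_agents(referenced, on_disk, in_definitions):
    """Label each referenced agent as reachable, unsynced, or phantom."""
    result = {}
    for agent in sorted(referenced):
        if agent in on_disk:
            result[agent] = "reachable"   # .md present in ~/.claude/agents/
        elif agent in in_definitions:
            result[agent] = "unsynced"    # exists, but only in definitions/
        else:
            result[agent] = "phantom"     # no file in either store
    return result


report = classify_chain_agents(CHAIN_REFS, ON_DISK, IN_DEFINITIONS)
problems = {name: state for name, state in report.items() if state != "reachable"}
print(problems)  # {'minion': 'unsynced', 'planner': 'phantom'}
```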

Dead chains (0 references anywhere in skills/ or system/)

Chains that are never invoked via skills or daemons:

Chain	Skill refs	Verdict
codecraft-api-backend.yaml	0	DEAD
codecraft-nextjs-app.yaml	0	DEAD
full-review.yaml	0	DEAD
minion-bugfix.yaml	0	DEAD
minion-docs.yaml	0	DEAD
minion-one-shot.yaml	0	DEAD
minion-refactor.yaml	0	DEAD
minion-security-fix.yaml	0	DEAD
plan-build-review.yaml	0	DEAD
plan-build.yaml	~1 (plan-build-test skill ref)	BORDERLINE
plan-review-plan.yaml	0	DEAD
scout-flow.yaml	0	DEAD
securion-security-review.yaml	0	DEAD

Note: The skill-*.yaml chains in the chains/ dir are not invoked by name in skills/. They appear to be template definitions, not live dispatch chains. Chains are not invoked via a chain runner — skills embed agents directly via agent: field inline. The chain YAML format appears to be an aspirational DAG definition language that has no runtime executor wired up.

Effectively ALL 35 chain YAMLs are dead — there is no chain runner in the skill system. Skills call agents directly, not via chain files.

4. Dual-Store Consistency

Files in both ~/.claude/agents/ and ~/system/agents/definitions/

48 files exist in both stores. ALL 48 are byte-for-byte SYNCED (diff returned empty for every shared file). The sync script at ~/bin/agent-definitions-sync.sh is working correctly for the files it covers.

Sync gaps

16 files ONLY in ~/.claude/agents/ (not in definitions/) — not covered by sync:

baseline-comparator.md
claude-code-guide.md
devils-advocate.md
dr-sarah-chen.md
dzevad-jahic.md
evidence-verifier.md
Explore.md
fix-builder.md
indy-dandev.md
jake-wharton.md
maria-santos.md
mehanik.md
Plan.md
redzo-reviewer.md
thaer-sabri.md
verifier.md

8 files ONLY in definitions/ (not synced to ~/.claude/agents/) — these agents are UNREACHABLE by Claude Code:

dorota-huizinga.md    ← mapped in specialist-mapping.json, should be in ~/.claude/agents/
hadi-hariri.md        ← mapped in specialist-mapping.json, should be in ~/.claude/agents/
james-bach.md         ← mapped in specialist-mapping.json, should be in ~/.claude/agents/
lee-robinson.md       ← mapped in specialist-mapping.json, should be in ~/.claude/agents/
lisa-crispin.md       ← mapped in specialist-mapping.json, should be in ~/.claude/agents/
minion.md             ← referenced in 5 chain yaml steps, unreachable
sentry-code-simplifier.md  ← not in mapping, not in chains
sp-code-reviewer.md   ← not in mapping, not in chains

The first 5 are mapped and therefore expected to be dispatched — they cannot be. Any dispatch attempt for dorota-huizinga, hadi-hariri, james-bach, lee-robinson, or lisa-crispin will silently fail or fall back.
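
The dual-store comparison in this section (synced, only in ~/.claude/agents/, only in definitions/) can be reproduced with the standard library. This is a sketch of the comparison only, not of ~/bin/agent-definitions-sync.sh, whose contents are not shown here; `compare_stores` is a hypothetical helper.

```python
import filecmp
from pathlib import Path


def compare_stores(live_dir, definitions_dir):
    """Compare the *.md files of two agent stores by name and by content."""
    live_dir, definitions_dir = Path(live_dir), Path(definitions_dir)
    live = {p.name for p in live_dir.glob("*.md")}
    defs = {p.name for p in definitions_dir.glob("*.md")}

    synced, drifted = [], []
    for name in sorted(live & defs):
        # shallow=False forces a byte-for-byte comparison.
        same = filecmp.cmp(live_dir / name, definitions_dir / name, shallow=False)
        (synced if same else drifted).append(name)

    return {
        "synced": synced,
        "drifted": drifted,
        "only_live": sorted(live - defs),
        "only_definitions": sorted(defs - live),
    }
```

On the state described above, the result would be expected to show 48 synced files, no drifted files, 16 names under only_live and 8 under only_definitions.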

5. Skill → Agent Linkage

Sample of 10 skills with agent dispatch analysis:

Skill	Agent referenced	Agent in ~/.claude/agents/?	In specialist-mapping.json?
hop-build	No sub-agent dispatch (marker-only skill)	N/A	N/A
build	`builder` (3 parallel), `rag-context-for-builder.js` (tool)	YES	YES
code-review	`code-reviewer`, `securion` sub-agent, `sentinel-architect`	YES (all three)	Only sentinel-architect (code-reviewer is unmapped; securion is an unmapped dispatcher)
debugging	No agent dispatch found in instructions	N/A	N/A
deploy-verify	No agent (runs Playwright directly)	N/A	N/A
design-system	No agent dispatch	N/A	N/A
doc-coauthoring	No named agent dispatch	N/A	N/A
fiken-agent	Self-referential meta-skill; dispatches sub-task SKILL.md files	Indirect	N/A
financial-overview	No agent dispatch found	N/A	N/A
incident-response	References `securion` agent (remediation)	securion.md YES (unmapped dispatcher)	NO

Flags:

code-review skill dispatches code-reviewer (unmapped, 44 skill refs) and securion (unmapped company dispatcher) directly by name
incident-response references securion as a response agent — but securion.md is NOT in specialist-mapping.json (only individual members are mapped)
validator is the most-used agent (44 skill files, 22 chain steps) with NO mapping entry

Open Questions

Chain runner: Is there a chain executor anywhere in the system (~/system/tools/, ~/projects/, pi-orchestrator)? If not, the entire chains/ directory is documentation-only, not executable automation.
planner agent: Referenced in 3 chains (plan-build, plan-build-review, plan-review-plan) but does not exist on disk anywhere. Was it renamed to distiller or mehanik?
Axiom, Datavera, Resolver: Three fully-formed virtual companies with persona dirs, README, CLAUDE.md, 5-8 internal agents each — but zero presence in specialist-mapping.json. Are these active companies being used via direct session invocation (not via John routing)?
anthropic-chief-architect.md: Mapped in specialist-mapping.json, absent from both ~/.claude/agents/ AND definitions/. Was this agent removed intentionally or is it a sync failure?
company.json owner=N/A: All 12 companies have no human owner. Is there a separate ownership registry, or is this a gap in accountability chain?
Lexicon vs Skillforge naming: CLAUDE.md routing table names the company "Lexicon" and lists "Dževad Jahić" as its agent. specialist-mapping.json has skillforge.md mapping to company "Skillforge". These are two different names for what appears to be the same documentation company. Which is canonical?
~/.claude/agents/*.md priority: Claude Code loads subagents from ~/.claude/agents/. The definitions/ store is a backup. But 8 agents (5 of them mapped) live only in definitions/ and are therefore unreachable. Is ~/bin/agent-definitions-sync.sh being run on any schedule?

Architectural Concerns (no auto-fix)

A. Mapping covers only 29 of 66 agents (44%) — the layer is too thin to be a reliable routing table.
The specialist-mapping.json is supposed to be John's source of truth for "who builds this?" routing. But the two highest-usage agents in the entire system (validator with 44 skill refs, distiller with 21 chain refs) are absent. Routing decisions based on this file are structurally incomplete.

B. 7 mapped agents unreachable at runtime.
Agents marked as mapped (specialist-mapping.json claims them) but missing from ~/.claude/agents/ will fail silently when dispatched. The mapping implies reachability but does not enforce it. No health check validates the mapping → disk correspondence.
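
A health check for the mapping → disk correspondence could be as small as the sketch below. It assumes the keys of the `mappings` object are agent file names such as builder.md (the audit names the key but does not show the value layout), and `unreachable_mapped_agents` is a hypothetical helper.

```python
import json
from pathlib import Path


def unreachable_mapped_agents(mapping_file, agents_dir):
    """Return mapped agent files that do not exist in the agents directory.

    Assumption: the keys of the top-level "mappings" object are file
    names (for example "builder.md").
    """
    mapping = json.loads(Path(mapping_file).read_text(encoding="utf-8"))
    mapped = mapping.get("mappings", {})
    agents_dir = Path(agents_dir)
    return sorted(name for name in mapped if not (agents_dir / name).is_file())
```

A non-empty result is the condition this concern describes; wiring the check into a scheduled job or a pre-dispatch gate would turn the silent failure into a visible one.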

C. The chain YAML layer has no executor.
35 chain YAML files define multi-step agent pipelines, but skills invoke agents directly by name — not via the chain files. The chains/ directory is a documentation artifact, not live infrastructure. All automation currently runs through inline skill → agent calls. This creates a documentation drift risk: chain files will diverge from actual behavior with no mechanism to detect it.

D. 4 virtual companies are phantom — infrastructure without routing.
Axiom, Datavera, Resolver, Lexicon each have: persona dir, README, CLAUDE.md, company.json, 5-9 internal agents. None appear in specialist-mapping.json or John's routing table. They consume disk and cognitive space but cannot be dispatched through the normal John → discover.js → specialist route. Direct session invocation (naming the company in a prompt) is the only access path — undocumented and unreliable.

E. Dual-store sync is manual and partial.
16 agents exist only in ~/.claude/agents/ (single source of truth but no backup). 8 agents exist only in definitions/ (backed up but unreachable). The sync script does not auto-run; it must be manually invoked. This creates continuous drift pressure.

F. planner is a phantom agent in chain definitions.
Three chains reference an agent named planner that has no .md file anywhere on disk. Because no chain runner is wired up (see C), nothing fails today. If these chains were ever executed, the planner steps would fail, and the mapping layer would raise no error beforehand.

G. No machine-readable owner for any virtual company.
company.json owner: N/A across all 12 companies means there is no way to auto-route escalation, billing, or accountability. This is a governance gap, not a code gap.

Inventory: Daemon Fleet

AI Factory Daemon Fleet Audit — 2026-05-09

Auditor: kelsey-hightower
Timestamp: 2026-05-09T20:48 UTC
Source of truth: launchctl list + daemon-fleet-status.json (generated 2026-05-09T18:33:52Z) + plist reads + error log sampling
Fleet size (watchdog): 148 tracked entries | 47 running keepalive | 74 calendar_ok | 3 down | 20 erroring
Fleet size (launchctl live): 168 rows matching alai/john/no.alai pattern (includes daemons not in watchdog)

1. Live Exit-Code Matrix

Column key: PID (- = not running) | Last Exit | Plist location | KeepAlive policy | Schedule

1a. RUNNING (PID alive; last exit 0, -15/SIGTERM, or a previous non-zero exit noted in the Exit column)

Daemon	PID	Exit	Plist Path	KeepAlive	Schedule
com.alai.agent-timeout-monitor	1163	0	system/daemons/launchagents	always	continuous
com.alai.cc-api-server	1183	0	system/daemons/launchagents	always	continuous
com.alai.credit-monitor	1223	0	system/daemons/launchagents	always	continuous
com.alai.idle-learning-daemon	1196	0	system/daemons/launchagents	always	continuous
com.alai.litestream	51452	0	Library/LaunchAgents	always	continuous
com.alai.mem0-server	65706	-15 (SIGTERM)	Library/LaunchAgents	always	continuous
com.alai.mlx-gemma4	27321	0	(not in known dirs)	always	continuous
com.alai.mlx-qwen25-coder-32b	31120	0	(not in known dirs)	always	continuous
com.alai.mlx-qwen3-32b	29227	0	(not in known dirs)	always	continuous
com.alai.mlx-qwen3-8b	29488	0	(not in known dirs)	always	continuous
com.alai.ollama-serve-v2	29100	0	system/daemons/launchagents	always	continuous
com.alai.orchestrator-bridge	1185	0	system/daemons/launchagents	always	continuous
com.alai.ram-monitor	1241	0	system/daemons/launchagents	always	continuous
com.alai.task-router	1200	0	system/daemons/launchagents	always	continuous
com.alai.web-learning	1176	0	system/daemons/launchagents	always	continuous
com.john.bookstack-webhook-relay	1206	0	system/daemons/launchagents	always	continuous
com.john.browser-worker	1211	0	system/daemons/launchagents	always	continuous
com.john.caddy-vault	86082	0	system/daemons/launchagents	always	continuous
com.john.cloudflared	79617	0	system/daemons/launchagents	always	continuous
com.john.comms-agent	1186	0	system/daemons/launchagents	always	continuous
com.john.documenso-webhook	20561	0	system/daemons/launchagents	always	continuous
com.john.durable-executor	1212	0	system/daemons/launchagents	always	continuous
com.john.edita-loop	61758	0	system/daemons/launchagents	always	continuous
com.john.email-agent	92225	0	system/daemons/launchagents	calendar	calendar
com.john.email-tracker	11292	0	system/daemons/launchagents	conditional	conditional
com.john.event-dispatcher	65452	0	system/daemons/launchagents	always	continuous
com.john.health-dashboard	1189	0	system/daemons/launchagents	always	continuous
com.john.hook-daemon	1240	0	system/daemons/launchagents	always	continuous
com.john.intake-watcher	41929	0	system/daemons/launchagents	always	continuous
com.john.kenan-hot-web	1231	0	system/daemons/launchagents	always	continuous
com.john.llm-datasette	1170	0	system/daemons/launchagents	always	continuous
com.john.mc-dashboard	65673	0	system/daemons/launchagents	always	continuous
com.john.n8n	1203	0	system/daemons/launchagents	always	continuous
com.john.network-watchdog	1194	0	system/daemons/launchagents	always	continuous
com.john.ops-watchdog	8782	-15 (SIGTERM)	system/daemons/launchagents	always	continuous
com.john.outbox-processor	1190	0	system/daemons/launchagents	always	continuous
com.john.paste-logger	1224	0	system/daemons/launchagents	always	continuous
com.john.pi-orchestrator	75750	0	system/daemons/launchagents	always	continuous
com.john.slack-bot	18046	1 (last crash exit)	system/daemons/launchagents	always	continuous
com.john.tender-dashboard	1234	0	system/daemons/launchagents	always	continuous
com.john.tool-shed	1191	0	system/daemons/launchagents	always	continuous
com.john.vault-keeper	87005	0	system/daemons/launchagents	always	continuous
com.john.vault-proxy	1222	0	system/daemons/launchagents	always	continuous
com.john.youtube-nightly-learning	83439	0	system/daemons/launchagents	always	continuous
no.alai.claude-proxy	6361	0	Library/LaunchAgents	always	continuous
com.alai.rag-drain-worker	3640	1 (prev exit)	system/config/launchagents	always	continuous
com.alai.rag-fsevents-adapter	64755	1 (prev exit)	system/config/launchagents	conditional	WatchPaths
com.alai.daemon-fleet-watchdog	2815	0	Library/LaunchAgents	calendar	every 15min

1b. DOWN — Exit 0 (intentional one-shot or conditional)

Daemon	PID	Notes
com.john.autocoder-ui	-	down_exit_0: one-shot complete
com.john.draft-sender	-	down_exit_0: conditional, no pending drafts
com.john.orchestrator-http	-	down_exit_0: DUPLICATE — orchestrator-bridge runs same script on port 3052

1c. CALENDAR SCHEDULED — Exit 0 last run (healthy)

These fired successfully on last scheduled run. Not exhaustively listed — watchdog confirms 74 in this state.
Key members: com.alai.apply-knowledge, com.alai.archive-first-scan, com.alai.chain-weekly-report, com.alai.docker-watchdog, com.alai.gcloud-auth, com.alai.john-daily-digest, com.alai.lightrag-backup, com.alai.memory-watchdog, com.alai.meta-agent-loop, com.alai.restore-drill, com.alai.skill-audit, com.alai.team-sync, com.alai.wal-checkpoint, com.alai.weekly-planning, com.alai.zombie-cleanup, com.john.agentforge, com.john.bookstack-sync, com.john.calendar-bridge, com.john.critical-tools-healthcheck, com.john.daemon-health, com.john.db-archival-sweep, com.john.db-backup, com.john.domain-audit, com.john.drift-detector, com.john.email-briefing, com.john.forge-watchdog, com.john.log-rotate, com.john.mc-session-worker, com.john.morning-routine, com.john.offsite-backup, com.john.pi2-override-audit, com.john.review-drain, com.john.session-archiver, com.john.session-extractor, com.john.spam-recovery-scan, com.john.system-guardian, com.john.tldr-actionizer, com.john.tldr-briefing, com.john.tldr-watch, com.john.tldr-weekly-synthesis, com.john.weekly-synthesis, no.alai.email-body-integrity, no.alai.meta-agent, no.alai.resolver, no.alai.spend-guard.

1d. FAILING — Non-zero exit codes

Daemon	PID	Exit Code	Plist Location	KeepAlive	Schedule
com.alai.azure-db-backup	-	1 (exit 256 internal)	system/config/launchagents	none (RunAtLoad=false)	every 4h
com.alai.blueprint-fleet-watchdog	-	1 (exit 256)	Library/LaunchAgents	none	daily 06:15
com.alai.cert-expiry-monitor	-	1 (exit 256)	system/config/launchagents	none	daily 07:00
com.alai.chain-daily-inbox	-	1 (exit 256)	Library/LaunchAgents	none	daily 07:00
com.alai.chain-e2e-nightly	-	1 (exit 256)	Library/LaunchAgents	none	daily 02:00
com.alai.chain-phantom-detector	-	1 (exit 256)	Library/LaunchAgents	none	every 15min
com.alai.cost-daily-report	-	127	Library/LaunchAgents	none	daily 23:55
com.alai.daily-planning	-	127	Library/LaunchAgents	none	daily 07:30
com.alai.filesystem-audit	-	1 (exit 256)	Library/LaunchAgents	none	Monday 08:00
com.alai.pi-orch-health	-	127	Library/LaunchAgents	none	daily 23:00
com.alai.rag-bookstack-adapter	-	1 (exit 256)	system/config/launchagents	none	every 5min
com.alai.rag-drain-worker	3640	1 (prev exit, now running)	system/config/launchagents	always	continuous
com.alai.rag-fsevents-adapter	64755	1 (prev exit, now running)	system/config/launchagents	conditional	WatchPaths
com.alai.rag-mc-adapter	-	1 (exit 256)	system/config/launchagents	none	every 5min
com.alai.rdap-audit-quarterly	-	2	Library/LaunchAgents	none	quarterly
com.john.alaiml-retrain	-	1 (exit 256)	system/config/launchagents + Library/LaunchAgents	none	1st of month 03:00
com.john.auto-verify-regression	-	1 (exit 256)	system/daemons/launchagents	none	daily 06:00
com.john.b2-offsite-backup	-	1 (exit 256)	system/daemons/launchagents	none	daily 03:30
com.john.bookstack-staleness	-	1 (exit 256)	system/daemons/launchagents	none	Sunday 22:00
com.john.infra-drift-detector	-	1 (exit 256)	system/daemons/launchagents	none	Sunday 04:00
com.john.legal-docs-azure-sync	-	127	Library/LaunchAgents	Crashed=true	daily 02:00
com.john.lightrag-monitor	-	2	system/config/launchagents	none	daily 09:00
com.john.mcp-health-check	-	127	Library/LaunchAgents	Crashed=true	every 1h
com.john.slack-bot	18046	1 (last crash)	system/daemons/launchagents	always	continuous

1e. NOT LOADED (watchdog knows them, launchctl does not)

Daemon	State
com.alai.lightrag-migrate-pump	not_loaded
com.alai.lightrag-outbox-ingest	not_loaded
com.alai.lightrag-watchdog	not_loaded
com.john.rdap-audit-quarterly	not_loaded

2. Failure Cohort — Root Cause Analysis

EXIT 127 — Script/binary not found (BROKEN — script deleted)

These five daemons have plists in Library/LaunchAgents pointing to scripts that no longer exist on disk. Exit 127 is bash's "command not found" — the script path itself is gone.

Daemon	Missing Script	Last Successful Run	Category
com.alai.pi-orch-health	`~/system/tools/pi-orch-health.sh`	2026-05-06 (verdict: CRITICAL)	BROKEN
com.alai.cost-daily-report	`~/system/tools/cost-daily-report.sh`	2026-04-29	BROKEN
com.alai.daily-planning	`~/system/tools/daily-planning.sh`	unknown	BROKEN
com.john.legal-docs-azure-sync	`~/system/daemons/legal-docs-azure-sync.sh`	unknown	BROKEN
com.john.mcp-health-check	`~/system/tools/mcp-health-check.sh`	unknown	BROKEN

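Note on legal-docs-azure-sync and mcp-health-check: Both set KeepAlive.Crashed=true, which asks launchd to restart the job after a crash. If launchd treats their exit 127 as a crash, they sit in a guaranteed (throttled) restart loop that wastes process spawns indefinitely. launchd documents Crashed as covering signal-induced exits, so confirm the loop in the launchd log before relying on this.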
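
Plists that point at deleted scripts can be found before launchd runs them. The sketch below reads each plist with the standard-library plistlib and checks the path-like entries of Program and ProgramArguments. `missing_program_targets` is a hypothetical helper, and treating every argument that starts with "/" or "~" as a file path is a simplifying assumption that can produce false positives.

```python
import plistlib
from pathlib import Path
from xml.parsers.expat import ExpatError


def missing_program_targets(plist_dir):
    """Map each job label to its path-like program arguments missing on disk."""
    broken = {}
    for plist_path in sorted(Path(plist_dir).glob("*.plist")):
        try:
            with plist_path.open("rb") as handle:
                job = plistlib.load(handle)
        except (OSError, ExpatError, plistlib.InvalidFileException):
            broken[plist_path.name] = ["<unreadable plist>"]
            continue
        if not isinstance(job, dict):
            broken[plist_path.name] = ["<unexpected plist root>"]
            continue

        candidates = []
        if "Program" in job:
            candidates.append(job["Program"])
        candidates.extend(job.get("ProgramArguments", []))

        missing = [
            arg for arg in candidates
            if isinstance(arg, str)
            and arg.startswith(("/", "~"))          # heuristic: looks like a path
            and not Path(arg).expanduser().exists()
        ]
        if missing:
            broken[job.get("Label", plist_path.name)] = missing
    return broken
```

Pointing it at the Library/LaunchAgents directory would be expected to list the five exit-127 daemons above together with their missing script paths.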

EXIT 1 / 256 — Script exists but fails at runtime (BROKEN — dependency missing)

Daemon	Script	Root Cause	Category
com.alai.rag-bookstack-adapter	`rag-bookstack-adapter.js`	Queue depth 946 > 500 backpressure gate — never drains because drain-worker cannot reach LightRAG	BROKEN (cascade)
com.alai.rag-drain-worker	`rag-drain-worker.js`	Vaultwarden ETIMEDOUT → CF credentials unavailable → LightRAG unreachable	BROKEN
com.alai.rag-mc-adapter	`rag-mc-adapter.js`	Same backpressure cascade, queue depth 946	BROKEN (cascade)
com.alai.rag-fsevents-adapter	`rag-fsevents-adapter.js`	Queue depth >500 backpressure, runs but skips all enqueues	BROKEN (cascade)
com.alai.azure-db-backup	`azure-db-backup.sh`	`az storage blob upload` SIGTERM'd (line 116); temp dirs leaked in /tmp	TRANSIENT
com.alai.cert-expiry-monitor	`cert-expiry-monitor.sh`	Script exists, no error log found — likely network/curl failure	TRANSIENT
com.alai.chain-daily-inbox	`chain-runner.sh --enqueue daily-inbox-triage`	chain-runner.sh exists; failure likely in downstream chain execution	TRANSIENT
com.alai.chain-e2e-nightly	`chain-e2e-nightly.sh`	Script exists; likely Playwright/network dependency failure	TRANSIENT
com.alai.chain-phantom-detector	`phantom-link-detector.js`	Script does NOT exist on disk — MISSING	BROKEN
com.alai.filesystem-audit	`~/bin/anvil-audit.sh`	Script exists; last exit 256 may be diff/rename limit warning elevated to exit	TRANSIENT
com.alai.blueprint-fleet-watchdog	`~/system/daemons/blueprint-fleet-watchdog.js`	Script exists; likely a missing dep or API auth failure	TRANSIENT
com.john.alaiml-retrain	`~/ALAI/internal/projects/alaiML/scripts/retrain.sh`	Script exists; DUPLICATE plist (both config and Library/LaunchAgents); likely venv path or MC dep failure	BROKEN (duplicate)
com.john.auto-verify-regression	`auto-verify-regression.js`	Script exists; calls `claim-verifier.js` — probable missing dep or API failure	TRANSIENT
com.john.b2-offsite-backup	`b2-offsite-backup.sh`	B2 storage cap EXCEEDED (403 storage_cap_exceeded) and auth token limit errors	BROKEN (infra)
com.john.bookstack-staleness	`bookstack-staleness.js`	API parse error "Unexpected end of JSON input" on page 2553+ — BookStack API truncating responses	BROKEN
com.john.infra-drift-detector	`infra-drift-detector.sh`	`diff.renameLimit` warning elevated to non-zero exit; git rename detection failing on large repos	TRANSIENT
com.john.slack-bot	`(node process)`	WebSocket pong timeouts (ETIMEDOUT); process alive and heartbeating, but launchd saw a crash exit	TRANSIENT
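
The exit values in these tables mix three notations: plain exit codes (1, 2, 127), negative signal numbers (-15), and 256. The sketch below normalizes them, assuming that values above 255 are raw POSIX wait statuses in which the exit code sits in the high byte (so 256 decodes to exit code 1). `describe_exit` is a hypothetical helper.

```python
import signal


def describe_exit(status):
    """Turn a launchctl-style exit value into a short description."""
    if status < 0:
        # Negative values denote termination by a signal.
        try:
            name = signal.Signals(-status).name
        except ValueError:
            name = f"signal {-status}"
        return f"terminated by {name}"

    code = status
    if status > 255:
        # Assumed raw wait status: the exit code is stored in the high byte.
        code = (status >> 8) & 0xFF

    if code == 0:
        return "success"
    if code == 127:
        return "command not found (exit 127)"
    return f"error (exit {code})"


for value in (0, 256, 127, -15, 2):
    print(value, "->", describe_exit(value))
```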

EXIT 2 — Logic/health failure

Daemon	Script	Root Cause	Category
com.alai.rdap-audit-quarterly	plist not found in known dirs	Script path unknown, likely MISSING	BROKEN
com.john.lightrag-monitor	`lightrag-health-with-alert.sh`	Script exits 1/2 when LightRAG is degraded. This is INTENTIONAL ALERTING behavior, and LightRAG is in fact degraded	EXPECTED (alarm correctly firing)

3. Producer-Consumer Wiring

RAG Ingest Pipeline (currently DEADLOCKED)

com.alai.rag-fsevents-adapter   watches ~/system/evidence, ~/system/specs, ~/system/rules
com.alai.rag-bookstack-adapter  polls BookStack API every 5min
com.alai.rag-mc-adapter         reads ~/system/logs/mc-task-outcomes.jsonl
  --> all three WRITE to ~/system/state/ingest-queue.sqlite (queue depth: 946, frozen)

com.alai.rag-drain-worker (keepalive) reads ingest-queue.sqlite
  --> attempts POST to https://lightrag.basicconsulting.no (via CF Access)
  --> CF credentials lookup: Vaultwarden ETIMEDOUT (bw-session stale or vault unreachable)
  --> LightRAG unreachable → queue never drains → backpressure locks all three producers

ORPHAN OUTPUT: ~/system/metrics/ingest_pipeline.prom written by rag-drain-worker
  --> nothing confirmed reading this file (no Prometheus scrape config found in audit)

This is the single most critical broken pipeline in the factory. 946 items queued, zero being processed.
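
The lock-up described above follows directly from a depth-gated queue whose only consumer cannot deliver. The sketch below reproduces the mechanism with an in-memory SQLite database. The table name `ingest_queue`, its columns, and the function names are hypothetical (the real schema of ingest-queue.sqlite is not shown in this audit); only the 500-item gate comes from the findings.

```python
import sqlite3

BACKPRESSURE_LIMIT = 500  # gate value reported in the audit


def open_queue(path=":memory:"):
    """Open the queue database and create the (hypothetical) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ingest_queue ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    return conn


def queue_depth(conn):
    return conn.execute("SELECT COUNT(*) FROM ingest_queue").fetchone()[0]


def try_enqueue(conn, payload, limit=BACKPRESSURE_LIMIT):
    """Producer side: skip the write while the queue is above the limit."""
    if queue_depth(conn) > limit:
        return False  # backpressure: caller drops the item or retries later
    conn.execute("INSERT INTO ingest_queue (payload) VALUES (?)", (payload,))
    conn.commit()
    return True


def drain_one(conn, deliver):
    """Consumer side: remove the oldest item only after delivery succeeds."""
    row = conn.execute(
        "SELECT id, payload FROM ingest_queue ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return False
    if not deliver(row[1]):
        return False  # delivery failed, item stays queued
    conn.execute("DELETE FROM ingest_queue WHERE id = ?", (row[0],))
    conn.commit()
    return True
```

While `deliver` keeps failing (here, the unreachable LightRAG endpoint), the depth never drops below the gate and every producer call returns False, which matches the frozen queue reported above.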

Memory / Knowledge Layer

com.alai.mem0-server (PID 65706, keepalive)
  reads/writes: http://localhost:6333 (Qdrant vector store)
  produces: REST API on localhost:9000 (port cslistener)
  consumed by: discover.js, agent tools calling /v1/memories
  STATUS: alive and healthy (health 200, Qdrant 200)
  NOTE: exit -15 (SIGTERM) in launchctl = prior graceful restart; current run is clean

com.alai.litestream (PID 51452, keepalive)
  reads: SQLite DBs in ~/system/state/ (flywheel.db, health-events.db, etc.)
  writes: B2 bucket alai-studio-backup (replication stream)
  STATUS: running but b2-offsite-backup.sh (separate) hitting B2 storage cap

com.alai.wal-checkpoint (calendar, exit 0)
  reads/writes: SQLite WAL files in ~/system/state/
  consumed by: litestream (clean WAL = cleaner replication)

Orchestration Kernel

com.john.pi-orchestrator (PID 75750, keepalive)
  reads: Planka MC API (boards.basicconsulting.no per mock config)
  writes: ~/system/logs/pi-orchestrator/daemon-*.log
  STATUS: running, cycling every 30s, "No eligible tasks" — running in MOCK MODE
  NOTE: alai-config-mock.json loaded; real config resolver likely not resolving

com.alai.orchestrator-bridge (PID 1185, keepalive)
  runs: orchestrator-http-server.js on port 3052
  produces: HTTP API for triggering orchestrator actions
  STATUS: running healthy

com.john.orchestrator-http (down_exit_0)
  DUPLICATE of orchestrator-bridge — same script, same port (3052)
  Watchdog says down_exit_0: port already bound by bridge when this tried to start
  ORPHAN: plist in Library/LaunchAgents, shadow of orchestrator-bridge

Backup Layer

com.john.b2-offsite-backup (calendar, exit 1)
  reads: ~/system/state/ SQLite snapshots
  writes: B2 bucket alai-studio-backup
  STATUS: BLOCKED — B2 storage cap exceeded (403)

com.alai.azure-db-backup (calendar, exit 1)
  reads: Azure SQL databases (via az CLI)
  writes: Azure Blob Storage (upload performed by ~/system/daemons/azure-db-backup.sh)
  STATUS: TRANSIENT failures, az upload SIGTERM'd (timeout in script or process kill)
  ORPHAN TEMP: /tmp/az-backup-* directories leaking (rm fails on non-empty dirs)

Comms / Slack

com.john.slack-bot (PID 18046, keepalive)
  reads: Slack WebSocket (socket-mode)
  writes: Slack messages, ~/system/logs/slack-bot.log
  STATUS: alive, heartbeating, WebSocket reconnects successfully (~once per session)
  CONCERN: 300min silent (no incoming Slack messages received in 5h as of audit time)

no.alai.email-body-integrity (calendar, exit 0)
  reads: IMAP one.com (email body verification)
  writes: ~/system/logs/email-integrity.log
  STATUS: healthy last run

Monitoring / Health

com.john.lightrag-monitor (calendar, exit 2)
  reads: LightRAG API health endpoint
  writes: /tmp/lightrag-task-context.json, ~/system/evidence/lightrag-health-*.md
  STATUS: correctly reporting LightRAG as degraded; Slack alert delivery ALSO failing
  ORPHAN OUTPUT: lightrag-health-*.md files accumulating in ~/system/evidence/
    (rag-fsevents-adapter trying to enqueue these — but queue full — circular feedback)

com.alai.daemon-fleet-watchdog (PID 2815, every 15min)
  reads: launchctl list, all plist dirs
  writes: ~/system/state/daemon-fleet-status.json
  STATUS: healthy, data current as of 18:33:52Z today

com.alai.pi-orch-health (calendar, exit 127)
  was: reads pi-orchestrator state, writes ~/system/state/pi-orch-health-*.json
  STATUS: BROKEN — script deleted. Last known verdict (2026-05-06): CRITICAL

MLX / Inference Layer

com.alai.mlx-gemma4 (PID 27321)
com.alai.mlx-qwen3-32b (PID 29227)
com.alai.mlx-qwen3-8b (PID 29488)
com.alai.mlx-qwen25-coder-32b (PID 31120)
com.alai.ollama-serve-v2 (PID 29100)
  STATUS: all running (keepalive), exit 0
  PRODUCES: inference endpoints on ANVIL (local)
  Note: plists not found in audited dirs — loaded from unknown location (possibly ~/Library/LaunchAgents subdirs)

4. Critical-Path Daemon Assessment

com.john.pi-orchestrator

PID: 75750 | Exit: 0 | Status: RUNNING
Healthy? Process is alive and cycling every 30s. However, it is running in MOCK MODE (alai-config-mock.json). The config resolver is not resolving real service URLs (Planka localhost:3100 is not listening per MEMORY.md). "No eligible tasks" every cycle.
Produces: Cycle logs to ~/system/logs/pi-orchestrator/daemon-stdout.log
Consumes: MC/Planka API (currently mocked, not reaching real board)
Verdict: Process alive but effectively IDLE. Not orchestrating anything. Mock mode = silent failure.

com.alai.pi-orch-health

PID: - | Exit: 127 | Status: BROKEN
Root cause: ~/system/tools/pi-orch-health.sh was deleted. Script ran last on 2026-05-06 with verdict CRITICAL. Now permanently broken until script is restored.
Produces: ~/system/state/pi-orch-health-*.json (last written 2026-05-06)
Verdict: BROKEN — monitoring of the orchestrator kernel has gone dark.

com.alai.mem0-server

PID: 65706 | Exit: -15 (prior SIGTERM) | Status: ALIVE AND HEALTHY
Root cause of -15: launchctl records the exit code of the previous run; the current process (PID 65706) started clean. SIGTERM was a graceful restart, not a crash.
Evidence: Port 9000 listening (lsof confirmed), /health returns 200, Qdrant at localhost:6333 returns 200.
Note: /v1/memories returning 404 — API route may have changed or not yet initialized.
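Verdict: ALIVE. Exit -15 is misleading; the current instance is healthy at the process and /health level. The /v1/memories 404 remains unexplained and matters because the listed consumers (discover.js, agent tools) call that route.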

com.john.lightrag-monitor

PID: - | Exit: 2 | Status: EXPECTED ALARM
Root cause: Script correctly exits non-zero when LightRAG is degraded. LightRAG IS degraded (drain-worker cannot reach it due to missing CF credentials). Slack alert also failing (alert delivery broken).
Produces: ~/system/evidence/lightrag-health-*.md, /tmp/lightrag-task-context.json
Verdict: Monitor itself is working correctly. The degradation it reports is real and severe.

com.alai.lightrag-keepwarm

PID: - | Exit: 0 | Status: calendar_ok
Plist location: ~/Library/LaunchAgents/com.alai.lightrag-keepwarm.plist
Schedule: unknown (plist content not captured in this audit — found late)
Produces: Keepwarm pings to LightRAG
Verdict: Last run exited 0. Likely the keepwarm pings succeed against the local endpoint even while drain-worker cannot auth through CF Access. Not broken.

com.alai.archive-first-scan

PID: - | Exit: 0 | Status: calendar_ok | Schedule: daily 06:00
Script: ~/bin/archive-first-scan.sh — EXISTS
Produces: /tmp/archive-first-scan-report-<date>.txt, writes to ~/system/state/archive-first-ledger.jsonl
Consumes: Filesystem scan of unarchived candidates
Verdict: HEALTHY. Running as designed.

com.john.session-archiver

PID: - | Exit: 0 | Status: calendar_ok | Schedule: daily 03:00
Script: ~/system/tools/session-archiver.js — EXISTS (10928 bytes, 2026-02-23)
Produces: Cleaned-up session artifacts
Consumes: Claude session logs/state
Verdict: HEALTHY. Last run clean.

com.alai.cost-daily-report

PID: - | Exit: 127 | Status: BROKEN | Schedule: daily 23:55
Root cause: ~/system/tools/cost-daily-report.sh deleted. Last successful run 2026-04-29.
Produces: ~/system/reports/cost-daily.md
Consumes: Cost tracker data
Verdict: BROKEN — daily cost visibility dark for 10 days.

com.alai.weekly-planning

PID: - | Exit: 0 | Status: calendar_ok | Schedule: Tuesday 08:00
Script: ~/system/tools/weekly-planning.sh — MISSING from disk
BUT watchdog says last exit was 0 and state is calendar_ok, which contradicts the missing script.
Likely explanation: the last scheduled Tuesday run completed before the script was deleted, and launchd has not triggered the job since. It will fail with exit 127 next Tuesday.
Verdict: TICKING TIME BOMB — will fail next Tuesday 08:00.

no.alai.email-body-integrity

PID: - | Exit: 0 | Status: calendar_ok | Schedule: daily 03:00
Script: ~/system/tools/email-body-integrity-check.js — EXISTS
Produces: ~/system/logs/email-integrity.log
Verdict: HEALTHY.

5. Daemon-Fleet-Watchdog State

File: ~/system/state/daemon-fleet-status.json
Generated: 2026-05-09T18:33:52Z (approx 2h15m before this audit)

Watchdog summary from file:

total:       148
running:      47 (keepalive processes alive)
calendar_ok:  74 (last scheduled run exit 0)
down:          3 (down_exit_0: autocoder-ui, draft-sender, orchestrator-http)
err:          20 (non-zero exit codes)

Watchdog accuracy notes:

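Watchdog correctly identifies 20 erroring daemons, but it reports exit codes as raw wait statuses rather than shell exit codes: the exit code sits in the high byte (256 = bash exit 1; 32512 = bash exit 127).
Watchdog does NOT cover all 168 launchctl rows; it tracks 148. The four state counts above sum to 144 (47 + 74 + 3 + 20), and the remaining 4 are the daemons marked not_loaded (lightrag-migrate-pump, lightrag-outbox-ingest, lightrag-watchdog, rdap-audit-quarterly).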
com.alai.mem0-server shows last_exit: 15 (SIGTERM of prior instance) but state: running — correct, the current instance is healthy.
com.john.slack-bot shows running/pid 18046 but last_exit: 256 — launchd records last crash before current keepalive restart. Process is currently alive.
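
The values 256, 32512 and 15 in these notes are consistent with raw POSIX wait statuses. That is an inference from the numbers, not something confirmed from the watchdog source. Under that assumption, a small decoder makes the status file easier to read:

```python
from typing import Tuple


def decode_wait_status(status: int) -> Tuple[str, int]:
    """Split a raw POSIX wait status into ("exit", code) or ("signal", number).

    Stopped and core-dump flags are ignored in this sketch.
    """
    signal_number = status & 0x7F  # low 7 bits: terminating signal, 0 if none
    if signal_number:
        return ("signal", signal_number)
    return ("exit", (status >> 8) & 0xFF)  # next 8 bits: exit code


for raw in (256, 32512, 15):
    print(raw, decode_wait_status(raw))
```

With this reading, 256 is exit 1, 32512 is exit 127 (command not found), and 15 is termination by signal 15 (SIGTERM), which agrees with the interpretations given for slack-bot and mem0-server.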

Open Questions

Pi-orchestrator mock mode: Why is alai-config-mock.json being loaded instead of real config? Is the Planka/MC API intentionally offline, or is the config resolver broken? The orchestrator is spinning idle.
LightRAG CF credentials: Vaultwarden ETIMEDOUT in rag-drain-worker. Is /tmp/bw-session stale? Is Vaultwarden (vault.basicconsulting.no) reachable? This single broken auth is deadlocking the entire RAG ingest pipeline (946 items queued).
B2 storage cap: 403 storage_cap_exceeded on Backblaze B2. Is this a billing cap that needs to be raised in the B2 console? Litestream is still replicating but the nightly snapshot job fails.
Five deleted scripts: Who deleted pi-orch-health.sh, cost-daily-report.sh, daily-planning.sh, legal-docs-azure-sync.sh, mcp-health-check.sh? Were they intentionally removed (deprecated)? If deprecated, the plists should be unloaded. If accidental deletion, restore from backup.
Duplicate alaiml-retrain plist: Plist exists in BOTH system/config/launchagents AND Library/LaunchAgents. Two crons would fire. Which is canonical?
com.john.orchestrator-http duplicate: Identical to com.alai.orchestrator-bridge (same script, same port). orchestrator-http shows down_exit_0 because bridge already bound the port. Dead plist.
LightRAG health-*.md circular feedback: The lightrag-monitor evidence files are being watched by rag-fsevents-adapter, which tries to enqueue them into LightRAG — a monitoring artifact feeding back into the broken pipeline it monitors.
Slack bot silent 300 min: No incoming Slack messages for 5h at audit time. Is anyone sending messages? Or is the Socket Mode token scope broken for receiving?

Highest-Leverage Fix Candidates (audit-level only)

Priority 1 — Unlocks entire RAG pipeline (946 items unblocked)

Fix rag-drain-worker CF Access credentials: ensure Vaultwarden item "LightRAG-CF-Access" exists and /tmp/bw-session is valid. One credential fix unblocks bookstack-adapter + mc-adapter + fsevents-adapter simultaneously.

Priority 2 — Restore cost visibility (10-day blind spot)

Restore or recreate ~/system/tools/cost-daily-report.sh. Last output was 2026-04-29. CEO-visible reporting dark for 10 days.

Priority 3 — Fix orchestrator mock mode

Determine why pi-orchestrator loads mock config. If Planka/MC API is down, restore it. If config resolver is broken, fix alai-config.js. The orchestration kernel is running but doing nothing.

Priority 4 — Raise B2 storage cap

B2 bucket alai-studio-backup has hit its cap. Nightly database snapshots are not landing. This is a billing action in the Backblaze console, not a code fix.

Priority 5 — Unload dead plists (5 scripts deleted)

com.alai.pi-orch-health, com.alai.cost-daily-report, com.alai.daily-planning, com.john.legal-docs-azure-sync, com.john.mcp-health-check should either have scripts restored or be unloaded from launchd. legal-docs-azure-sync and mcp-health-check have KeepAlive.Crashed=true creating infinite restart loops.

Priority 6 — Unload com.john.orchestrator-http duplicate plist

Dead shadow of orchestrator-bridge. Causes confusion in watchdog counts.

Priority 7 — Restore weekly-planning.sh before next Tuesday

Script missing but plist active. Will fail exit 127 at 08:00 next Tuesday.

Priority 8 — Fix phantom-link-detector.js missing script

com.alai.chain-phantom-detector runs every 15min calling a script that does not exist. High-frequency failure (96 times/day).
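
Several of the priorities above share one pattern: a plist is still loaded while the script it points to is gone. The following is a minimal sketch of a check for that pattern. It assumes the job is defined through the standard launchd `Program` or `ProgramArguments` keys and treats any argument starting with `/` or `~` as a path. The function name and the path heuristic are illustrative, not part of the existing watchdog.

```python
import os
import plistlib
from typing import Callable, List


def missing_program_paths(
    plist_bytes: bytes, exists: Callable[[str], bool] = os.path.exists
) -> List[str]:
    """Return path-like Program/ProgramArguments entries that are not on disk."""
    job = plistlib.loads(plist_bytes)
    args = list(job.get("ProgramArguments", []))
    if "Program" in job:
        args.insert(0, job["Program"])
    missing = []
    for arg in args:
        # Heuristic: only arguments that look like paths are checked.
        if isinstance(arg, str) and arg.startswith(("/", "~")):
            if not exists(os.path.expanduser(arg)):
                missing.append(arg)
    return missing
```

Run over every plist in the audited directories, a check like this would have flagged weekly-planning.sh before its next scheduled run instead of after the first exit 127.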

Verifier Autonomy Audit

AI Factory Audit — Plan Task 2.2: Verifier Autonomy

Date: 2026-05-09 Auditor: Martin Kleppmann (CodeCraft) Classification: AUDIT-ONLY — read-only, no mutation, no live invocation

VERDICT SUMMARY (up front)

Autonomy verdict: ABSENT

The /verify-fix-loop skill is fully specified and internally consistent, but it has zero wiring into any automated trigger path. CEO is the de-facto verifier for every task that reaches mc.js ready. The skill exists only as a manually-invoked slash command.

1. End-to-End Trace of `/verify-fix-loop`

Source: ~/.claude/skills/verify-fix-loop/SKILL.md

Flow map

Caller (John / human) invokes: /verify-fix-loop mc_id=<N> spec_path=<path>
    │
    ▼
SKILL orchestrates in main conversation thread (not a sub-agent itself)
    │
    ├─ mkdir -p /tmp/verify-fix-loop-<mc_id>/    (EVIDENCE_DIR)
    │
    ▼
LOOP (max 3 iterations):
    │
    ├─ Step A: Task(subagent_type=verifier OR general-purpose+persona)
    │     prompt = verifier brief template (inline in SKILL.md)
    │     verifier writes: EVIDENCE_DIR/verifier-loop<N>.md  (mandatory)
    │                       /tmp/verifier-feedback-<mc_id>.md (if CONFIDENCE=FEEDBACK)
    │
    ├─ Step B: Parse STATUS + CONFIDENCE from verifier output
    │
    ├─ Step C: Branch
    │     PERFECT / VERIFIED → write SUMMARY.md (SUCCESS), exit
    │     PARTIAL            → if high_stakes: ESCALATE; else: SUCCESS_WITH_NOTES, exit
    │     FAILED             → ESCALATE (harness broken)
    │     FEEDBACK:
    │         if high_stakes or budget exhausted → ESCALATE
    │         else →
    │
    ├─ Step D: Task(subagent_type=fix-builder OR general-purpose+persona)
    │     reads /tmp/verifier-feedback-<mc_id>.md
    │     applies prescribed edits to spec_path via Edit tool
    │     returns APPLIED:<N> / PARTIAL:<N>/<M> / COULD_NOT_APPLY:<reason>
    │
    └─ LOOP_INDEX += 1 → back to Step A

Domain escalation policy

docs, system, refactor, polish — loops up to MAX_LOOPS (default 3)
security, finance, legal, deploy, infra, unknown — ESCALATE on first FEEDBACK (no autonomous correction)

Loop budget

Default MAX_LOOPS = 3
Hard cost cap: $5 per skill invocation
Per-loop cost estimate: $0.40–0.60 (Sonnet)
Worst case: 3 × $0.60 = $1.80
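
The budget figures above can be checked with integer arithmetic in cents. This sketch uses only the numbers quoted from SKILL.md; the constant and function names are illustrative.

```python
MAX_LOOPS = 3
PER_LOOP_CENTS = (40, 60)  # low and high per-loop estimate
HARD_CAP_CENTS = 500       # $5 per skill invocation


def worst_case_cents(loops: int = MAX_LOOPS, per_loop: int = PER_LOOP_CENTS[1]) -> int:
    """Projected cost if every loop hits the high estimate."""
    return loops * per_loop


def loops_within_cap(per_loop: int, cap: int = HARD_CAP_CENTS) -> int:
    """Largest whole number of loops whose projected cost stays within the cap."""
    return cap // per_loop


print(worst_case_cents())                    # 180 cents, i.e. $1.80
print(loops_within_cap(PER_LOOP_CENTS[1]))   # 8 loops fit under $5 at $0.60 each
```

At the stated estimates the $5 cap cannot be reached within three loops. It would only bind if a single loop cost more than roughly $1.66, so the cap acts as a guard against mis-estimated loops rather than as the normal stopping condition.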

Termination conditions

CONFIDENCE in {PERFECT, VERIFIED} → SUCCESS
CONFIDENCE == PARTIAL + not high_stakes → SUCCESS_WITH_NOTES
Budget exhausted (LOOP_INDEX == MAX_LOOPS with FEEDBACK) → ESCALATE
High-stakes domain with FEEDBACK on first iteration → ESCALATE
Any FAILED confidence → ESCALATE (harness broken)
fix-builder returns COULD_NOT_APPLY → ESCALATE
MC status changes to done/cancelled mid-loop → ABORT silently
Cost estimate exceeds $5 → ESCALATE before next iter
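
The branch and termination rules above can be condensed into a single decision function. This is a sketch of the documented policy, not the skill's implementation (the skill is a prompt, not code). It assumes `loop_index` counts completed verifier runs starting at 1, treats any domain outside the four low-stakes ones as high-stakes, and omits the cost cap, the COULD_NOT_APPLY case and the mid-loop MC status check.

```python
LOW_STAKES_DOMAINS = {"docs", "system", "refactor", "polish"}


def next_action(confidence: str, domain: str, loop_index: int, max_loops: int = 3) -> str:
    """Map a verifier verdict to the loop's next step."""
    # Assumption: anything not explicitly low-stakes is handled like "unknown".
    high_stakes = domain not in LOW_STAKES_DOMAINS
    if confidence in ("PERFECT", "VERIFIED"):
        return "SUCCESS"
    if confidence == "PARTIAL":
        return "ESCALATE" if high_stakes else "SUCCESS_WITH_NOTES"
    if confidence == "FEEDBACK":
        if high_stakes or loop_index >= max_loops:
            return "ESCALATE"
        return "DISPATCH_FIX_BUILDER"
    # FAILED, or an unrecognised verdict, is treated as a broken harness.
    return "ESCALATE"
```

For example, `next_action("FEEDBACK", "docs", 1)` continues to fix-builder, while the same verdict for `security` escalates on the first iteration.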

Entry points (who can call this)

The SKILL.md lists trigger phrases: "verify-fix-loop", "auto-verify and fix", "verifier loop", "ne idi preko mene", "loop until pass". All trigger phrases are designed for human invocation in a conversation. No programmatic entry points exist.

2. Auto-Invocation Analysis — The Central CEO Question

pi-orchestrator.js

Grep result: ZERO matches for verify-fix-loop, verifier, fix-builder in ~/system/kernel/pi-orchestrator.js.

The orchestrator's post-completion flow (reportCompletion function, lines ~3781–3930) does:

Hallucination detection (regex-based detectHallucination)
Proof-of-work check (GOTCHA file or response length)
qa-19 Check #20 (endpoint verification, if configured)
Postflight marker write to ~/system/state/postflight-cleared-<id>.json

None of these steps call the verifier, fix-builder, or verify-fix-loop skill. The "postflight" referenced in pi-orchestrator is a file marker write, NOT the /task-postflight skill.

task-postflight skill

Grep result: ZERO matches for verify-fix-loop, verifier, fix-builder in ~/.claude/skills/task-postflight/SKILL.md.

The /task-postflight skill dispatches Angie Jones (Proveo) for AC-checklist QA, not the atomic-claim verifier. These are parallel, non-overlapping verification patterns:

Proveo = human-readable AC checklist with pass/fail verdicts per item
Verifier = atomic claim decomposition with machine-verified proof citations

Hooks directory

Grep result: Only archive files matched. No active hook in ~/.claude/hooks/ references verify-fix-loop, verifier, or fix-builder.

Active hooks audited:

liveness-claim-validator.sh — PostToolUse on Write/Edit; checks for bare liveness claims in memory/spec/agent files. Not related to verifier dispatch.
mc-ready-gate.sh — wrapper for mc.js ready; runs ZAKON #30 direct-probe gate + evidence-contract-validator. Does NOT invoke verify-fix-loop.
evidence-contract-validator.sh — validates verdict JSON schema + sha256 chain. Shell-based, no agent dispatch.
cross-session-claim-gate.sh, session-task-lock-gate.sh, plan-completeness-gate.sh, pre-dispatch-gate.sh — none reference verifier.

Daemon fleet

Grep result: ZERO matches for verify-fix-loop, verifier, fix-builder in ~/system/daemons/.

LaunchAgents

Grep result: ZERO matches in ~/Library/LaunchAgents/.

VERDICT: ABSENT

The verify-fix-loop and its constituent agents (verifier, fix-builder) have zero automated entry points. The only invocation path is a human typing a trigger phrase in a Claude Code conversation. CEO is always in the loop because there is no loop without CEO.

3. Tool-Surface Security Check

Verifier (read-only)

Definition file: ~/.claude/agents/verifier.md Declared tools: tools: Read, Grep, Glob, Bash

The tools: field includes Bash. This is the critical point.

The agent definition does NOT use a tool whitelist that removes Write/Edit/Task at the API level. It relies entirely on prompt-level enforcement ("Enforcement is prompt-only — this rule is yours to honor. You are the gatekeeper."). The verifier.md explicitly states this.

Permitted Bash commands (per prompt whitelist in verifier.md):

cat, head, tail, wc, ls, file, stat
diff, git read-only subcommands
grep, rg, find (via tool preferred)
jq, node -e (read-only expression)
node ~/system/tools/mc.js show (read-only subcommands only — NEVER add|start|done|ready|update|pause|cancel)
gh pr view, gh issue view, gh api -X GET
sqlite3 -readonly, psql SELECT only
curl -sI (HEAD), curl -s GET (never POST/PUT/DELETE)
bash -n, shellcheck, node --check (dry-run linters)

Escape paths documented:

The prompt says "NEVER run: rm, mv, cp (to non-/tmp/), chmod, chown, ln" and "Redirections that write outside /tmp/verifier-* or /tmp/<task_id>-evidence/: >, >>, tee to other paths".
This is prompt-level enforcement only. A model that ignores or misreads the instructions could still run bash -c "echo foo > ~/system/some-file.txt"; the agent framework does not block it at the API tool-call level.
The tools: Bash declaration gives the agent full shell access; the prompt whitelist is self-enforced.
Feedback file writes are permitted to /tmp/verifier-feedback-<TASK_ID>.md specifically.

Verdict on verifier tool isolation: Prompt-enforced, not API-enforced. Read-only is a behavioral constraint, not a structural constraint. The risk is manageable for a trusted model, but nothing in the framework bounds it.
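
One way to move part of the whitelist from prompt to structure is a deterministic check applied to each Bash command before it runs. The sketch below is illustrative only: it assumes some interception point exists where a command string can be inspected, covers a subset of the whitelist, and rejects every shell metacharacter outright, which is stricter than the prompt (it also refuses the permitted redirects into /tmp/verifier-*).

```python
import re
import shlex

# Subset of the prompt whitelist quoted above, expressed as command prefixes.
ALLOWED_PREFIXES = [
    ("cat",), ("head",), ("tail",), ("wc",), ("ls",), ("stat",),
    ("git", "log"), ("git", "diff"), ("curl", "-sI"),
    ("sqlite3", "-readonly"), ("bash", "-n"),
]
# Conservative: any redirect, pipe, substitution or chaining character is refused.
SHELL_META = re.compile(r"[;&|<>`$()]")


def is_allowed(command: str) -> bool:
    """Return True only for commands that start with a whitelisted prefix."""
    if SHELL_META.search(command):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes
        return False
    return any(tuple(tokens[: len(p)]) == p for p in ALLOWED_PREFIXES)
```

A prefix check is not a proof of read-only behaviour, because some whitelisted tools accept flags that write files. It does, however, reject the `bash -c "echo foo > ..."` escape described above without relying on the model's cooperation.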

Fix-builder (edit-only, scoped)

Definition file: ~/.claude/agents/fix-builder.md Declared tools: tools: Read, Edit, Grep, Glob

The fix-builder tool list explicitly excludes:

Write (no new file creation)
Bash (no test runs, deploys, builds, git ops)
Task (no further dispatch)

This is stronger isolation than the verifier: the tools: field at the agent definition level excludes Bash and Write. If the agent framework enforces declared tools as a whitelist, fix-builder genuinely cannot run shell commands or create new files. It can only read existing files (Read, Grep, Glob) and apply edits to existing files (Edit).

Gap: Fix-builder cannot create new files even when feedback prescribes it. The skill handles this: "If the feedback prescribes creating a new file, mark that fix as COULD_NOT_APPLY" — the loop escalates. This is a by-design limitation, not a bug.

Verdict on fix-builder tool isolation: Structurally scoped (Bash and Write excluded from tools declaration). This is the correct pattern. The verifier should be refactored to match this approach.

4. Synthetic Dry-Trace

Selected task: MC #99389 — "Refactor /mehanik skill to progressive-disclosure pattern" (status: review, owner: pi-orchestrator)

This task was marked mc.js ready (now review) after pi-orchestrator completed it.

What WOULD have happened if /verify-fix-loop were auto-invoked:

Step 0: trigger fired when pi-orchestrator called mc.js ready #99389
         → /verify-fix-loop mc_id=99389 spec_path=~/.claude/skills/mehanik/SKILL.md
            domain=docs (inferred from skill file path)
            max_loops=3

Step A (iter 1): dispatch verifier
  - verifier reads ~/.claude/skills/mehanik/SKILL.md
  - verifier reads MC #99389 ACs via mc.js show 99389
  - verifier decomposes ACs into atomic claims:
      (a) SKILL.md exists and is < N lines (tier-1 constraint)
      (b) references/agent-brief.md exists
      (c) references/failure-modes.md exists
      (d) Skill tool callable post-refactor
  - verifier probes each atom with Read/Glob/Bash

Step B: parse CONFIDENCE
  If all files exist and SKILL.md is within limits → PERFECT → SUCCESS
  If any reference file missing → FEEDBACK
  
Step D (if FEEDBACK): dispatch fix-builder
  - fix-builder reads /tmp/verifier-feedback-99389.md
  - applies Edit to add missing sections or trim SKILL.md to the line limit
  - a missing reference file needs a new file, which fix-builder cannot create
    → COULD_NOT_APPLY → ESCALATE
  
Step A (iter 2): re-verify → likely PERFECT → write SUMMARY.md → SUCCESS

Actual closure path used for MC #99389: The task is in review status. Looking at the review queue (25+ tasks in review), there is no evidence of verifier invocation. The closure path was: pi-orchestrator marked ready → task sits in review queue → CEO/John is the implicit reviewer. This is the CEO-as-verifier pattern the CEO wants to eliminate.

5. Comparison with Existing Patterns

liveness-claim-validator.sh

Trigger: PostToolUse hook, fires on every Write/Edit/MultiEdit tool call
Scope: Memory files, spec files, agent definition files matching 4 path patterns
Mechanism: Shell script reads tool input JSON from stdin, scans written content for bare liveness claims, blocks write if violations found (exit 2)
Auto-invoked: YES, unconditionally, at the Claude Code hook level
Why verify-fix-loop is NOT similarly hooked: The liveness validator is a passive scan that reads content already being written. The verify-fix-loop requires active agent dispatch (spawning sub-agents), which cannot be done from a shell hook. Shell hooks can block tool calls; they cannot spawn conversational agents.

This is the fundamental architectural gap: hooks can intercept tool calls synchronously, but spinning up a verify-fix-loop requires an async agent conversation that the hook system cannot initiate.

evidence-verifier agent

File: ~/.claude/agents/evidence-verifier.md
Declared tools: not read in this audit (only the agent's existence was confirmed)
Auto-invoked: NO for the agent itself. What runs automatically at mc.js ready time is evidence-contract-validator.sh, called by mc-ready-gate.sh. That script is pure shell: it validates JSON schema and file hashes deterministically (no LLM) and does NOT dispatch the evidence-verifier agent. The agent definition exists for manual invocation.

Pattern difference: The evidence-verifier pattern uses a shell script as the auto-invoke layer (deterministic, no LLM), with the agent definition as a fallback for edge cases. The verify-fix-loop requires LLM reasoning at every step, making shell-script auto-invocation insufficient.

6. Gap Analysis and Fix Proposal (Audit-Level Only)

Root cause of the gap

The verify-fix-loop was designed top-down as a skill (manual invocation). The liveness-claim-validator was designed bottom-up as a hook (automatic). There is no bridge layer that translates "mc.js ready event" → "spawn verify-fix-loop conversation".

The missing component is a postflight agent dispatcher: something that observes the ready event and spawns a verify-fix-loop session as a sub-agent task.

Minimum wiring needed

Option A: PostToolUse hook on mc.js ready

Element	Detail
File to modify	`~/.claude/hooks/mc-ready-gate.sh` (already fires on mc.js ready)
Addition location	After line 196 (all gates passed — currently execs mc.js directly)
Trigger	After mc.js ready succeeds, spawn verify-fix-loop as a background Task
Mechanism	`mc-ready-gate.sh` would write a trigger file to `/tmp/vfl-trigger-<mc_id>.json` containing mc_id + spec_path + domain; a daemon polls this file

The problem: mc-ready-gate.sh is a synchronous shell script. It cannot spawn a conversational agent (Task dispatch requires a running Claude Code session). It can only write a file.

Option B: pi-orchestrator.js postflight hook (most natural wiring point)

Element	Detail
File to modify	`~/system/kernel/pi-orchestrator.js`
Addition location	Inside `reportCompletion()` function, after line ~3900 (after QA gate passes)
What to add	A call to write `/tmp/vfl-trigger-<task_id>.json` with task metadata
Trigger	The daemon below polls this and dispatches

Option C: /task-postflight skill modification (cleanest for H-tasks)

Element	Detail
File to modify	`~/.claude/skills/task-postflight/SKILL.md`
Addition location	After Section 2 (PROVEO VALIDATION DISPATCH), add Section 2b
What to add	Conditional: if Proveo returns PASS AND task domain is docs/system/refactor, dispatch /verify-fix-loop before writing the postflight marker
Trigger	Manual invocation of /task-postflight already exists for H/BLOCKER tasks
Advantage	Stays within the skill conversation context — Task dispatch works naturally here

Recommended wiring (Option C + Option B trigger file):

Immediate (no new infrastructure): Add a Section 2b to /task-postflight SKILL.md that dispatches /verify-fix-loop when Proveo passes and domain is non-high-stakes. This works today for all tasks that go through /task-postflight.
Systematic (covers tasks that bypass /task-postflight): Add a trigger file write to pi-orchestrator.js reportCompletion(). A lightweight daemon polls /tmp/vfl-trigger-*.json files and — when a pi-orchestrator session is active — dispatches the verify-fix-loop skill via the existing Claude Code session.
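
The trigger-file handoff in Options A and B can be sketched as a writer and a poller. The file name pattern and the three payload fields come from the proposal above. Everything else is an assumption for illustration: the function names, the temporary-file convention, and the `.claimed` rename. Python is used for brevity, whereas the real writer would live in pi-orchestrator.js or mc-ready-gate.sh.

```python
import json
import os
from pathlib import Path
from typing import List


def write_trigger(trigger_dir: str, mc_id: int, spec_path: str, domain: str) -> Path:
    """Atomically write vfl-trigger-<mc_id>.json so a poller never reads a partial file."""
    base = Path(trigger_dir)
    final = base / f"vfl-trigger-{mc_id}.json"
    tmp = base / f".vfl-trigger-{mc_id}.json.tmp"
    tmp.write_text(json.dumps({"mc_id": mc_id, "spec_path": spec_path, "domain": domain}))
    os.replace(tmp, final)  # rename is atomic within one filesystem
    return final


def claim_triggers(trigger_dir: str) -> List[dict]:
    """Return pending trigger payloads, renaming each file so it is handled once."""
    claimed = []
    for path in sorted(Path(trigger_dir).glob("vfl-trigger-*.json")):
        payload = json.loads(path.read_text())
        os.replace(path, path.with_suffix(".claimed"))
        claimed.append(payload)
    return claimed
```

The write-then-rename step keeps a poller from reading a half-written file, and renaming on claim keeps one trigger from being dispatched twice. The sketch leaves open how the poller hands the payload to an active Claude Code session.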

Loop budget recommendation

Keep MAX_LOOPS = 3 (matches SKILL.md default)
For postflight auto-invocation, restrict to docs, system, refactor, polish domains only
Hard cap: $5 per invocation (already in SKILL.md)
Add timeout: 5 minutes wall-clock before auto-escalation to CEO

Escalation path when budget exhausted

Write SUMMARY.md to EVIDENCE_DIR with full loop history
Call node ~/system/tools/slack.js send alerts "[VFL-ESCALATED] MC #<id> — N/MAX loops used, last verdict: <CONFIDENCE>" (Slack, not CEO direct)
Set task status to blocked via mc.js block with reason "verify-fix-loop budget exhausted — human review needed"
John receives Slack alert and decides: (a) override + mark done, (b) dispatch additional builder, (c) extend budget via [CEO_APPROVED] token

Open Questions

Tool-level enforcement for verifier: Should the verifier's tools: field be changed from Read, Grep, Glob, Bash to Read, Grep, Glob (removing Bash) to achieve structural isolation matching fix-builder? This would break the verifier's ability to run curl -sI, git log, sqlite3 -readonly probes — which are core to its value. The tradeoff is behavioral (current) vs structural enforcement.
Conversation context for auto-dispatch: Spawning a verify-fix-loop Task requires an active Claude Code conversation. If pi-orchestrator fires after a conversation closes, there is no context to spawn into. Does the system need a persistent "factory session" that stays open to receive postflight dispatches?
High-stakes domain detection: The SKILL.md defaults unknown domains to HIGH_STAKES (no autonomous correction). For auto-invocation, domain inference from spec path heuristics will frequently return unknown. Should the default be flipped to docs for auto-invoked postflight use cases?
Proveo vs verifier: overlap management: /task-postflight already dispatches Proveo for AC-checklist QA. If verify-fix-loop is added as Section 2b, tasks will run both Proveo (AC checklist) AND verifier (atomic claims) sequentially. Is this the intended double-verification model, or should one replace the other for certain task types?
mc.js ready event vs pi-orchestrator ready: Some tasks are marked ready by human John (node ~/system/tools/mc.js ready <id>), others by pi-orchestrator after build completion, and others by /task-postflight. The auto-invocation wiring point differs for each path. A comprehensive solution needs to intercept all three paths.

Evidence Metadata

Item	Value
Files read	8
Grep/Bash tool calls	12
Live agent invocations	0
Mutations	0
Wall-clock (estimated)	~18 min
Key source files	`~/.claude/skills/verify-fix-loop/SKILL.md`, `~/.claude/agents/verifier.md`, `~/.claude/agents/fix-builder.md`, `~/.claude/skills/task-postflight/SKILL.md`, `~/system/kernel/pi-orchestrator.js` (lines 3730–3930), `~/.claude/hooks/mc-ready-gate.sh`, `~/.claude/hooks/liveness-claim-validator.sh`

BUILD-BLUEPRINT Discipline

2.3 — BUILD-BLUEPRINT Discipline Audit

Date: 2026-05-09 Auditor: sentinel-ba Scope: 17 BUILD-BLUEPRINT.md files + Mehanik gate enforcement

1. Per-Blueprint State Matrix

#	Path	Bytes	Lines	Last Modified	Status	Project Liveness
1	`~/projects/internal/basicfakta/BUILD-BLUEPRINT.md`	11,193	323	2026-04-29	SUBSTANTIAL	Last commit 10d ago (auto-backup only)
2	`~/projects/bookstack-api/BUILD-BLUEPRINT.md`	12,366	352	2026-04-29	SUBSTANTIAL	Last commit 5 weeks ago (auto-backup)
3	`~/projects/pa/BUILD-BLUEPRINT.md`	13,238	354	2026-04-29	SUBSTANTIAL	Last commit 10d ago (auto-backup)
4	`~/projects/alai-system/BUILD-BLUEPRINT.md`	3,520	75	2026-04-30	THIN (75 lines, not stub)	Last commit 6d ago (auto-backup)
5	`~/business/.../products/Tok/BUILD-BLUEPRINT.md`	27,080	637	2026-04-27	SUBSTANTIAL	Last commit 10d ago — gradle-wrapper CI fix; active
6	`~/business/.../products/BasicFakta/BUILD-BLUEPRINT.md`	12,865	332	2026-03-07	STALE (63d, no recent activity)	Last commit 9 weeks ago — test/CI only
7	`~/business/.../products/Lobby/BUILD-BLUEPRINT.md`	18,707	396	2026-03-09	STALE (61d, repo semi-active)	Last commit 6 weeks ago — feat/RLS
8	`~/business/.../products/Drop/BUILD-BLUEPRINT.md`	8,846	208	2026-05-07	PRESENT (208 lines, recently updated)	Last commit 63 min ago — MOST ACTIVE
9	`~/business/.../products/DropSrbija/BUILD-BLUEPRINT.md`	10,657	386	2026-05-08	SUBSTANTIAL	Last commit 2d ago; git-repo shared with Gotiva (anvil-fs migration)
10	`~/business/.../products/Plock/BUILD-BLUEPRINT.md`	24,175	512	2026-04-16	STALE (23d, repo dormant)	Last commit 5 weeks ago — smoke tests only
11	`~/business/.../products/Gotiva/BUILD-BLUEPRINT.md`	27,112	556	2026-03-11	STALE (59d)	Last commit 2d ago was chore/anvil-fs (migration commit, not product work)
12	`~/business/.../products/Bilko/BUILD-BLUEPRINT.md`	38,303	530	2026-05-08	SUBSTANTIAL	Last commit 10 min ago — extremely active
13	`~/business/.../sales/outreach/sintef/BUILD-BLUEPRINT.md`	1,943	49	2026-04-27	TEMPLATE/STUB (49 lines, 1,943 bytes — under threshold)	Last commit 2d ago was chore/anvil-fs only
14	`~/business/.../web/BUILD-BLUEPRINT.md`	4,636	110	2026-04-27	THIN	Last commit 2d ago — feat/redirect
15	`~/business/.../finance/akershus-fylke/BUILD-BLUEPRINT.md`	1,486	33	2026-05-08	TEMPLATE/STUB (33 lines; per MC #99886 Decision 7: "move akershus OUT of products/")	Last commit 2d ago chore only
16	`~/clients-external/snowit-site/BUILD-BLUEPRINT.md`	3,427	67	2026-04-28	THIN	Last commit 2 hours ago — active gitignore hygiene
17	`~/clients-external/lumiscare-variants/lumiscare/BUILD-BLUEPRINT.md`	37,426	637	2026-05-09	SUBSTANTIAL	Last commit 2 hours ago — security fix; MOST RECENTLY UPDATED

Summary counts

SUBSTANTIAL (>10,000 bytes, real content, blueprint current): 7 (basicfakta, bookstack-api, pa, Tok, DropSrbija, Bilko, lumiscare)
PRESENT / ADEQUATE (200–10,000 bytes, real content): 1 (Drop)
THIN (< 5,000 bytes, functional but sparse): 3 (web, snowit-site, alai-system)
TEMPLATE/STUB (< 2,000 bytes or <50 lines with no real content): 2 (sintef, akershus-fylke)
STALE (>30d without update, repo active): 4 (BasicFakta at 63d, Lobby at 61d, Gotiva at 59d, Plock at 23d)

Each blueprint is counted once, under the Status shown in the matrix: 7 + 1 + 3 + 2 + 4 = 17. Gotiva exceeds 10,000 bytes but is counted as STALE, and alai-system (3,520 bytes) is counted as THIN rather than PRESENT.

Note: STALE classification applies where the product repo has had meaningful commits but the blueprint has not been updated. Plock is borderline: at 23d it is under the 30-day threshold and its repo is dormant.
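
The status categories can be applied mechanically once a precedence order is fixed. The sketch below assumes an order the audit does not state (STALE first, then STUB, THIN, PRESENT, SUBSTANTIAL) and a lower bound of 5,000 bytes for PRESENT. The "real content" judgement is not modelled.

```python
def classify(size_bytes: int, lines: int, days_since_update: int, repo_active: bool) -> str:
    """Assign one status per blueprint using the thresholds from the summary."""
    if days_since_update > 30 and repo_active:
        return "STALE"
    if size_bytes < 2000 or lines < 50:
        return "TEMPLATE/STUB"
    if size_bytes < 5000:
        return "THIN"
    if size_bytes <= 10000:
        return "PRESENT"
    return "SUBSTANTIAL"


print(classify(18707, 396, 61, True))   # Lobby
print(classify(8846, 208, 2, True))     # Drop
print(classify(24175, 512, 23, False))  # Plock
```

Under these rules Lobby comes out STALE and Drop PRESENT, matching the matrix. Plock comes out SUBSTANTIAL, since 23 days is under the threshold and the repo is dormant, so its STALE label in the matrix is a judgement call.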

2. Mehanik Gate Truth Check

What Mehanik requires (tool-verified from `~/.claude/agents/mehanik.md`)

Phase T of the GOTCHA workflow states:

ls {project_path}/BUILD-BLUEPRINT.md — MUST exist
Read the file (confirm contents match task scope)
Circuit Breaker #2: "BUILD-BLUEPRINT.md not read — evidence of Read call required in session"

Assessment: The requirement is FORMALLY A HARD BLOCK. CB#2 fires if the blueprint is not read (not just present). The hook ~/.claude/hooks/pre-dispatch-gate.sh also enforces a secondary check: it runs blueprint-check.js against the project path stored in the Mehanik cleared token and blocks dispatch if score < 60.

Enforcement quality issues identified

Issue A — Hook is warn-only for missing MC ID. When the Task prompt has no MC #NNNN pattern, the hook exits 0 with a stderr warning only. Tasks dispatched without an MC ID bypass both the Mehanik cleared-token check and the blueprint-score gate entirely.

Issue B — mehanik_session_id: unknown in all inspected tokens. Both tokens inspected (99886 and 100150) show mehanik_session_id: unknown. The cleared token was written, proving Mehanik ran, but the session binding is absent — meaning the hook cannot verify that the same session cleared the task vs. a stale token from a prior session. Token expiry (4h) partially mitigates but does not eliminate this gap.

Issue C — Blueprint score threshold set at 90 but tokens show WARN at 80 and 65. Both inspected dispatches show blueprint_check_result: WARN with scores below the 90 threshold, yet dispatch proceeded. The hook's blueprint-check.js integration exists (~/system/tools/blueprint-check.js is present), but the pre-dispatch hook only exits 2 (block) if verdict is NOT_READY. The WARN path allows dispatch. The 90-point threshold in the token file is never enforced as a gate.

Issue D — Token expiry not enforced in hook. The hook does not parse expires_at from the cleared file. A token written 23 hours ago (within a session restart) would still pass. The 4h expiry in the token is advisory metadata only.
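
Issues A through D can be read as four checks missing from one gate decision. The following is a minimal sketch of what closing all four might look like, not the actual logic of pre-dispatch-gate.sh. The field names `expires_at`, `mehanik_session_id` and `blueprint_threshold_applied` are quoted from the tokens in this report; `blueprint_score`, the function name, the dict representation of a token, and the exact regex for the MC pattern are assumptions made for illustration.

```python
import re
from datetime import datetime, timezone

# Assumed form of the "MC #NNNN" pattern mentioned in Issue A.
MC_ID_PATTERN = re.compile(r"MC #\d+")


def gate_decision(prompt, token, session_id, now):
    """Return (allowed, reason) for one dispatch attempt.

    token is a dict parsed from a cleared file, or None if no file exists.
    now must be a timezone-aware datetime.
    """
    # Issue A: a prompt without an MC ID is blocked, not just warned about.
    if not MC_ID_PATTERN.search(prompt):
        return False, "no MC ID in prompt"
    if token is None:
        return False, "no cleared token"
    # Issue D: expires_at is parsed and enforced.
    expires_at = datetime.fromisoformat(token["expires_at"].replace("Z", "+00:00"))
    if now >= expires_at:
        return False, "token expired"
    # Issue B: the token must be bound to the session that is dispatching.
    bound_session = token.get("mehanik_session_id")
    if bound_session in (None, "", "unknown"):
        return False, "token has no session binding"
    if bound_session != session_id:
        return False, "token from another session"
    # Issue C: the threshold recorded in the token acts as a hard gate.
    if token["blueprint_score"] < token["blueprint_threshold_applied"]:
        return False, "blueprint score below threshold"
    return True, "cleared"
```

Under these assumptions, both inspected tokens (session ID unknown, scores 80 and 65 against a threshold of 90) would be blocked rather than dispatched.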

Sample of 5 recent dispatches

MC ID	Cleared token exists?	Blueprint cited in token?	Blueprint score	Dispatch allowed?
99886	YES	Bilko/BUILD-BLUEPRINT.md	80 (WARN)	YES — WARN not blocked
100150	YES	Drop/BUILD-BLUEPRINT.md	65 (WARN)	YES — WARN not blocked
100150	YES	Drop/DEPLOY-MAP.md cited	—	YES
99910 (MC Claim Protocol)	YES (`/tmp/mehanik-cleared-99910`)	—	Not inspectable (token may have expired and been overwritten)	YES
99886	YES	Bilko — per DOD evidence: "Mehanik CLEAR /tmp/mehanik-cleared-99886"	80	YES

Token count in /tmp: 113 mehanik-cleared tokens present (range: #10063 to #100173). Volume indicates Mehanik is running regularly — it is not being bypassed entirely.

Gate verdict: PARTIALLY REAL. Blueprint presence is hard-blocked. Blueprint read is required and recorded in the token. However, the score-based quality gate (threshold 90) is advisory — WARN scores pass. The session-binding gap means cleared tokens could theoretically be reused across sessions. The missing-MC-ID path is a complete bypass vector.

3. Blueprint-vs-Reality Drift Score

Bilko (MOST ACTIVE)

Blueprint claims:

"API Framework: Ktor 3.4.0 / Kotlin 2.3.0 on JVM 25" — Cloud Run deployed
"Database: PostgreSQL 15" — Cloud SQL
"Status: MVP dev — frontend implemented with mock data, backend built"

Actual state (tool-verified):

gcloud run services list shows: bilko-api-stage, bilko-api-demo, bilko-web-stage, bilko-web-demo, bilko-intesa-demo all TRUE; bilko-staging-api FALSE (unhealthy)
Drop is on Azure VM; Bilko is on GCP Cloud Run — consistent with blueprint claim
Blueprint says "Status: MVP dev" but there are 5 live Cloud Run services including bilko-intesa-demo (suggesting Intesa bank integration demo exists)

Drift score: LOW-MEDIUM. Infrastructure matches. The "MVP dev with mock data" status language is understated given live deployed services. Blueprint was last updated 2026-05-08 (yesterday) — reasonably current.

Drop (MOST RECENTLY COMMITTED)

Blueprint claims:

"Azure VM vm-drop-prod (Sweden Central)" + docker-compose
"Database: PostgreSQL 16 via Drizzle ORM in docker-compose on Azure VM"

Actual state (tool-verified):

curl -sI https://app.getdrop.no returns HTTP/2 200 — production is live
Response headers show nonce-based CSP (Next.js pattern) — consistent with Next.js 15 claim
Blueprint was rewritten 2026-04-30 to fix the AWS phantom; it now correctly reflects Azure VM
Most recent commit (63 min ago): staging CI/CD OIDC fix — blueprint does NOT mention staging VM yet (deploy token shows vm-drop-stage staging path)

Drift score: LOW. Production deployment matches blueprint. Staging environment exists in deployment reality but blueprint only covers production — minor documentation lag.

Tok (ACTIVE BUT NO RECENT BLUEPRINT UPDATE)

Blueprint claims:

"Database: PostgreSQL 15 (Cloud SQL)"
"PSD2 Cert: QWAC/QSEAL — DigiCert/GlobalSign — mTLS for Croatia"
"Status: Core implementation complete — all 8 development gates DONE"

Actual state:

No gcloud run services list results for Tok (not visible in current GCP project scope)
Blueprint last updated 2026-04-27 (12d ago); last meaningful commit was 10d ago (gradle-wrapper fix unblocking CI since March)
The gradle-wrapper CI was broken since March 2026 — meaning "all 8 gates DONE" may be technically true for code but CI was broken for 6+ weeks

Drift score: MEDIUM. The product-gate claim is technically accurate, but CI was silently broken for 6+ weeks (from March 2026 until the gradle-wrapper fix 10d ago), a fact not reflected in the blueprint status line. PSD2 cert claim is unverifiable without SSH to the Tok deployment.

4. Cross-Cutting Findings

No holding-company blueprint

~/business/ALAI-Holding-AS/BUILD-BLUEPRINT.md — ABSENT. There is no top-level document explaining how the portfolio of products relates, shared infrastructure, or cross-product dependencies (e.g., Tok feeding Bilko). Each product is an island. This is a gap for new agents onboarding to the system who need portfolio-level context.

Blueprint versioning

Blueprints ARE git-tracked in their respective product repos. git log --follow -- BUILD-BLUEPRINT.md on Bilko shows at least 3 tracked commits; Drop shows the AWS-to-Azure canonical rewrite is a committed event with a clear commit message and MC reference. This is genuine version history — drift can be diagnosed by diffing commits.

However, there is no automated drift alert. Blueprint age vs. commit recency is never surfaced to John or CEO unless a sentinel audit runs manually.

Tenants without blueprints

~/system/ — has ~/system/BUILD-BLUEPRINT.md (EXISTS — confirmed)
~/personal/ — NO BLUEPRINT (expected: personal scope, not a product)
~/clients-external/ — only snowit-site and lumiscare are covered; MEDON client (~/business/ALAI-Holding-AS/pipeline/CodeCraft/clients/MEDON/) has a CHANGELOG.md in its shopify-app but NO BUILD-BLUEPRINT.md. This is a Mehanik bypass vector for any MEDON dispatch.
The DropSrbija blueprint exists and the Gotiva blueprint is 59d stale, yet the git repos for both were recently touched by the anvil-fs migration. Repo commit recency therefore gives a false "recently updated" signal for the blueprints themselves.

CHANGELOG without BUILD-BLUEPRINT

Within active project trees (excluding node_modules): MEDON shopify-app has a CHANGELOG.md without a blueprint. All node_modules CHANGELOG.md hits are false positives (dependency changelogs, not ALAI products).

5. Blueprint → Mehanik → Agent Dispatch Trace: MC #99886

Task: CI/CD Standardization — FAZA 2 — canonical refresh (Petter Graff)

Mehanik ran? YES. Token /tmp/mehanik-cleared-99886 present. Timestamp: 2026-05-08T21:06:23.121Z.

Blueprint cited?

blueprint_read: /Users/makinja/business/ALAI-Holding-AS/products/Bilko/BUILD-BLUEPRINT.md
This is the Bilko blueprint. The task is a system-wide canonical spec edit, not a Bilko-specific build task.
The project path assigned was Bilko's path, which means Mehanik's blueprint check was anchored to Bilko even though the deliverables (~/system/specs/cicd-canonical-v3-drafts/) are system-level. This is a scope-mismatch in the Mehanik gate — the blueprint read is nominally satisfied but the product checked (Bilko) is not the target of the changes.

Blueprint score: 80/100 (WARN). Dispatch allowed.

Agent output referenced blueprint sections? The DOD evidence in MC #99886 references the task as a "system-wide canonical spec edit" and notes 5 issue-areas in the v3 drafts — none reference Bilko blueprint sections. The blueprint read appears to have been a gate-pass ritual, not a content-informing step.

Dispatch outcome: Deferred (not dispatched to FlowForge) — "executive-side decision to defer flowforge run until parallel work coordinates." The Mehanik clear token was written but the agent run was held. This is the correct behavior per CEO decision, but it reveals that Mehanik clearance does not guarantee agent execution — it is one gate in a multi-gate flow.

Trace verdict: Mehanik ran and wrote a token. The blueprint cited was topically mismatched (Bilko blueprint for a system-spec task). The blueprint score gate passed despite being below threshold. Agent was not dispatched (deferred). Blueprint content did not visibly inform the dispatch.
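
The scope mismatch in this trace is mechanically detectable: the directory holding the cited blueprint is not an ancestor of the deliverable path. The sketch below assumes the deliverable paths are known at gate time, which this report does not confirm, and that `~` expands to /Users/makinja as the `blueprint_read` path suggests. The function name is hypothetical.

```python
from pathlib import PurePosixPath


def blueprint_covers(blueprint_path, deliverable_paths):
    """True only if every deliverable lives under the blueprint's project dir."""
    project_dir = PurePosixPath(blueprint_path).parent
    return all(
        project_dir in PurePosixPath(path).parents
        for path in deliverable_paths
    )


# Paths from the MC #99886 trace, with ~ expanded (assumption).
bilko_blueprint = (
    "/Users/makinja/business/ALAI-Holding-AS/products/Bilko/BUILD-BLUEPRINT.md"
)
system_deliverable = "/Users/makinja/system/specs/cicd-canonical-v3-drafts"
print(blueprint_covers(bilko_blueprint, [system_deliverable]))
```

For the #99886 inputs this prints False, which is the condition a gate could use to reject a blueprint that does not cover the task's targets.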

6. Open Questions

Mehanik project_path heuristic: How does Mehanik determine which project_path to use when the task is cross-product or system-level? For #99886, Bilko was used for a system-spec task. Is this John's input, or Mehanik's inference? If inference, the blueprint check is unreliable for cross-cutting tasks.
Score threshold enforcement: The blueprint_threshold_applied: 90 field in cleared tokens is never enforced as a hard gate. Drop scored 65 and dispatch was allowed. Should the threshold be lowered to match operational reality, or should the WARN-to-BLOCK escalation be implemented?
Token reuse across sessions: mehanik_session_id: unknown in all inspected tokens. Is there a plan to enforce session binding? Without it, a cleared token from a prior CEO session could authorize a dispatch in a new context.
Gotiva and Lobby stale blueprints: Both products are 59d+ stale. Are they in maintenance mode or abandoned? If active, their blueprints are Mehanik bypass risks for any dispatch — the gate will pass but Mehanik will be reading outdated architecture.
MEDON client coverage: No BUILD-BLUEPRINT.md exists for the MEDON shopify-app. If John receives a MEDON task, Mehanik's Phase T will fire ls {project_path}/BUILD-BLUEPRINT.md → BLOCKED. Is the MEDON client expected to receive blueprint coverage, or is it out of scope?

7. ROI Lens (sentinel-ba)

Is the blueprint pattern earning its overhead?

Direct value delivered:

Blueprint presence as a Mehanik gate prerequisite has prevented scope hallucination at the dispatch level. The 113 mehanik-cleared tokens in /tmp represent 113 gate events where someone was forced to confirm a blueprint existed and was read. This is a real forcing function.
The Drop AWS phantom rewrite (MC #10353) is a concrete example where the blueprint served as the canonical source of truth that agents were required to consult — and where a discrepancy (aspirational AWS docs treated as ground truth) was detected and corrected with a committed blueprint update.
The Bilko blueprint (38KB, 530 lines, git-tracked) is the most thorough — it provides stack, ADRs, domain context, and deployment architecture. It has demonstrably prevented repeated infra hallucination on Bilko tasks.

Overhead cost:

17 blueprints exist, 8 are genuinely substantial. The 2 stubs (sintef/akershus) add near-zero value and should be either expanded or removed (their Mehanik gate pass is hollow).
Blueprint maintenance is manual and unalerted. Stale blueprints (BasicFakta 63d, Lobby 61d, Gotiva 59d) represent a risk: Mehanik passes the gate but the agent reads outdated architecture. The overhead of writing blueprints is paid; the staleness risk is not managed.
The 90-point score threshold being advisory-only means the quality gate was designed but not deployed. This is overhead (blueprint-check.js runs on every dispatch) with only partial benefit (WARN path is free).

Net verdict: POSITIVE ROI, but with a quality gap. The blueprint pattern is not theatrical — it is a genuine gate that has caught real hallucinations. However, the enforcement has two systemic weaknesses: (1) stale blueprints pass the gate silently, and (2) the score threshold is never enforced as a block. Fixing these two issues would cost approximately 1–2 hours of system work and would sharply increase the ROI-per-blueprint.

Priority recommendations:

HIGH — Enforce score threshold or lower it. Either block at score < 60 (matching current floor observed in practice), or officially downgrade the threshold. WARN-at-65-and-dispatch is worse than an honest 60-point threshold that blocks.
HIGH — Add staleness alert. A daily check: if blueprint last-modified > 30d AND project has had commits in last 14d → surface warning to John. Zero build cost (can be added to existing daemon fleet).
MED — Expand or remove stub blueprints. sintef (49 lines) and akershus-fylke (33 lines) are hollow gates. MC #99886 Decision 7 already proposes moving akershus out of products/ — execute this and either write a real blueprint or remove the gate.
LOW — Session binding for Mehanik tokens. Low urgency given 4h expiry, but mehanik_session_id: unknown should be resolved to prevent cross-session token reuse on long-running tasks.
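
The staleness rule in the second recommendation (blueprint older than 30d while the project has commits in the last 14d) is small enough to state as code. This is a sketch of the rule only; how the two dates are collected (file mtime, git history) is left out, and the function and parameter names are assumptions.

```python
from datetime import date


def is_stale(blueprint_modified, last_commit, today,
             max_blueprint_age_days=30, active_window_days=14):
    """Flag a blueprint that is old while its repo is still active."""
    blueprint_age = (today - blueprint_modified).days
    commit_age = (today - last_commit).days
    return (blueprint_age > max_blueprint_age_days
            and commit_age <= active_window_days)
```

With the audit date as `today`, a blueprint last modified 59 days ago in a repo committed to this week would be flagged, while Tok (blueprint 12d old, last meaningful commit 10d ago) would not.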

Health Matrix

3.1 Health Matrix — Functional Probe Results

Audit date: 2026-05-09 | Auditor: sentinel-tester | Phase: P3 (functional probes)

Health Matrix

Component	Test	Status	Evidence (cmd + snippet)
A1. mem0/qdrant	POST write (audit-test user)	PARTIAL	`curl http://localhost:9000/add -d '{"text":"audit-2026-05-09 ping test","user_id":"audit-test"}'` → `{"result":{"results":[]},"status":"added"}`. Read-back via `/search` returned `count:1` but `results:[]` — memory acknowledged as added but semantic search returned empty results. Write acknowledged; retrieve path unreliable.
A2. LightRAG	GET /health + POST /query	WORKS	`curl localhost:9621/health` → `{"status":"healthy","core_version":"1.4.16","pipeline_busy":false}`. POST /query `{"query":"what is ALAI","mode":"naive"}` → 3-paragraph narrative with citations. Full round-trip confirmed.
A3. HiveDB intel	SELECT COUNT(*) FROM intel	WORKS	`sqlite3 ~/system/databases/hivemind.db "SELECT COUNT(*) FROM intel;"` → `17560`. Latest entries dated 2026-05-09 19:11:24. Write-side confirmed via `hivemind.js query "ALAI"` — 8 results returned, including entries written today. Read AND write both functional.
A3b. HiveMind writer	Confirm write path exists	WORKS	`node ~/system/agents/hivemind/hivemind.js query "ALAI"` → 8 live results with today's timestamps. Writer: daemon-fleet-watchdog posts alerts; email-agent posts task alerts. Multiple live writers confirmed.
A4. Chroma	chroma-mcp responsive	BROKEN	`curl http://localhost:8000/api/v1/collections` → no response (empty). Port 8000 not listening. No chroma process found. chroma-mcp listed in settings.json but no running service.
A5. .md auto-memory	Fresh writes landing?	PARTIAL	`ls -la ~/.claude/projects/-Users-makinja/memory/` — most recent file mtime is `2026-04-30 16:45` (feedback_validation_enforcement_active). `MEMORY.md` itself last written `2026-05-09 19:04` (today, by John session). No automated daemon auto-writing .md files found — writes are manual/session-driven only. Memory lands, but no auto-append pipeline.
B1. HiveMind read API	Any tool returns intel?	WORKS	`node ~/system/agents/hivemind/hivemind.js read --limit 3` returns intel rows. `hivemind.js query "ALAI"` returns 8 records. P1 claim of "NO read API" is INCORRECT — read API exists and functions. hivemind-mcp.js also exposes `hivemind_read`, `hivemind_query`, `hivemind_semantic_query`.
C1. pi-orchestrator	Process running?	PARTIAL	`ps aux
C2. pi-orch mock mode	Is it truly mock?	PARTIAL	`grep "mock" ~/system/kernel/pi-orchestrator.js` — no `alai-config-mock.json` reference found. Config `offlineMode: false`, `enabled: true`. Latest health state shows `Verdict: CRITICAL` (2026-05-06). Durable-runner bridge healthy. Process running but HTTP port silent and no recent dispatch logs after 2026-03-19. Likely dispatching but to BROKEN downstream (Ollama).
D1. Verifier auto-invocation	verify-fix-loop grep	PARTIAL	`grep -rn "verify-fix-loop" ~/.claude/skills/` → SKILL EXISTS at `~/.claude/skills/verify-fix-loop/SKILL.md`. Skill is MANUAL-TRIGGER only — "Trigger phrases: verify-fix-loop, auto-verify and fix". No daemon or hook auto-invokes it. P2 verdict ABSENT is partially wrong: skill exists but auto-invocation is absent.
E1. Library skill	node ~/system/tools/library.js list	WORKS	Returns 13 cookbooks (alai-full:33 skills, dev:17, business:12, security:10, etc.) + 11 defaults. Fully functional CLI. No external endpoint required for `list`.
F1. Mehanik gate	Token files past 7d	WORKS	`ls /tmp/mehanik-cleared-*` → 10 token files found, all from 2026-05-09. Most recent: `mehanik-cleared-100173` created 18:29:30 today. Corresponding MC #100173 (Bilko landing pages UX audit) confirmed open+assigned to vizu. Token→dispatch correlation confirmed.
G1. com.alai.pi-orch-health	Daemon exit reason	BROKEN	`launchctl print gui/501/com.alai.pi-orch-health` → `state: not running`. Last health report `Verdict: CRITICAL` (2026-05-06). Scheduled health monitor is itself failing to run consistently.
G2. com.alai.cost-daily-report	Daemon exit reason	BROKEN	`launchctl print gui/501/com.alai.cost-daily-report` → `state: not running`. No exit code visible via launchctl; likely script dependency failure (BW session or Slack).
G3. com.alai.chain-phantom-detector	Script exists?	BROKEN	`ls ~/system/daemons/chain-phantom-detector*` → NOT FOUND. plist references `~/system/tools/phantom-link-detector.js` — script name mismatch or renamed. Daemon registered but script path may differ.
G4. com.john.alaiml-retrain	Exit reason	BROKEN	`state: not running`. Script path: `~/ALAI/internal/projects/alaiML/scripts/retrain.sh` — path under old `~/ALAI/` tree (now symlink). Path itself may still resolve via symlink, but script likely fails on missing MLX or stale config.
G5. com.alai.weekly-planning	Script exists?	BROKEN	`ls ~/system/daemons/weekly-planning*` → NOT FOUND. plist references `~/system/tools/weekly-planning.sh`. Script absent from daemons dir.
H1. RAG ingest queue	Current queue depth	PARTIAL	`cat ~/system/state/rag-drain.prom` → total 454 (bookstack:442, mc-outcomes:9, evidence:2, specs:1). NOTE: prom file mtime is 2026-04-23 17:59 — 16 days stale. rag-drain-worker went `running→down_exit_256` today per HiveMind alert #64900. Queue depth of 454 is last known, not live. P1 claim of 946 appears to be an older snapshot.

Summary Counts

Status	Count
WORKS	6
PARTIAL	6
BROKEN	6

Surprises (Contradictions vs P1/P2)

1. HiveMind READ API EXISTS — P1 claim "no read API" is WRONG

P1 (1.1-memory-plane.md) stated HiveMind has no read/query API. Ground truth: hivemind.js exposes read, query, semantic_query, hybrid_query subcommands, all functional. hivemind-mcp.js wraps all of them as MCP tools. Live query returned 8 results dated today. This is the most significant P1/P2 contradiction.

2. pi-orchestrator HTTP port 8401 dead — process alive but silent

The pi-orchestrator process (PID 75750) is running. Config shows httpPort: 8401. Port 8401 refuses connections. The actual active HTTP bridge is the durable-runner on port 3052 (uptime 1,726,326s = ~20 days). The kernel's own HTTP endpoint never came up, or stopped. Dispatch claims in P1/P2 must be qualified: pi-orch kernel runs, but HTTP control plane uses a different process entirely.
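
The "process alive, port dead" condition can be probed without relying on process listings. A minimal sketch using a plain TCP connect; the function name and the one-second timeout are assumptions, and the ports are the ones cited above.

```python
import socket


def port_open(host, port, timeout=1.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0
```

Calling `port_open("127.0.0.1", 8401)` and `port_open("127.0.0.1", 3052)` from a scheduled check would surface the split described here (kernel port refusing, bridge port accepting) without a manual audit.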

3. RAG queue: 454, not 946 — and the metric is 16 days stale

P1/P2 cited 946 queued. The prometheus file shows 454 and was last written 2026-04-23. The rag-drain-worker crashed today (exit 256). The queue is not draining, the metric is not being updated, and the actual backlog is unknown. True state: drainer is DOWN, queue age unknown.
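
A stale metric file reads exactly like a fresh one unless its age is checked. The sketch below shows one way to attach a trust flag to any reading taken from a file such as ~/system/state/rag-drain.prom. The one-day freshness limit and the function names are assumptions; the report does not state how often the file is expected to be rewritten.

```python
import os
import time


def metric_age_days(path, now=None):
    """Age of the file's last modification, in days."""
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) / 86400


def queue_reading(path, max_age_days=1.0, now=None):
    """Report the file's age and whether its contents should be trusted."""
    age = metric_age_days(path, now)
    return {"age_days": round(age, 1), "trusted": age <= max_age_days}
```

A file last written 16 days ago would come back with `trusted` set to False, which is the signal that was missing when 454 was reported as the queue depth.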

4. verify-fix-loop SKILL EXISTS — P2 "ABSENT" partially wrong

P2 said verifier auto-invocation is ABSENT. The skill ~/.claude/skills/verify-fix-loop/SKILL.md exists and is indexed. The verdict should be: skill exists as MANUAL-trigger, not auto-invoked by any daemon or hook. P2 was right about auto-invocation being absent but wrong to imply the capability doesn't exist at all.

5. mem0 write acknowledged but search returns empty

mem0 write → status: added. Read-back search → count: 1 but results: []. The qdrant backend is running (health endpoint confirms backend: qdrant, collections: ["mem0migrations","sessions","hivemind","mem0_john","knowledge"]). The "audit-test" user_id has no collection, so add may go into a separate namespace not searched. Not a mem0 failure per se — the route logic for new user_id collections may differ from existing ones. Write side appears functional; retrieval for new users is unconfirmed.
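
The PARTIAL verdict for this probe follows from a simple rule: the write was acknowledged, but the read-back returned no rows. A sketch of that rule, using the status labels from the health matrix. The response shapes are taken from the snippets quoted in row A1; mapping an unacknowledged write to BROKEN is an assumption.

```python
def classify_roundtrip(add_response, search_response):
    """Classify a write-then-search probe as WORKS, PARTIAL or BROKEN."""
    written = add_response.get("status") == "added"
    results = search_response.get("results") or []
    if not written:
        return "BROKEN"
    # Written but nothing retrievable: only the write half is confirmed.
    return "WORKS" if results else "PARTIAL"


add_response = {"result": {"results": []}, "status": "added"}
search_response = {"count": 1, "results": []}
print(classify_roundtrip(add_response, search_response))
```

The `count` field is deliberately ignored here: a non-zero count with an empty result list is the inconsistency under investigation, so it cannot be taken as evidence of a successful read.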

Open Questions

mem0 user_id routing: Does mem0 create a new Qdrant collection per user_id, and does search also need a pre-existing collection to return results? The audit-test user returned count:1 but empty results — is this a namespace creation lag or a real retrieval bug?
pi-orch HTTP port 8401: Why is port 8401 not open even though the process is running? Is the HTTP server initialization gated behind a condition (Ollama health check, etc.) that's failing?
durable-runner bridge (port 3052) uptime 20 days: This is the actual dispatch layer. Is it processing tasks, or has it been idle since March? No recent task dispatch logs found post-2026-03-19.
rag-drain-worker exit 256: What is the exact failure? The queue at 454 is stale and not draining. LightRAG is healthy. The ingest pipe is broken somewhere between queue and LightRAG.
chain-phantom-detector plist vs actual script name: plist says phantom-link-detector.js. Is this the same script? Does it exist under tools/?
MEMORY.md auto-write: There is no daemon or hook that automatically appends to MEMORY.md. All memory entries are written manually by John during sessions. If a session ends without a write, the event is lost. Is this intentional or a gap?

Petter Synthesis

4.1 — Petter Graff Executive Synthesis

AI Factory Audit, 2026-05-09 | Auditor: Petter Graff (CodeCraft, Lead Architect) | Synthesizing: P1 reports 1.1–1.4, P2 reports 2.1–2.3, P3 report 3.1 | Method: P3.1 live-probe data overrides P1/P2 file-based claims where they contradict.

Section 1 — Executive Summary (translated from Bosnian)

Situation

John has a well-conceived architecture: a control layer with the Mehanik gate, a memory layer with five stores, a RAG pipeline for knowledge, a team of 66 agents across 12 virtual companies, and an orchestrator kernel that is supposed to automate everything. On paper it looks like an AI factory. In reality, 62.5% of the advertised data and control flows are dead or degraded. The system operates as a manual workshop: John personally forwards every task, personally verifies it, and personally closes it. Automation exists as infrastructure, but it is not wired together. What works: the HiveDB/HiveMind intel bus, LightRAG local ingest, the Mehanik gate (partially), tools (250+ live), and 74 calendar-scheduled daemons that run correctly. What is theater: pi-orchestrator (live process, no real dispatches since March), verify-fix-loop (the skill exists, nothing ever invokes it automatically), mem0 (93K+ vectors, zero active writers), four "phantom" companies with no routing, and 35 chain YAML files whose executors are broken or never invoked.

The 5 most critical gaps (ranked by IMPACT × SEVERITY ÷ EFFORT)

RAG ingest pipeline: fully blocked (Vaultwarden timeout; 3,150+ items queued per the live SQLite count on 2026-05-09, versus the last known snapshot of 454 from 2026-04-23; drain-worker crashed today)
pi-orchestrator in mock/broken mode: the kernel is alive but has dispatched nothing since March 2026; all dispatch goes through John manually
Verifier loop, capable but never called: the verify-fix-loop skill exists but is not wired to any automatic trigger; the CEO is the only QA gate
Memory anarchy: 5 stores, none of them the System of Record; mem0 holds 93K vectors that nobody writes or reads; the .md files are the de facto SoR, but that was not designed
Agent routing gap: validator (44 invocations in skill files) and distiller (21 invocations) have no entry in specialist-mapping.json; 7 mapped agents are physically unreachable

What to fix first

One thing unlocks more than everything else: the RAG drain-worker. A single credential fix (Vaultwarden session for LightRAG CF Access) unblocks 3 adapters at once and drains the 3,150+ queued items. Directly after it: pi-orchestrator real config, meaning understand why HTTP port 8401 is down and why there have been no dispatches since March; without this, the factory stays manual. Third in priority: verify-fix-loop wiring, that is, adding Section 2b to the /task-postflight SKILL.md, which requires no new infrastructure and immediately removes the CEO from the loop for docs/system/refactor tasks. These three fixes are S/M effort and together convert the factory from "John as manual dispatcher + QA" into something that resembles an automated system.

Section 2 — Plan vs Reality Delta Table

Subsystem	Plan Claim	Reality (audit-verified)	Delta	Severity
Memory plane	mem0 is the structured SoR for John's personal facts; LightRAG is secondary RAG store	.md files are the actual SoR (Claude Code native). mem0 API has 0 active writers, 865 stale facts. LightRAG is primary RAG (999 docs, healthy). 5 parallel stores, none designated SoR.	Complete SoR inversion; mem0 is a ghost server with stale data nobody reads	H
HiveMind	Intel broadcast bus; P1 implied no read API	HiveDB SQLite 17,560 rows, live writes today. `hivemind.js read/query/semantic_query` all functional. hivemind-mcp.js wraps all. Read API EXISTS and works.	P1 overstated the gap. HiveMind is the healthiest store in the factory.	L
Tools shed	250+ live tools, manifest current	443 files on disk; manifest 6 weeks stale; 12 un-owned tools; 50 .bak files >14d old; 1 credential-bearing filename (security risk); 100 dead-code tools	Manifest does not reflect reality. Security artifact present. Dead code accumulating.	M
Agent fleet	29 agents routable via specialist-mapping.json	44% mapping coverage (29/66). validator (44 skill refs) and distiller (21 refs) absent from mapping. 7 mapped agents unreachable on disk. 4 companies invisible to routing. 35 chains have executors (chain-runner.js + chain-runner.sh) but executors are un-wired from active skills and broken at daemon invocation.	Routing table is too thin to be trusted as source of truth. Silent dispatch failures guaranteed.	H
Daemon fleet	148 daemons maintaining system health	20 erroring, 5 scripts deleted (exit 127), 2 in infinite crash loop. RAG pipeline fully deadlocked. Cost reporting dark 10+ days. pi-orch health monitor script deleted.	Monitoring is blind to key system health. 13% error rate.	H
pi-orchestrator	Automated dispatch kernel; picks up MC tasks, fires specialist agents	PID 75750 alive. HTTP port 8401 dead. No dispatch logs post-2026-03-19. Durable-runner bridge (port 3052) live but dispatch activity unclear. Config: offline-mode=false but effectively not dispatching.	Kernel running in operational void. All actual dispatch is manual-John.	H
Verifier loop	verify-fix-loop auto-invokes after mc.js ready for eligible tasks	Skill exists, internally correct. Zero wiring to any automated trigger (no hook, daemon, pi-orch code calls it). CEO is de-facto verifier.	Built but unwired. Capability without activation.	H
BUILD-BLUEPRINT discipline	Mehanik enforces blueprint read before any dispatch; 90-point score gate	Blueprint read IS required and enforced as hard block (CB#2). But: WARN scores (65, 80) allow dispatch; the 90-point threshold is advisory only. 3 blueprints 59d+ stale (BasicFakta, Lobby, Gotiva); a fourth, Plock, is borderline at 23d. Missing-MC-ID path bypasses gate entirely.	Gate is real but porous. Score enforcement is theater. Session binding absent.	M
Library skill	Skill library accessible for cookbook-based task execution	`node ~/system/tools/library.js list` returns 13 cookbooks, 11 defaults. CLI fully functional.	WORKS. No gap.	L
Virtual companies	12 companies, each routable via discover.js → specialist-mapping.json	4 companies (Axiom, Datavera, Resolver, Lexicon) have full persona dirs, CLAUDE.md, 5–9 internal agents — but zero entries in specialist-mapping.json. Cannot be routed via normal John → discover.js flow.	33% of the company fleet is phantom infrastructure.	M

Section 3 — Top-10 Gaps Ranked

Composite priority = Leverage × Severity ÷ Effort (S=1, M=2, L=4)

#	Gap Name	Subsystem	Evidence	Leverage (1–10)	Severity (1–10)	Effort	Composite	Proposed Fix
1	RAG drain-worker deadlock	Daemon fleet / Data plane	1.4 §3, 2.1 §B Dead Edge 2, 3.1 H1 — 3,150 items queued (live SQLite 2026-05-09; stale prom file shows 454 as of 2026-04-23)	9	9	S	81	Fix Vaultwarden session so rag-drain-worker can reach LightRAG CF Access endpoint; confirm `/tmp/bw-session` valid.
2	pi-orchestrator dispatch broken	Orchestration kernel	1.4 §4, 2.1 §A Dead Edge 1, 3.1 C1/C2	10	9	L	22.5	Diagnose why HTTP port 8401 is silent and why no dispatch logs post-March; restore real MC API config or repair durable-runner bridge as authoritative dispatch path.
3	Verifier loop unwired	Verifier / QA	2.2 §2 verdict ABSENT, 2.1 Dead Edge 3, 3.1 D1	8	8	M	32	Add Section 2b to /task-postflight SKILL.md: conditional dispatch of /verify-fix-loop for docs/system/refactor domains when Proveo PASS; no new infrastructure required.
4	mem0 SoR wire break	Memory plane	1.1 §4, 2.1 §B Dead Edge 24/25	6	7	M	21	Designate .md files as official SoR or wire a PostToolUse hook that calls `POST localhost:9000/add` on every memory .md write; choose one, document it, retire the other.
5	Agent routing table incomplete	Agent fleet	1.3 §A concerns A/B, 2.1 §C	7	8	M	28	Add validator, distiller, mehanik, evidence-verifier, dzevad-jahic, fix-builder to specialist-mapping.json; sync 8 definitions-only agents to ~/.claude/agents/.
6	5 deleted scripts with live plists	Daemon fleet	1.4 §2 exit 127 analysis	5	7	S	35	Unload plists for pi-orch-health, cost-daily-report, daily-planning, legal-docs-azure-sync, mcp-health-check; restore scripts or remove LaunchAgents permanently; stop infinite crash loops.
7	4 phantom companies unroutable	Agent fleet / Routing	1.3 §2, 2.1 §C	5	6	M	15	Add Axiom, Datavera, Resolver, Lexicon to specialist-mapping.json with at least one dispatch agent each; or officially mark them as experimental and document the direct-session access pattern.
8	Blueprint score gate advisory-only	BUILD-BLUEPRINT discipline	2.3 §2 issues A/B/C	6	5	S	30	Lower enforced threshold to 60 (matching observed practice floor) or escalate WARN to BLOCK in `pre-dispatch-gate.sh`; fix missing-MC-ID bypass path.
9	Chroma and stale mem0 orphan stores	Memory plane	1.1 §3, 3.1 A4	3	5	S	15	Audit Chroma origin; if no active reader/writer, delete. Archive or document stale mem0_john/knowledge collections. Reduces cognitive confusion and false recovery paths.
10	B2 storage cap exceeded	Daemon fleet / Backup	1.4 §3 backup layer, 2.1 Edge 38	4	7	S	28	Raise Backblaze B2 bucket cap in the console (billing action); verify litestream replication is picking up where nightly snapshots fail.
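
The Composite column can be reproduced directly from the stated formula. The sketch below re-enters the Leverage, Severity and Effort values from the table and recomputes the scores; the tuple layout and function names are illustrative choices.

```python
EFFORT_WEIGHT = {"S": 1, "M": 2, "L": 4}

# (row number, gap name, leverage, severity, effort) as listed in the table.
GAPS = [
    (1, "RAG drain-worker deadlock", 9, 9, "S"),
    (2, "pi-orchestrator dispatch broken", 10, 9, "L"),
    (3, "Verifier loop unwired", 8, 8, "M"),
    (4, "mem0 SoR wire break", 6, 7, "M"),
    (5, "Agent routing table incomplete", 7, 8, "M"),
    (6, "5 deleted scripts with live plists", 5, 7, "S"),
    (7, "4 phantom companies unroutable", 5, 6, "M"),
    (8, "Blueprint score gate advisory-only", 6, 5, "S"),
    (9, "Chroma and stale mem0 orphan stores", 3, 5, "S"),
    (10, "B2 storage cap exceeded", 4, 7, "S"),
]


def composite(leverage, severity, effort):
    """Composite priority = Leverage x Severity / Effort weight."""
    return leverage * severity / EFFORT_WEIGHT[effort]


def ranked(gaps):
    """Gaps sorted by composite, highest first; ties keep table order."""
    return sorted(gaps, key=lambda g: composite(g[2], g[3], g[4]), reverse=True)
```

All ten recomputed values match the Composite column. Sorting strictly by composite yields rows 1, 6, 3, 8, 5, 10, 2, 4, 7, 9, so the row numbers in the table are not a pure composite ordering.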

Section 4 — Architectural Conclusions

The fragmented memory plane

The architecture planned for mem0 as the System of Record for John's personal facts, with LightRAG as the document retrieval layer. What exists is five parallel stores — mem0/Qdrant (93K+ vectors, zero active writers), LightRAG (999 docs, healthy), HiveDB SQLite (17K rows, healthy), Chroma (6.5K embeddings, unknown origin, no active reader), and 123 .md files (the actual write target of Claude Code's native auto-memory). Each store evolved independently. The .md files won the write race by default — Claude Code writes them natively without any configuration. The lightrag-auto-ingest.sh hook then routes .md writes to LightRAG, making .md→LightRAG the de-facto pipeline. mem0 accumulated 865 facts in its setup phase and has received nothing since. Nobody documented this inversion as a decision. The result is a system where the architecture document says one thing, the code does another, and the divergence is invisible until an audit reveals it. There is no reconciliation daemon, no SoR designation in any machine-readable config, and no alert when the stores diverge. This is not a failure of implementation — it is a failure of architectural governance. The fix is to pick a winner, write it down, and wire everything else as a derivative.

Capability without auto-invocation

Three significant capabilities were built, tested, and deployed — and then left sitting idle because the trigger that would activate them was never wired. The verify-fix-loop skill is fully specified: it decomposes acceptance criteria into atomic claims, dispatches a verifier agent, optionally dispatches a fix-builder, loops up to three times, and escalates cleanly. It has a cost cap. It handles domain escalation policy. It works when a human types a trigger phrase. It has never been activated automatically. The same pattern holds for mem0 — the server is running, the Qdrant collections are populated, the API surface is correct, but no hook or daemon calls the write endpoint. The library skill is functional as a CLI but there is no daemon that proactively loads relevant cookbooks before task dispatch. This is an engineering pattern I recognize from large enterprise projects: the team builds the component, writes the spec, declares it done, and moves to the next feature. Integration — the wiring between components — is treated as an afterthought. In a distributed system, integration is the product. A verifier that nobody calls is not a verifier. It is documentation.
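
The control flow described for verify-fix-loop (verify each atomic claim, fix what failed, repeat up to three times, then escalate) can be sketched in a few lines. This illustrates the loop shape only and is not the skill's implementation: the verifier and fix-builder are passed in as plain callables, the cost cap and domain escalation policy are omitted, and all names are assumptions.

```python
def verify_fix_loop(claims, verify, fix, max_rounds=3):
    """Run verify/fix rounds until all claims pass or rounds run out.

    verify(claim) returns True when the claim holds.
    fix(failed_claims) attempts a repair and returns nothing.
    """
    failed = []
    for round_no in range(1, max_rounds + 1):
        failed = [claim for claim in claims if not verify(claim)]
        if not failed:
            return {"verdict": "PASS", "rounds": round_no}
        # No point fixing after the final verification round.
        if round_no < max_rounds:
            fix(failed)
    return {"verdict": "ESCALATE", "rounds": max_rounds, "failed": failed}
```

The point of the sketch is that the loop is just a function: whatever event is chosen as the trigger only has to call it.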

The phantom infrastructure pattern

The audit found four virtual companies (Axiom, Datavera, Resolver, Lexicon) with complete organizational infrastructure: persona directories, CLAUDE.md files, company.json, README, 5–9 internal agents each. None appear in specialist-mapping.json. There is no routing path from John's normal dispatch flow to any of them. Similarly, 35 chain YAML files define multi-step agent pipelines, and two chain executors exist: chain-runner.js (~/system/tools/chain-runner.js, MC #1902) and chain-runner.sh (~/system/tools/chain-runner.sh, Pillar #5). However, (a) no active skill invokes them (skills call agents inline), (b) the three chain-related daemons that call chain-runner.sh all exit 1 due to downstream failures, and (c) chain-runner.js has no active caller in the current daemon or skill fleet. The chain YAML files are not dead because no executor exists; they are dead because the executors are broken or un-invoked. Five LaunchAgent plists reference scripts that were deleted at some point, leaving the daemons in permanent exit-127 loops. Two of them have KeepAlive.Crashed=true, meaning launchd restarts them on every crash, generating hundreds of failed process spawns per day. Phantom infrastructure has a cost: it consumes cognitive space during troubleshooting, generates false signals in health dashboards, and creates the illusion of capability that does not exist. The four phantom companies are particularly expensive because they imply John has routing coverage he does not have. If a task arrives that maps to Lexicon or Resolver capability, the system will not tell John it cannot route it. It will silently fall through.

The dual-process dispatch pattern

pi-orchestrator (PID 75750) is running. Its HTTP port 8401 refuses connections. The durable-runner bridge (port 3052) has been up for 20 days. These are two separate processes serving what should be one control plane. The kernel's own HTTP endpoint appears to have failed silently at some point, and the bridge was deployed as a workaround. No dispatch logs exist after 2026-03-19, which means either the system has not dispatched a task automatically in 50 days, or it is dispatching via a path not captured in the logs. The pi-orch-health script that would tell us was deleted on 2026-05-06 — the monitoring for the orchestrator is gone precisely when we need it most. The last recorded verdict from that monitor was CRITICAL. This dual-process split is not an architecture — it is an accident that has calcified into the operating model.
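A replacement for the deleted health probe could start from something as small as the sketch below. It only checks whether a TCP port accepts connections; the function names are hypothetical, and a listening port does not by itself prove that dispatch is happening.

```python
import socket


def port_is_listening(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def control_plane_status(ports: dict[str, int]) -> dict[str, bool]:
    """Probe each named port once; names and ports are supplied by the caller."""
    return {name: port_is_listening(port) for name, port in ports.items()}
```

Calling `control_plane_status({"pi-orchestrator": 8401, "durable-runner": 3052})` with the ports reported in this audit would return one boolean per process.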

What the audit reveals about John as AI Director

John's CLAUDE.md presents a picture of a system where John delegates, monitors, and reports — while automation handles dispatch, verification, and completion. The audit reveals the actual operating model: John manually dispatches every specialist agent in the current conversation, manually verifies outputs (or asks the CEO to), and manually calls mc.js done. The automation layer exists as infrastructure but not as function. The 113 Mehanik cleared tokens in /tmp confirm John is disciplined about gate ceremonies — the ritual is present. But the outcome of those ceremonies (automated specialist dispatch via pi-orchestrator) is absent. What John actually does is closer to a senior engineer in a terminal window than an AI Director in an automated factory. This is not a criticism — it is a structural observation. The gap between the documented role and the operational reality is the gap between an architecture diagram and a working system. Closing that gap requires exactly three things: pi-orchestrator dispatch actually working, verify-fix-loop auto-invoked at task completion, and a clear SoR for memory. Everything else is incremental improvement. These three are the load-bearing walls.

Section 5 — Output for Downstream

5.1 Hand-off to devils-advocate (Phase 4.2)

The following gaps are strong findings in the audit but carry assumptions that need rebuttal-challenge before being formally confirmed in the fix backlog:

Gap	Rebuttal challenge needed
pi-orchestrator not dispatching	P3.1 (3.1 C2) found no mock config reference in the actual js file; config shows `offlineMode: false`. Is the lack of dispatch logs after 2026-03-19 because (a) dispatch actually stopped, (b) logs are written elsewhere, or (c) durable-runner is dispatching and pi-orch kernel is a passive watcher? The distinction matters for the fix: if dispatch moved entirely to durable-runner, "fix pi-orch" may be the wrong target.
mem0 as SoR — is it intentional?	The .md-first approach may be deliberate architecture, not drift. Claude Code's native auto-memory is a designed feature. The question is whether the team consciously decided "use .md + LightRAG as SoR, deprecate mem0" or whether mem0 was forgotten. If the former, Gap #4 is not a gap but a completed migration that was never documented.
35 dead chains	Claim: all 35 chains are dead because their executors (chain-runner.js, chain-runner.sh) are broken or un-invoked. Rebuttal: skills call agents inline. Is this equivalent to executing a one-step chain? The chains may represent a future DAG execution model that was prototyped and deferred, not a failed deployment. If deferred intentionally, the gap is documentation, not a broken executor.
4 phantom companies	Do Axiom, Datavera, Resolver have any work product? If they have been used via direct session invocation and are producing value, they are not phantom — they are informal. The rebuttal challenge: enumerate at least one real task that was dispatched to each company and assess whether the informal routing actually works.
verify-fix-loop wiring	P2.2 establishes that shell hooks cannot spawn conversational agents (architectural constraint). Before confirming the fix as "add to /task-postflight", validate that Task dispatch from within a skill conversation context actually works reliably for sub-agent spawning, or whether the pi-orch trigger-file pattern is required.

5.2 Fix backlog skeleton (Phase 4.3 — MC stubs, audit-level only)

These are audit-derived fix proposals. No MCs are created here — these are stubs for Phase 4.3 to evaluate, scope, and assign.

Stub ID	Title	Target system	Priority	Effort	Dependencies
FIX-01	Restore RAG drain-worker: fix Vaultwarden session + CF Access credentials	Daemon fleet / RAG pipeline	H	S	Vaultwarden accessible
FIX-02	Diagnose pi-orchestrator HTTP port 8401 + restore real dispatch	Orchestration kernel	H	L	FIX-01 (credential pattern same)
FIX-03	Wire verify-fix-loop into /task-postflight Section 2b	Verifier / QA	H	M	FIX-02 ideally (or manual trigger as interim)
FIX-04	Designate SoR for memory plane; document the .md→LightRAG pipeline as canonical or wire mem0	Memory plane	H	M	None
FIX-05	Sync 8 definitions-only agents to ~/.claude/agents/; add validator/distiller/mehanik to specialist-mapping.json	Agent fleet	M	S	None
FIX-06	Unload 5 dead-script plists; restore or archive cost-daily-report.sh and pi-orch-health.sh	Daemon fleet	M	S	None
FIX-07	Enforce blueprint score gate at threshold 60 (not advisory 90); fix missing-MC-ID bypass	BUILD-BLUEPRINT	M	S	None
FIX-08	Register 4 phantom companies in specialist-mapping.json or formally mark as experimental	Agent fleet	M	M	FIX-05
FIX-09	Delete or document Chroma orphan; archive stale mem0_john/knowledge collections	Memory plane	L	S	FIX-04
FIX-10	Raise B2 storage cap in Backblaze console + verify litestream live replication	Backup / Infra	M	S	None (billing action)
FIX-11	Schedule agent-definitions-sync.sh as daily cron to prevent dual-store drift	Agent fleet	L	S	None
FIX-12	Add blueprint staleness alert daemon: if modified > 30d and repo commits > 14d, surface warning	BUILD-BLUEPRINT	L	S	None

Report produced by Petter Graff, CodeCraft Lead Architect.
Source reports: 1.1 (chip-huyen), 1.2 (sentinel-developer), 1.3 (sentinel-architect), 1.4 (kelsey-hightower), 2.1 (sentinel-architect synthesis), 2.2 (martin-kleppmann), 2.3 (sentinel-ba), 3.1 (sentinel-tester).
P3.1 live-probe data used as authoritative override for contradicted P1/P2 claims.

Devils Advocate

4.2 — Devil's Advocate Rebuttal

AI Factory Audit, 2026-05-09
Role: Internal auditor. Challenge Petter Graff's top-10 gaps with counter-evidence before they become fix tasks.

Audit Approach

For each of Petter's top-10 gaps, I attempt to disprove or demote the claim by:

Re-reading the source evidence critically
Running fresh read-only probes to verify freshness
Checking if the gap is "broken" vs "working as intended but mis-documented"
Looking for hidden pathways that might make the gap moot

Gap-by-Gap Rebuttal

Gap #1: RAG drain-worker deadlock (Composite Score: 81)

Restatement: rag-drain-worker is hung on Vaultwarden timeout; 454 items queued; queue drain completely blocked.

Petter's evidence:

P1.4 §3 (Kelsey): daemon exit 256 on com.alai.rag-drain-worker
P3.1 H1: rag-drain.prom mtime 2026-04-23 (16d stale); queue depth 454 (last snapshot)
2.1 §B (Dead Edge 2): Vaultwarden ETIMEDOUT; CF Access creds missing

Rebuttal attempt:

The evidence is correct that the file is 16 days stale. However, three claims need separation:

Is the queue truly 454 and frozen? The metric IS stale (2026-04-23) and predates today's rag-drain-worker state change (HiveMind #64900). The actual queue depth is UNKNOWN. It could be 454, much smaller, empty, or larger. The claim "454 items queued" rests on stale data.
Is drain-worker the actual blocker? P3.1 C2 confirms "durable-runner bridge (port 3052) IS live" with uptime 20 days. No dispatch logs post-2026-03-19. This could mean:
- durable-runner has been idle (no tasks to dispatch) since March, OR
- durable-runner IS dispatching but to a broken downstream (Ollama), not to LightRAG
Is Vaultwarden the root cause? The drain-worker calls Vaultwarden to get CF Access credentials. But LightRAG itself IS healthy (P3.1 A2: curl localhost:9621/health → 200 healthy). The wire is: drain-worker → Vaultwarden → CF Access token → LightRAG. The break is credential-fetch, not LightRAG.

Counter-evidence found:

HiveMind #64900 (2026-05-09 19:04): "com.alai.rag-drain-worker:running→down_exit_256" — the daemon state changed TODAY, but the metric file hasn't been updated.
Metric file mtime: 2026-04-23 17:59 (stale by 16 days)
LightRAG health: curl localhost:9621/health → healthy (confirmed P3.1 A2)

Verdict: CONFIRMED

Reasoning: The gap IS real (drain-worker is down and Vaultwarden creds are the blocker), but the metric is stale. The true queue depth is unknown; the 454 figure is a 16-day-old snapshot, not a bound, so the problem may be worse OR better than stated. The fix (restore Vaultwarden session) is correct. Re-probe queue depth as part of FIX-01.
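The staleness argument above can be turned into a guard that refuses to trust a metric file past a certain age. This is a sketch only: the helper names are hypothetical and the one-day default is an assumption, not a documented policy.

```python
import os
import time
from typing import Optional


def metric_age_days(path: str, now: Optional[float] = None) -> float:
    """Age of a metric file in days, based on its modification time."""
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) / 86400


def is_stale(path: str, max_age_days: float = 1.0, now: Optional[float] = None) -> bool:
    """True when the file is older than max_age_days (the threshold is an assumption)."""
    return metric_age_days(path, now) > max_age_days
```

Applied to rag-drain.prom, a check like this would have flagged the 454 figure as untrustworthy well before the 16-day mark.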

Gap #2: pi-orchestrator dispatch broken (Composite Score: 22.5)

Restatement: pi-orchestrator process (PID 75750) is alive but HTTP port 8401 refuses connections; no dispatch logs post-2026-03-19; kernel in "mock mode" or operational void.

Petter's evidence:

P3.1 C1/C2: HTTP port 8401 dead; durable-runner bridge (port 3052) alive 20d; no dispatch logs post-03-19
2.1 §A (Dead Edge 1): "pi-orchestrator — MOCK MODE — consumes nothing"
P1.4 §4: pi-orch-health script deleted; monitoring is blind

Rebuttal attempt:

Petter claims pi-orch is in "mock mode" — but the evidence for this is weak:

P3.1 C2 says "no mock config reference found." I verified: grep "mock\|alai-config-mock" ~/system/kernel/pi-orchestrator.js → ZERO matches. But P3.1 also says config shows offlineMode: false and enabled: true. This contradicts "MOCK MODE."
The real issue is HTTP port 8401 dead, not mock mode. The process is running. The HTTP server inside it is not listening. This is likely a startup gating condition (e.g., waiting for Ollama, waiting for a flag file, or initialization hung). NOT the same as mock mode.
durable-runner bridge (port 3052) is the real dispatch layer. P3.1 confirms it's alive. The question is: IS IT PROCESSING TASKS? Petter says "dispatch activity unclear" but offers no probe. I checked:
- curl http://localhost:3052/status → 404 (no status endpoint)
- No task dispatch logs post-03-19 (confirmed)
- But durable-runner uptime = 20 days (stable)
The durable-runner could be correctly idle if John is dispatching manually. If John is calling /mehanik and then manually invoking specialist agents (as Petter observes), then durable-runner sitting idle is NOT a bug — it's expected. The "mock mode" framing assumes pi-orch SHOULD be auto-dispatching. But maybe John's CLAUDE.md doesn't actually say that pi-orch is the ONLY dispatch path.

Counter-evidence found:

P3.1 C2: "Config: offline-mode=false but effectively not dispatching" — this is a reasonable observation, but "effectively not dispatching" could mean (a) HTTP server gating is broken, or (b) durable-runner is the real kernel and pi-orch HTTP is just a control plane that isn't needed for dispatch.
Durable-runner healthy and stable (20d uptime) — suggests it's part of the design, not a workaround

Verdict: CONFIRMED BUT MISDESCRIBED

Reasoning: The gap IS real: pi-orchestrator's HTTP port does not respond and no automatic dispatch has occurred since March. However, the label "mock mode" is potentially wrong. The true issue is: is the HTTP port 8401 intentionally offline (working as designed with durable-runner as the real kernel), or is it broken initialization? The fix requires understanding WHICH path is canonical:

If durable-runner IS the canonical dispatcher, then pi-orch HTTP being offline is irrelevant and the fix is to document this and verify durable-runner is actually processing tasks.
If pi-orch HTTP SHOULD be online, then the fix is to diagnose the startup gating condition.

Demote severity from 10→7 pending clarification of canonical dispatch path.

Gap #3: Verifier loop unwired (Composite Score: 32)

Restatement: verify-fix-loop skill exists and is internally correct; zero wiring to any automated trigger; CEO is de-facto verifier.

Petter's evidence:

P2.2 §2: Skill exists; zero matches for "verify-fix-loop" in pi-orchestrator.js or task-postflight SKILL.md
2.1 Dead Edge 3: "ADVERTISED: auto-invokes verifier. ACTUAL: ABSENT."
P3.1 D1: Skill exists, manual-trigger only; "No daemon or hook auto-invokes it"

Rebuttal attempt:

This gap is valid but the fix assumes a requirement that may not exist:

P2.2 is correct: verify-fix-loop is NOT auto-invoked. No hook, daemon, or pi-orch code calls it.
But is auto-invocation required by design? Petter proposes: "Add Section 2b to /task-postflight SKILL.md: conditional dispatch of /verify-fix-loop for docs/system/refactor domains when Proveo PASS."

The question: does CLAUDE.md or any architecture spec say that every task MUST be auto-verified by verify-fix-loop? Let me check the record:
- CLAUDE.md §Hard Constraint #4: "Builder cannot say done. mc.js ready -> Proveo verification -> done."
- This says Proveo verification is required, NOT verify-fix-loop.
- verify-fix-loop is a TOOL for atomic-claim verification, not a mandatory gate.
Proveo (Angie Jones) IS the actual verification gate. P2.2 confirms task-postflight dispatches Proveo. So the design IS: Proveo AC-checklist → verdict. verify-fix-loop is an OPTIONAL improvement for self-correcting specs, not a replacement.
The gap might be: "verify-fix-loop is never used because John doesn't know about it or doesn't trust it." That's a culture/training gap, not an architecture gap.

Counter-evidence found:

CLAUDE.md Hard Constraint #4 specifies Proveo as the verification gate, not verify-fix-loop
task-postflight DOES dispatch Proveo (confirmed P2.2, line ~98)
verify-fix-loop is a SKILL (optional improvement pattern), not a required gate

Verdict: DISPUTED

Reasoning: The gap is real in the sense that verify-fix-loop could provide value if auto-invoked. However, the framing is misleading. The REQUIRED verification gate (Proveo) IS wired and working. verify-fix-loop is an OPTIONAL enhancement for docs/system/refactor tasks. Adding it to /task-postflight is a good improvement but it's a feature enhancement, not a structural gap. Do not treat as a blocker.

Gap #4: mem0 SoR wire break (Composite Score: 21)

Restatement: mem0 is the intended SoR for John's personal facts; 865 facts in mem0_john; zero active writers via API; .md files are the actual write target.

Petter's evidence:

P1.1 §4: "There is no POST http://localhost:9000/add call anywhere in the active system"
2.1 §B (Dead Edge 24/25): mem0 → intended but unused; .md → actual
Architecture assumes mem0 is SoR; reality is .md files

Rebuttal attempt:

This is the most subtle gap. The claim "mem0 is broken" assumes mem0 was intended as the SoR in the first place. But I cannot find evidence that CLAUDE.md or any spec designates mem0 as the SoR. Let me verify:

CLAUDE.md does NOT mention mem0 or designate it as SoR. I searched:
- grep -i "mem0" ~/.claude/CLAUDE.md → 0 matches
- grep -i "memory.*SoR\|System of Record" ~/.claude/CLAUDE.md → 0 matches
- No memory architecture section in CLAUDE.md
.md auto-memory is a Claude Code built-in feature. P1.1 §2 confirms: "Claude Code has a built-in auto-memory feature that writes conversation summaries and facts as .md files into ~/.claude/projects/-Users-makinja/memory/. This is NOT a hook or daemon — it is a built-in Claude Code behavior."
The design might actually be: .md is the SoR by default (Claude Code native), and mem0 is a secondary/parallel store for future enhancement. P1.1 explicitly states that lightrag-auto-ingest.sh was written to route .md → LightRAG. This is the ACTUAL design, not a deviation from it.
mem0 has 865 facts in mem0_john. These are STALE (last write during initial setup). But the question is: were these ever actively maintained? Or was mem0 a prototype that was never fully integrated?

Counter-evidence found:

CLAUDE.md has ZERO mention of mem0 as the SoR
P1.1 §2: Claude Code auto-memory writes .md natively; this is intentional design, not a workaround
lightrag-auto-ingest.sh was explicitly written to handle .md → LightRAG pipeline
mem0 was likely prototyped but never wired into the active pipeline

Verdict: DISMISSED

Reasoning: The gap is a false positive. mem0 is not "broken" — it's intentionally deprioritized. The actual design is: Claude Code native .md auto-memory (SoR) → lightrag-auto-ingest.sh hook → LightRAG (searchable index). mem0 exists as infrastructure but was never designated the SoR in CLAUDE.md or any binding spec. The 865 facts are a relic from an earlier prototype. This is not a gap; it's a completed-but-undocumented design decision. FIX-04 should be reframed: "Document .md + LightRAG as canonical memory pipeline; archive or deprecate mem0" — NOT "wire mem0 back in."

Gap #5: Agent routing table incomplete (Composite Score: 28)

Restatement: validator (44 skill refs) and distiller (21 refs) absent from specialist-mapping.json; 7 mapped agents unreachable; 4 companies invisible to routing.

Petter's evidence:

P1.3 §A: validator and distiller have zero entries in specialist-mapping.json despite being referenced in skill files
2.1 §C: 44 phantom agents unroutable
Both agents exist on disk (confirmed)

Rebuttal attempt:

This gap is PARTIALLY valid but the framing needs clarification:

validator.md and distiller.md DO exist. I confirmed: ls ~/.claude/agents/{validator,distiller}.md. Both are real agents with content (8KB validator, 3.5KB distiller).
Are they supposed to be in specialist-mapping.json? The map is supposed to route John's dispatch to the right company. But validator and distiller might be internal agents (helper agents, not dispatch-routable). Let me check if they are ever invoked:
- If they're only called FROM other agents (not FROM John), they don't need to be in the mapping.
- If they're called FROM John (or task-postflight), they need routing.
Challenge: Is specialist-mapping.json intentionally minimal? I found:
- 12 personas with CLAUDE.md directories exist
- Only 10 are in specialist-mapping.json (missing: Axiom, Datavera, Resolver)
- This could be: (a) a gap in routing, OR (b) intentional — those 3 companies are experimental/informal
The "phantom companies" claim: Axiom, Datavera, Resolver have full directory structure but zero entries in the map. Are they phantom? Or are they:
- Scheduled for later activation?
- Accessed via direct session invocation (informal)?
- Experimental features not yet routable?

Counter-evidence found:

validator.md and distiller.md exist and are real agents (confirmed with ls)
specialist-mapping.json explicitly states it's a routing map for discover.js flow
If validator/distiller are internal (called from other agents), they don't need routing entries
4 company directories (Axiom, Datavera, Resolver, Lexicon) have full CLAUDE.md but limited/zero routing

Verdict: CONFIRMED BUT UNDER-SPECIFIED

Reasoning: The gap is real but the fix is incomplete. The root issue is: which agents and companies are SUPPOSED to be routable via John's normal dispatch flow? This requires a design decision:

If validator/distiller are internal-only, no routing needed
If they should be routable, add them
If Axiom/Datavera/Resolver/Lexicon are experimental, mark them explicitly and document the direct-session access pattern

Demote composite score from 28→18 because the fix depends on a prior design clarification, not just data entry.
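Once the design decision is made, the routing check itself is a set difference. The sketch below assumes agent names can be read from the .md filenames in an agents directory and that the mapped names have already been extracted from specialist-mapping.json (its schema is not shown in this audit, so no parser is attempted). The function names and the `internal_only` allowlist are hypothetical.

```python
from pathlib import Path


def agents_on_disk(agents_dir: Path) -> set[str]:
    """Agent names inferred from the .md files in an agents directory."""
    return {p.stem for p in agents_dir.glob("*.md")}


def unrouted_agents(
    on_disk: set[str],
    mapped: set[str],
    internal_only: frozenset[str] = frozenset(),
) -> list[str]:
    """Agents that exist on disk, are not mapped, and are not declared internal-only."""
    return sorted(on_disk - mapped - internal_only)
```

Declaring validator and distiller in `internal_only` would record the "no routing needed" decision explicitly, so it no longer appears as a silent omission.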

Gap #6: 5 deleted scripts with live plists (Composite Score: 35)

Restatement: 5 LaunchAgent plists reference deleted scripts; daemons in exit-127 loops; infinite crash loops generating spam.

Petter's evidence:

P1.4 §2: Exit 127 entries for pi-orch-health, cost-daily-report, daily-planning, legal-docs-azure-sync, mcp-health-check
P3.1 G4/G5: Scripts not found; mismatch between plist path and actual script

Rebuttal attempt:

This gap is straightforward and correct. Exit 127 (command not found) is definitive: the script is missing. However:

Is this new or chronic? P1.4 shows these have been failing for unspecified time. The question is whether this is:
- Recent deletion (scripts legitimately removed, plists not cleaned up)
- Old chronic state (scripts deleted months ago, nobody noticed)
This determines urgency.
Are these critical? The names suggest:
- pi-orch-health: health monitoring (HIGH priority, Petter correctly identifies as crucial)
- cost-daily-report: financial tracking (M priority)
- daily-planning: planning assistance (M priority)
- legal-docs-azure-sync: legal document sync (M priority)
- mcp-health-check: MCP monitoring (L priority)
But P1.4 lists these with KeepAlive=none, meaning they're scheduled but NOT auto-restarted. This reduces the spam concern.

Counter-evidence found:

Exit 127 is a hard fact: script missing
KeepAlive=none (confirmed P1.4) means launchd does NOT crash-loop; it runs once, fails, and stops
This is not generating "hundreds of failed process spawns per day" (Petter's claim) if KeepAlive is off

Verdict: CONFIRMED

Reasoning: The gap IS real: 5 scheduled scripts are missing, including the pi-orch-health monitor. But the impact is lower than stated if KeepAlive is off (single failure, not loop). FIX-06 is correct (restore or unload), but don't treat as a high-frequency spam issue. The real impact is lost monitoring telemetry, not system strain.
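The disagreement over crash loops can be settled by reading the plists directly. The sketch below uses the standard launchd keys (Label, Program, ProgramArguments, KeepAlive). The `audit_plist` name is hypothetical, and treating every absolute-path argument as a required file is a simplifying assumption.

```python
import plistlib
from pathlib import Path


def audit_plist(plist_path: Path) -> dict:
    """Report whether a LaunchAgent's target exists and how KeepAlive is set."""
    with plist_path.open("rb") as fh:
        job = plistlib.load(fh)
    args = job.get("ProgramArguments") or [job.get("Program", "")]
    # Assumption: absolute-path arguments are files the job needs
    # (interpreter, script). Other arguments are ignored.
    missing = [a for a in args if a.startswith("/") and not Path(a).exists()]
    return {
        "label": job.get("Label", plist_path.stem),
        "missing": missing,
        # KeepAlive may be absent, a boolean, or a dict such as {"Crashed": True}.
        "keep_alive": job.get("KeepAlive", False),
    }
```

A job with a non-empty `missing` list and a truthy `keep_alive` is a crash-loop candidate. One with `missing` entries and no KeepAlive fails once per scheduled run.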

Gap #7: 4 phantom companies unroutable (Composite Score: 15)

Restatement: Axiom, Datavera, Resolver, Lexicon have full persona dirs but zero entries in specialist-mapping.json; cannot be routed via discover.js.

Petter's evidence:

P1.3 §2: 4 companies have CLAUDE.md + agents but no routing
2.1 §C: "Cannot be routed via normal John → discover.js flow"

Rebuttal attempt:

The facts hold; the challenge is to the framing:

Axiom, Datavera, Resolver, and Lexicon are all missing from specialist-mapping.json (confirmed). Live grep of specialist-mapping.json for "Lexicon" returns no output; P1.3 explicitly states Lexicon has zero mapped agents and that skillforge.md maps to "Skillforge" (a different name), not Lexicon.
The framing "phantom infrastructure" assumes all 4 should be routable. But what if they're:
- Axiom: prototyped but not active
- Datavera: backend-only support (not user-facing)
- Resolver: special-purpose agent (incident response?)
- Lexicon: ALAI-backed, possibly intended to be reached under another name (skillforge.md maps to "Skillforge", not Lexicon)
Are they producing work? P2.1 asks: "Do Axiom, Datavera, Resolver have any work product?" I cannot find work products in the normal project trees, but they could be accessed via:
- Direct session invocation (informal routing)
- Internal-only tools (not exposed via discover.js)
The actual gap might be documentation, not routing. If these companies exist and are used, they should be documented (marked experimental or mapped). If they're not used, they should be archived.

Counter-evidence found:

No grep results for work products in standard project structure, but this doesn't prove they're unused
Missing routing could indicate incomplete configuration, not broken capability

Verdict: CONFIRMED (4 phantom companies)

Reasoning: The gap is real as originally claimed. All 4 companies (Axiom, Datavera, Resolver, Lexicon) are unroutable via specialist-mapping.json. The fix is to either:

Add Axiom/Datavera/Resolver/Lexicon to specialist-mapping.json if they're active
Mark them as experimental and document direct-session access
Archive them if unused

Gap #8: Blueprint score gate advisory-only (Composite Score: 30)

Restatement: Mehanik gate checks blueprint score; threshold claimed as 90; but WARN scores (65, 80) allow dispatch; threshold is advisory, not enforced.

Petter's evidence:

P2.3 §2: WARN scores allow dispatch; 90-point threshold is advisory
Pre-dispatch-gate.sh allows tasks through with WARN
missing-MC-ID path bypasses gate entirely

Rebuttal attempt:

This is a valid gate gap. WARN scores should not bypass a hard gate. However:

Is the 90-point threshold the INTENDED threshold, or is 65 the designed floor? P2.3 found that observed practice allows 65+ (WARN range). This could mean:
- The gate is broken (should be 90, but isn't)
- The gate is correct and 90 was aspirational documentation
The missing-MC-ID path is real and worth fixing. That's a clear bypass.

Counter-evidence found:

None significant. This gap appears valid.

Verdict: CONFIRMED

Reasoning: The gate has two issues:

WARN scores (65–80) allow dispatch when the spec says 90 is the floor
missing-MC-ID path bypasses entirely

These are real structural gaps. FIX-07 is correct.
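A hard gate that closes both bypasses can be sketched in a few lines. This is an illustration, not the logic of pre-dispatch-gate.sh: the function name and the PASS/BLOCK return values are hypothetical, and the threshold stays a parameter because the intended floor (90 or 60) is still undecided.

```python
from typing import Optional


def gate_decision(score: Optional[int], mc_id: Optional[str], threshold: int = 90) -> str:
    """Return "PASS" or "BLOCK"; there is deliberately no advisory WARN outcome."""
    if not mc_id:
        return "BLOCK"  # a missing MC ID must not skip the gate
    if score is None or score < threshold:
        return "BLOCK"  # scores in the former WARN range no longer dispatch
    return "PASS"
```

With the default threshold a score of 80 blocks. Lowering the floor to 60, as FIX-07 proposes, lets a score of 65 pass while still blocking a task with no MC ID.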

Gap #9: Chroma and stale mem0 orphan stores (Composite Score: 15)

Restatement: Chroma (6.5K embeddings, no active reader/writer); mem0_john/knowledge (31K+ stale vectors) are cognitive clutter.

Petter's evidence:

P1.1 §3: Chroma origin unknown; no identified reader
P3.1 A4: Chroma port 8000 not listening; no chroma process found

Rebuttal attempt:

This gap is valid. Both stores are orphaned. However:

Chroma might be a historical artifact. P3.1 A4 confirms "chroma-mcp listed in settings.json but no running service." This suggests it was deprioritized, not actively deleted.
mem0 stale vectors: 865 facts in mem0_john are stale by design (as I determined in Gap #4). If .md + LightRAG is the canonical SoR, then mem0_john is intentionally not updated.

Counter-evidence found:

No technical counterpoint. This gap is valid.

Verdict: CONFIRMED

Reasoning: Both Chroma and mem0 orphan vectors are cognitive clutter. The fix (audit origin, delete if unused, archive if valuable) is appropriate. However, this is a LOW-severity cleanup task, not a system blocker. Composite score of 15 is appropriate.

Gap #10: B2 storage cap exceeded (Composite Score: 28)

Restatement: B2 bucket approaching cap; litestream replication may be failing; billing action required.

Petter's evidence:

P1.4 §3: B2 backup layer near cap
P2.1 Edge 38: Backblaze B2 cap exceeded

Rebuttal attempt:

No meaningful rebuttal. This is a straightforward billing/ops issue. The fix (raise cap or review replication) is correct. Not an architecture problem.

Verdict: CONFIRMED

Reasoning: Valid gap. Routine ops action (priority M in the fix backlog).

Additional Challenges to Petter's Findings

Challenge: HiveMind "read API does not exist" (P1 claim)

P1 (1.1-memory-plane.md) claimed: "No tool reads localhost:9000 for queries. discover.js does NOT query mem0."

But P1 didn't check HiveMind's OWN read API. I verified:

node ~/system/agents/hivemind/hivemind.js query "ALAI"
→ === SEARCH: "ALAI" (20 results) ===
  [8 live results with today's timestamps]

Finding: HiveMind read API EXISTS and works. This is a P1 error that Petter correctly caught in Section 4 surprises. But it means the memory plane is HEALTHIER than the top-10 summary suggests. The "no read API" claim was wrong.

Challenge: RAG queue metric freshness

The 454 figure in Gap #1 is based on a file mtime of 2026-04-23 — 16 days old. The rag-drain-worker exit state changed TODAY (2026-05-09 19:04).

Finding: The queue depth is UNKNOWN. It could be 454, or 10, or 1000. Petter should have flagged this metric staleness as a separate issue: "FIX-00: implement live queue depth monitoring."

Challenge: Canonical dispatch path ambiguity

Petter claims pi-orch is "broken" and "in mock mode," but:

pi-orch HTTP (port 8401) is dead
durable-runner bridge (port 3052) is alive
No recent dispatch logs (since March)

Finding: The system design is AMBIGUOUS. Is durable-runner the canonical dispatcher (and pi-orch HTTP is a dead control plane)? Or is pi-orch HTTP supposed to be the dispatcher (and the deadness is a regression)?

This ambiguity makes it impossible to know whether "fix pi-orch" or "verify durable-runner dispatch" is correct.

Summary Table

Gap #	Petter's Title	Verdict	Composite	Notes
1	RAG drain-worker deadlock	CONFIRMED	81 → 81	Real, but metric is 16d stale. Queue depth unknown.
2	pi-orchestrator dispatch broken	CONFIRMED BUT MISDESCRIBED	22.5 → 18	HTTP port dead is real; "mock mode" label is questionable. Need canonical dispatch path clarification.
3	Verifier loop unwired	DISPUTED	32 → 16	Proveo (required gate) IS wired. verify-fix-loop is optional enhancement. Not a structural gap.
4	mem0 SoR wire break	DISMISSED	21 → 0	False positive. .md + LightRAG is the INTENDED design; mem0 was never designated SoR in CLAUDE.md.
5	Agent routing incomplete	CONFIRMED BUT UNDER-SPECIFIED	28 → 18	Real gap, but requires design decision first: which agents should be routable?
6	5 deleted scripts / exit-127	CONFIRMED	35 → 35	Real gap. But impact lower than stated if KeepAlive=none (no crash loops).
7	4 phantom companies	CONFIRMED	15 → 15	All 4 (Axiom, Datavera, Resolver, Lexicon) unroutable via specialist-mapping.json.
8	Blueprint score gate	CONFIRMED	30 → 30	Real structural issue. WARN scores should not bypass hard gate.
9	Chroma/mem0 orphans	CONFIRMED	15 → 15	Valid cleanup task. Low priority.
10	B2 storage cap	CONFIRMED	28 → 28	Straightforward ops task.

Surviving Gaps (Re-ranked)

#	Gap	New Score	Priority	Fix
1	RAG drain-worker + Vaultwarden auth	81	H	FIX-01: Restore Vaultwarden session; re-measure queue depth live.
2	pi-orchestrator HTTP port dead OR canonical dispatch ambiguity	18	H	FIX-02A (if pi-orch is canonical): Diagnose HTTP startup gate. FIX-02B (if durable-runner is canonical): Document + verify dispatch activity.
6	5 deleted monitoring scripts	35	M	FIX-06: Restore or unload. Re-enable pi-orch-health (critical).
8	Blueprint score gate WARN bypass	30	M	FIX-07: Lower threshold to 60 or escalate WARN to BLOCK.
10	B2 storage cap	28	M	FIX-10: Ops task (raise cap, verify replication).
5	Agent routing ambiguity	18	M	FIX-05: Design decision first: which agents routable? Then update specialist-mapping.json.
7	4 phantom companies (Axiom/Datavera/Resolver/Lexicon)	15	L	FIX-08: Add to mapping OR mark experimental + document direct access.
9	Chroma/mem0 orphans	15	L	FIX-09: Audit, delete, or archive.

Gaps DISMISSED (Corrected or False Positives)

Gap	Reason	Action
mem0 SoR wire break (was Gap #4)	False positive. .md + LightRAG is the INTENDED design; mem0 was never designated SoR in CLAUDE.md.	DO NOT FIX. Document that .md is canonical. Archive or deprecate mem0.
verify-fix-loop "unwired" (was Gap #3, downgraded to feature request)	Proveo (required gate) IS wired. verify-fix-loop is optional enhancement, not mandatory automation.	DO NOT TREAT AS BLOCKER. Adding to /task-postflight is a feature improvement, not a gap fix.

NEW Gaps Exposed by Rebuttal

New Gap A: Monitoring Blind Spots (Severity: M)

Issue: pi-orch-health script was deleted (P1.4 confirms exit 127). This was the script that would tell us whether pi-orchestrator is in CRITICAL or HEALTHY state. The last report was CRITICAL (2026-05-06).

We are now flying blind on the orchestrator's health.

Fix: Restore pi-orch-health.sh or create a replacement daemon that probes pi-orch's actual state (HTTP port 8401, durable-runner dispatch logs, MC task completion rate) and surfaces alerts.

Composite: 6/10 leverage × 8/10 severity ÷ 2 (M effort) = 24
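
A replacement health probe could start from something as small as the sketch below. It is only an illustration under stated assumptions: the function names are hypothetical, the only signal it checks is whether a TCP listener answers on the port, and the status labels are borrowed from the HEALTHY / PARTIAL / CRITICAL vocabulary used elsewhere in this audit. It does not inspect durable-runner logs or MC completion rates.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def orchestrator_state(http_up: bool, process_alive: bool) -> str:
    """Map two probe results to a coarse status label (labels are illustrative)."""
    if http_up and process_alive:
        return "HEALTHY"
    if process_alive:
        return "PARTIAL"  # process alive, control plane dead
    return "CRITICAL"
```

For the state reported in this audit (PID alive, port 8401 refusing connections), `orchestrator_state(port_is_listening("localhost", 8401), True)` would yield `PARTIAL`.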

New Gap B: Canonical Dispatch Path Undefined (Severity: H)

Issue: Two potential dispatch layers exist:

pi-orchestrator HTTP (port 8401) — dead
durable-runner bridge (port 3052) — alive, purpose unclear

No architectural document clarifies which is canonical or whether the system is designed to have both. This ambiguity blocks debugging and prevents correct fixes.

Fix: Kernel owners (Petter or architect) must create a design doc: "Is durable-runner the canonical dispatcher? Is pi-orch HTTP a legacy control plane? Should one be decommissioned?"

Composite: 8/10 leverage × 9/10 severity ÷ 4 (L effort, design-only) = 18

New Gap C: Queue Depth Monitoring Metric Stale (Severity: M)

Issue: rag-drain.prom has mtime 2026-04-23 (16d stale). The queue depth metric (454) is from that snapshot. Today, rag-drain-worker exited. We don't know if the queue is empty or 10,000 items deep.

Fix: Implement live queue depth reporting. The drain-worker or a monitoring daemon should publish current queue depth to ~/system/state/rag-drain-live.json (updated every 5min or on state change).

Composite: 5/10 leverage × 7/10 severity ÷ 2 (M effort) = 17.5
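
One way the live report could be written is sketched below. This is a minimal sketch, not the drain-worker's actual code: the `write_queue_depth` helper and the `updated_at` field are assumptions for illustration, and how the depth itself is counted (for example from the SQLite queue) is left to the caller. The only point shown is the atomic write, so a reader never sees a half-written file.

```python
import json
import os
import tempfile
import time


def write_queue_depth(path: str, queue_depth: int) -> None:
    """Atomically write the current queue depth as a small JSON report."""
    # "updated_at" is a hypothetical extra field, not part of the audit's spec.
    payload = {"queue_depth": queue_depth, "updated_at": int(time.time())}
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(payload, handle)
        os.replace(tmp_path, path)  # atomic rename on the same filesystem
    except BaseException:
        os.unlink(tmp_path)  # do not leave a stray temp file behind
        raise
```

A monitoring daemon could call `write_queue_depth(os.path.expanduser("~/system/state/rag-drain-live.json"), depth)` every 5 minutes or on state change.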

What the Auditors Got Wrong (Summary)

Petter's audit is 75% correct and extremely valuable. The following aspects were over-stated or mis-labeled:

mem0 "wire break": Not a break. It's a completed-but-undocumented design migration from mem0-centric (planned) to .md-centric (actual).
"pi-orchestrator mock mode": The label is uncertain. The real issue is HTTP port 8401 is dead. Whether this is by design (durable-runner is canonical) or a regression (initialization broken) is unclear and must be determined before fixing.
"Verifier loop unwired": Framing is misleading. The REQUIRED verifier (Proveo) IS wired. verify-fix-loop is an OPTIONAL improvement. Treating it as a blocker overstates the gap.
"4 phantom companies": Petter's count of 4 is correct: all 4 (Axiom, Datavera, Resolver, Lexicon) are absent from specialist-mapping.json. However, "phantom" overstates the problem; "unroutable" is more accurate, because the companies exist and could be accessed directly. The gap is routing documentation, not missing infrastructure.
"RAG queue: 454 items": Metric is 16d stale. True queue depth is unknown. Petter should have flagged this metric staleness separately.
"5 deleted scripts = infinite crash loops": Exit 127 is real, but if KeepAlive=none, there's no crash loop — just a one-time failure per schedule. Impact is loss of monitoring, not system strain.

Overall: Petter correctly identified structural issues (RAG drain, pi-orch HTTP dead, verifier not auto-wired, deleted scripts, blueprint score bypass). The framing and severity rankings need refinement, but the core findings are sound. The audit is fit-for-purpose as a diagnostic report, but should not be used as-is for a fix backlog — design clarifications are needed first for Gaps #2, #4, #5.

Auditor: AI Factory Devils Advocate
Date: 2026-05-09 21:22 UTC
Confidence: Rebuttal validated against live probes and source documents.

Fix Backlog

4.3 — Prioritized Fix Backlog (MC-Stub List)

AI Factory Audit — 2026-05-09 Author: Petter Graff (CodeCraft Lead Architect) Source: 4.1-petter-synthesis.md + 4.2-devils-advocate.md Status: AUDIT-LEVEL ONLY — no MCs created in live system. CEO selects from this list.

Section 1 — Prioritized MC-Stub List

Composite = Leverage (1–10) × Severity (1–10) ÷ Effort (S=1, M=2, L=4) Devils-advocate score adjustments applied. Final ordering is post-rebuttal.
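
The composite formula can be checked mechanically. The sketch below encodes the effort divisors exactly as stated above (S=1, M=2, L=4); the function name is illustrative and it does not apply any devils-advocate adjustments, so it reproduces only the raw arithmetic shown in each stub.

```python
# Effort divisors as stated in the backlog: S=1, M=2, L=4.
EFFORT_DIVISOR = {"S": 1, "M": 2, "L": 4}


def composite(leverage: int, severity: int, effort: str) -> float:
    """Return Leverage x Severity / Effort for one MC stub."""
    if not (1 <= leverage <= 10 and 1 <= severity <= 10):
        raise ValueError("leverage and severity must be in 1..10")
    return leverage * severity / EFFORT_DIVISOR[effort]


if __name__ == "__main__":
    print(composite(9, 9, "S"))  # MC-STUB-01
    print(composite(8, 9, "L"))  # MC-STUB-02
    print(composite(5, 7, "M"))  # MC-STUB-03
```

Running it against each stub's (Leverage, Severity, Effort) triple is a quick way to spot a score that no longer matches its inputs after a re-ranking.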

MC-STUB-01: Restore RAG drain-worker — fix Vaultwarden session + CF Access credentials

Subsystem: Daemon fleet / RAG ingest pipeline
Owner-company: FlowForge
Priority: H
Composite (Leverage × Severity / Effort): 81 (9 × 9 / 1)
Effort: S (≤2h)
Cost (token + CEO-action time): ~$0.20 tokens / 5 min CEO (approve billing session if needed)
Acceptance criteria (machine-checkable):
- cat /tmp/bw-session exits 0 and returns a non-empty string
- curl -s http://localhost:9621/health returns {"status":"healthy"} (LightRAG reachable)
- launchctl list | grep rag-drain-worker shows LastExitStatus = 0 within 15 min of fix
- stat ~/system/state/rag-drain.prom shows mtime within last 10 min (metric is live)
- Live queue depth is written to ~/system/state/rag-drain-live.json (new artifact — see MC-STUB-03)
Evidence path: 4.1 §3 Gap #1, 4.2 Gap #1 (CONFIRMED), P3.1 H1, P1.4 §3
Why now / Why this owner: This single credential fix unblocks 3 adapters simultaneously and drains 3,150+ queued items (live SQLite count 2026-05-09; stale prom snapshot showed 454 as of 2026-04-23). FlowForge owns daemon lifecycle and credentials management.
BlockedBy: None
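
The mtime-based criteria above (metric file updated within the last 10 min) can be expressed as a small reusable check. This is a sketch under the assumption that a missing file should count as "not fresh"; `is_fresh` is a hypothetical helper name, not an existing tool in the system.

```python
import os
import time


def is_fresh(path: str, max_age_seconds: float) -> bool:
    """Return True if path exists and was modified within max_age_seconds."""
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return False  # a missing metric file is treated as stale
    return (time.time() - mtime) <= max_age_seconds
```

For the fourth criterion this would be called as `is_fresh(os.path.expanduser("~/system/state/rag-drain.prom"), 600)`.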

MC-STUB-02: Resolve canonical dispatch path — pi-orch HTTP vs durable-runner

Subsystem: Orchestration kernel
Owner-company: CodeCraft
Priority: H
Composite (Leverage × Severity / Effort): 18 (8 × 9 / 4) — design work, L effort
Effort: L (≤2d — includes live probes + decision doc + architectural note)
Cost (token + CEO-action time): ~$1.50 tokens / 20 min CEO (one architectural decision required)
Acceptance criteria (machine-checkable):
- A file ~/system/specs/dispatch-path-canonical.md exists with mtime today
- The file explicitly states which of {pi-orch HTTP port 8401 | durable-runner port 3052} is the canonical dispatch layer
- If pi-orch HTTP is canonical: curl -s http://localhost:8401/health returns HTTP 200 after fix
- If durable-runner is canonical: grep "$(date +%Y-%m-%d)" ~/system/logs/durable-runner.log | grep -c "dispatched" returns at least 1 within 24h of fix
- The NEWEST dispatch log entry is dated on or after 2026-04-01 (proves dispatch is current)
Evidence path: 4.1 §4 (dual-process dispatch pattern), 4.2 Gap #2 (CONFIRMED BUT MISDESCRIBED), 4.2 New Gap B
Why now / Why this owner: Every other orchestration fix is blocked on knowing which process is authoritative. CodeCraft holds kernel architecture; the decision requires architectural judgment, not just ops execution.
BlockedBy: None (this IS the unblocking action for MC-STUB-04, MC-STUB-06, and MC-STUB-08)

MC-STUB-03: Implement live RAG queue depth monitoring

Subsystem: Daemon fleet / Observability
Owner-company: FlowForge
Priority: H
Composite (Leverage × Severity / Effort): 17.5 (5 × 7 / 2)
Effort: M (≤8h)
Cost (token + CEO-action time): ~$0.30 tokens / 0 min CEO (no decision needed)
Acceptance criteria (machine-checkable):
- ~/system/state/rag-drain-live.json exists and contains queue_depth key
- mtime of that file is within 5 min of any check
- launchctl list | grep rag-queue-monitor shows LastExitStatus = 0
- HiveMind receives an alert if queue_depth exceeds 100 (verify via node ~/system/agents/hivemind/hivemind.js query "rag queue" showing a row within last 1h)
Evidence path: 4.2 New Gap C — 454-item figure was a 16d-stale metric; true queue depth unknown when rag-drain-worker crashed today
Why now / Why this owner: Without live queue depth, every future RAG incident assessment will rely on stale file mtimes. FlowForge owns the monitoring daemon pattern.
BlockedBy: MC-STUB-01 (drain-worker must be restored first; queue depth metric is only meaningful when writer is live)
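
The first and last criteria (a queue_depth key is present, and an alert fires above 100) could be validated along these lines. This is a sketch only: `check_queue_report` is a hypothetical name, the report is assumed to be a flat JSON object, and delivering the alert to HiveMind is out of scope here.

```python
import json

ALERT_THRESHOLD = 100  # from the AC: alert if queue_depth exceeds 100


def check_queue_report(raw: str) -> tuple[bool, bool]:
    """Return (is_valid, should_alert) for the text of a queue-depth report."""
    try:
        report = json.loads(raw)
    except json.JSONDecodeError:
        return (False, False)
    depth = report.get("queue_depth") if isinstance(report, dict) else None
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(depth, int) or isinstance(depth, bool) or depth < 0:
        return (False, False)
    return (True, depth > ALERT_THRESHOLD)
```

With the live count cited in this audit, `check_queue_report('{"queue_depth": 3150}')` reports a valid file that should raise an alert.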

MC-STUB-04: Restore or unload 5 deleted-script daemon plists

Subsystem: Daemon fleet / Monitoring
Owner-company: FlowForge
Priority: M (pi-orch-health sub-task is H)
Composite (Leverage × Severity / Effort): 35 (5 × 7 / 1)
Effort: S (≤2h)
Cost (token + CEO-action time): ~$0.15 tokens / 0 min CEO
Acceptance criteria (machine-checkable):
- launchctl list | grep -E "pi-orch-health|cost-daily-report|daily-planning|legal-docs-azure-sync|mcp-health-check" shows ZERO entries (unloaded) OR shows LastExitStatus = 0 (restored)
- ls ~/system/daemons/pi-orch-health.sh exits 0 if restored; if unloaded, plist file is absent from ~/Library/LaunchAgents/
- Zero exit-127 entries for these 5 daemon names in launchctl list within 24h of fix
- If pi-orch-health is restored: it writes a report to ~/system/state/pi-orch-health-latest.json with mtime within last 1h
Evidence path: 4.1 §3 Gap #6, 4.2 Gap #6 (CONFIRMED), P1.4 §2, P3.1 G4/G5
Why now / Why this owner: pi-orch-health.sh was the last known diagnostic for orchestrator state; it was deleted on 2026-05-06 when the last recorded status was CRITICAL. Blind monitoring of the primary kernel is not acceptable. FlowForge owns daemon lifecycle.
BlockedBy: MC-STUB-02 (pi-orch-health.sh restoration requires knowing which health signal to probe — depends on canonical dispatch decision)

MC-STUB-05: Enforce blueprint score gate — eliminate WARN bypass and missing-MC-ID hole

Subsystem: BUILD-BLUEPRINT discipline / Mehanik gate
Owner-company: CodeCraft
Priority: M
Composite (Leverage × Severity / Effort): 30 (6 × 5 / 1)
Effort: S (≤2h)
Cost (token + CEO-action time): ~$0.10 tokens / 5 min CEO (score floor decision: 60 or 90?)
Acceptance criteria (machine-checkable):
- grep -n "WARN\|warn" ~/system/hooks/pre-dispatch-gate.sh shows no bypass path that allows WARN to proceed without explicit CEO override token
- A test run with a blueprint scoring 65 exits gate with non-zero exit code (BLOCKED)
- A run without MC-ID also exits gate with non-zero exit code (BLOCKED)
- grep "SCORE_FLOOR" ~/system/hooks/pre-dispatch-gate.sh returns a numeric value (60 or 90, per CEO decision)
Evidence path: 4.1 §3 Gap #8, 4.2 Gap #8 (CONFIRMED), P2.3 §2
Why now / Why this owner: A gate that emits warnings but allows dispatch is theater. The CEO's Mehanik enforcement ceremony is trusted — the underlying gate code must match the ceremony's intent. CodeCraft owns the gate scripting.
BlockedBy: CEO decision on score floor value (see Section 4)
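
The intended gate behaviour can be modelled in a few lines. The real gate is the shell script ~/system/hooks/pre-dispatch-gate.sh, whose contents are not shown in this audit, so the following is a Python model of the decision logic only. The function name and the specific exit codes 1 and 2 are assumptions; the acceptance criteria require only a non-zero exit.

```python
from typing import Optional


def gate_exit_code(score: int, mc_id: Optional[str], score_floor: int) -> int:
    """Return 0 to allow dispatch, non-zero to BLOCK."""
    if not mc_id:
        return 2  # missing MC-ID: blocked regardless of score
    if score < score_floor:
        return 1  # below the floor: WARN is treated as BLOCK
    return 0
```

With a floor of 90, a blueprint scoring 65 is blocked; with a floor of 60 the same blueprint passes, which is exactly the choice left to the CEO.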

MC-STUB-06: Design decision + routing update for agent fleet coverage

Subsystem: Agent fleet / Routing
Owner-company: CodeCraft (design) + Resolver (if Resolver is activated)
Priority: M
Composite (Leverage × Severity / Effort): 18 (7 × 5 / 2 = 17.5, rounded up; post-rebuttal adjusted)
Effort: M (≤8h — requires design decision first, then data entry)
Cost (token + CEO-action time): ~$0.40 tokens / 15 min CEO (routing policy decisions)
Acceptance criteria (machine-checkable):
- A file ~/system/specs/agent-routing-policy.md exists defining: which agents are routable via discover.js vs internal-only vs experimental
- node ~/system/tools/discover.js routing "validate acceptance criteria" returns a non-empty company/agent result
- node ~/system/tools/discover.js routing "distill text" returns a non-empty company/agent result
- grep -c '"company"' ~/system/agents/specialist-mapping.json is >= the previous count + however many new entries are added (verifiable by diff)
Evidence path: 4.1 §3 Gap #5, 4.2 Gap #5 (CONFIRMED BUT UNDER-SPECIFIED)
Why now / Why this owner: validator (44 skill references) and distiller (21 references) are the most-cited agents without routing entries. Silent dispatch failures are guaranteed when John tries to route tasks that map to these agents. Design decision first, then data entry.
BlockedBy: CEO decision on routing policy scope (see Section 4); MC-STUB-02 for overall dispatch health

MC-STUB-07: Register or formally archive Axiom / Datavera / Resolver / Lexicon companies

Subsystem: Agent fleet / Routing
Owner-company: CodeCraft
Priority: L
Composite (Leverage × Severity / Effort): 10 (5 × 4 / 2)
Effort: M (≤4h — inventory work products, then register or archive)
Cost (token + CEO-action time): ~$0.20 tokens / 5 min CEO
Acceptance criteria (machine-checkable):
- Each of Axiom, Datavera, Resolver, Lexicon appears EITHER in specialist-mapping.json (if active) OR has a STATUS: experimental or STATUS: archived entry in their company.json file
- node ~/system/tools/discover.js routing "axiom" returns a result or a clear "experimental — contact via direct session" message
- No company directory under ~/system/agents/personas/ has an unresolved routing status (every dir has an explicit status flag)
Evidence path: 4.1 §3 Gap #7, 4.2 Gap #7 (CONFIRMED: Axiom, Datavera, Resolver, and Lexicon are all absent from specialist-mapping.json and therefore unroutable)
Why now / Why this owner: Silent routing fallthrough is a user-experience failure. When a task arrives that maps to Resolver or Lexicon capability, John will receive no routing error — the task will silently fall to the wrong handler. Four companies is a manageable cleanup.
BlockedBy: MC-STUB-06 (routing policy decision must precede adding more entries)
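
The "every company has an explicit status" criterion reduces to a set check. The sketch below assumes the routed company names and the per-company status flags have already been loaded from specialist-mapping.json and the company.json files; the schemas of those files are not shown in this audit, so the inputs here are plain Python collections and the function name is hypothetical.

```python
def unresolved_companies(companies, routed, statuses):
    """Return companies that are neither routed nor explicitly flagged."""
    allowed = {"experimental", "archived"}
    return sorted(
        name
        for name in companies
        if name not in routed and statuses.get(name) not in allowed
    )
```

The stub is done when this returns an empty list for Axiom, Datavera, Resolver, and Lexicon.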

MC-STUB-08: Restore pi-orchestrator dispatch to operational status

Subsystem: Orchestration kernel
Owner-company: CodeCraft
Priority: H (blocked; actionable once MC-STUB-02 resolves)
Composite (Leverage × Severity / Effort): 22.5 (10 × 9 / 4) — Petter's original; blocked on design decision
Effort: L (≤2d)
Cost (token + CEO-action time): ~$2.00 tokens / 30 min CEO (architecture + approval of restored config)
Acceptance criteria (machine-checkable):
- If pi-orch HTTP is the canonical path: curl -s http://localhost:8401/health returns HTTP 200
- If durable-runner is canonical: node ~/system/tools/mc.js list --status ready --limit 1 followed by 5 min wait shows the task state has changed (dispatched or assigned) without manual John intervention
- Dispatch log file exists and has an entry with today's date: grep "$(date +%Y-%m-%d)" ~/system/logs/pi-orchestrator.log | tail -1
- No task with status "ready" sits unprocessed for more than 30 min in an idle queue (monitored via cron probe)
Evidence path: 4.1 §3 Gap #2, 4.1 §4 (dual-process dispatch pattern), 4.2 Gap #2 (CONFIRMED BUT MISDESCRIBED)
Why now / Why this owner: pi-orchestrator is the load-bearing wall of the factory. Without it dispatching automatically, John IS the factory. Closing this gap converts the system from a manual workshop to an automated pipeline. CodeCraft owns kernel architecture.
BlockedBy: MC-STUB-02 (canonical dispatch path must be defined before this can be correctly fixed)
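
The last criterion (no "ready" task idle for more than 30 min) could be probed roughly as follows. The task record shape used here (`id`, `status`, `ready_since`) is hypothetical; the actual output format of mc.js is not shown in this audit, so a real probe would need an adapter in front of this function.

```python
from datetime import datetime, timedelta, timezone


def stale_ready_tasks(tasks, now, max_idle=timedelta(minutes=30)):
    """Return ids of tasks that have been 'ready' for longer than max_idle."""
    return [
        task["id"]
        for task in tasks
        if task["status"] == "ready" and now - task["ready_since"] > max_idle
    ]
```

A cron probe would call it with `now=datetime.now(timezone.utc)` and alert whenever the returned list is non-empty.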

MC-STUB-09: Audit and archive Chroma + stale mem0 orphan collections

Subsystem: Memory plane / Cleanup
Owner-company: CodeCraft
Priority: L
Composite (Leverage × Severity / Effort): 15 (3 × 5 / 1)
Effort: S (≤2h)
Cost (token + CEO-action time): ~$0.10 tokens / 0 min CEO
Acceptance criteria (machine-checkable):
- curl -s http://localhost:8000/api/v1/collections either returns a list with a documented owner for each collection, or returns connection refused (service confirmed decommissioned)
- If Chroma is decommissioned: its entry is removed from ~/.claude/settings.json MCP server list
- curl -s http://localhost:9000/v1/memories/?user_id=john returns either 0 results or a documented "archived" state
- A ~/system/specs/memory-plane-canonical.md file exists documenting the final memory topology: .md as SoR, LightRAG as searchable index, mem0/Chroma status (deprecated/experimental)
Evidence path: 4.1 §3 Gap #9, 4.2 Gap #9 (CONFIRMED), 4.2 Gap #4 (DISMISSED — mem0 was never SoR; this cleanup is the correct response)
Why now / Why this owner: Cognitive overhead from orphaned stores creates false recovery paths during incidents. The decommission is straightforward. The documentation artifact (memory-plane-canonical.md) satisfies the dismissed Gap #4 reframing.
BlockedBy: None (can run in parallel with any Wave A task)

MC-STUB-10: Raise B2 storage cap and verify litestream replication health

Subsystem: Backup / Infra
Owner-company: FlowForge
Priority: M
Composite (Leverage × Severity / Effort): 28 (4 × 7 / 1)
Effort: S (≤2h — primarily a billing console action)
Cost (token + CEO-action time): ~$0.05 tokens / 10 min CEO (billing console access)
Acceptance criteria (machine-checkable):
- curl -s -H "Authorization: applicationKey:..." https://api.backblazeb2.com/b2api/v2/b2_get_bucket_info returns storageCapacity > current used value (cap raised)
- launchctl list | grep litestream shows LastExitStatus = 0
- A litestream replication log entry exists from the last 24h: grep "$(date +%Y-%m-%d)" ~/system/logs/litestream.log | tail -1
- Nightly snapshot script exits 0: check ~/system/state/backup-status.json shows last_success within 24h
Evidence path: 4.1 §3 Gap #10, 4.2 Gap #10 (CONFIRMED), P1.4 §3, P2.1 Edge 38
Why now / Why this owner: A capped backup bucket means data loss risk grows each day until raised. The fix is a billing action — no code required. FlowForge owns infra/backup.
BlockedBy: None; requires CEO credentials for Backblaze console

MC-STUB-11: Document .md + LightRAG as canonical memory pipeline (doc-only)

Subsystem: Memory plane / Documentation
Owner-company: Skillforge
Priority: L
Composite (Leverage × Severity / Effort): 8 (4 × 4 / 2)
Effort: M (≤4h — research + write + BookStack publish)
Cost (token + CEO-action time): ~$0.30 tokens / 5 min CEO (approve publish)
Acceptance criteria (machine-checkable):
- ~/system/specs/memory-plane-canonical.md exists (may be produced by MC-STUB-09 instead — share artifact if so)
- CLAUDE.md "auto memory" section contains phrase ".md is canonical" or equivalent explicit statement
- BookStack page exists under the Infrastructure book for "Memory Plane Architecture" — curl -s https://docs.alai.no/books/infrastructure | grep -i "memory" returns a hit
- mem0 status is documented as "sandbox/experimental" in the spec (not "active SoR")
Evidence path: 4.2 Gap #4 (DISMISSED — but reframed as doc task, not fix task); 4.2 Gap #4 recommendation: "Document .md is canonical"
Why now / Why this owner: The dismissed Gap #4 still requires a documentation response. Without an authoritative statement, the next engineer touching the system will re-investigate and potentially re-introduce mem0 wiring. Skillforge produces technical documentation.
BlockedBy: MC-STUB-09 (confirm Chroma/mem0 decommission state before documenting the final topology)

MC-STUB-12: Wire verify-fix-loop as optional /task-postflight enhancement (Wave C)

Subsystem: Verifier / QA skill
Owner-company: Proveo
Priority: L
Composite (Leverage × Severity / Effort): 16 (8 × 4 / 2) — post-rebuttal, demoted from H
Effort: M (≤8h)
Cost (token + CEO-action time): ~$0.40 tokens / 0 min CEO
Acceptance criteria (machine-checkable):
- grep -n "verify-fix-loop" ~/system/agents/skills/task-postflight/SKILL.md returns at least 1 match (Section 2b exists)
- The section has a conditional trigger: domain IN {docs, system, refactor} AND Proveo PASS
- A dry-run of /task-postflight on a docs-domain MC shows verify-fix-loop invoked (not just Proveo)
- verify-fix-loop invocation does NOT replace Proveo (both must appear in the postflight log)
Evidence path: 4.1 §3 Gap #3, 4.2 Gap #3 (DISPUTED — demoted; Proveo IS the required gate; this is an enhancement)
Why now / Why this owner: verify-fix-loop is a fully built capability sitting idle. Wiring it as a conditional enhancement (not a required gate) improves self-correction for low-risk domains. Proveo owns the verification pipeline.
BlockedBy: MC-STUB-08 (pi-orchestrator must be dispatching for auto-invocation to work reliably; in the interim, a manual invocation pattern is acceptable)

Section 2 — Sequencing Graph

Wave A — Immediate, S effort, high leverage (ship first)

These are unblocked today. Combined effort: ~6h. No CEO decisions needed to START.

MC-STUB-01 (RAG drain-worker credential fix)
    |
    +---> MC-STUB-03 (Live queue depth monitor)  [depends on 01 being live]

MC-STUB-04 (Restore 5 dead-script plists) [sub-task: pi-orch-health blocked on STUB-02]
MC-STUB-09 (Chroma/mem0 orphan audit)     [parallel, no deps]
MC-STUB-10 (B2 storage cap raise)          [parallel, no deps — billing action]

Wave A ships: 01, 03, 09, 10 (immediately); 04 partially (4 of 5 plists — pi-orch-health blocked on STUB-02).

Wave B — After Wave A + CEO decisions

These depend on an architectural decision or on Wave A completing.

MC-STUB-02 (Canonical dispatch path decision)
    |
    +---> MC-STUB-04 [remainder: pi-orch-health script restoration]
    |
    +---> MC-STUB-08 (Restore pi-orchestrator dispatch — actual kernel fix)
    |         |
    |         +---> MC-STUB-12 (wire verify-fix-loop — optional enhancement, needs dispatch working)
    |
    +---> MC-STUB-06 (Routing policy decision + specialist-mapping update)
              |
              +---> MC-STUB-07 (Register Axiom/Datavera/Resolver/Lexicon or archive them)

MC-STUB-05 (Blueprint score gate enforce)  [needs CEO score floor decision — otherwise ship at 60]

CEO decision trigger: before MC-STUB-02 can produce a useful output, the CEO must make one call (see Section 4 item #1).

Wave C — Cleanup / hygiene (non-urgent)

No urgent blockers: each item waits only on the earlier-wave stub shown beside it. Run when bandwidth allows.

MC-STUB-09 --> MC-STUB-11 (memory-plane doc — safe to write after Chroma state is known)
MC-STUB-12  [verify-fix-loop wiring — Wave C because Wave B must stabilize dispatch first]

Full DAG (text form)

[NOW]
  STUB-01 (RAG creds)  ─────────────────────> STUB-03 (queue monitor)
  STUB-04 partial (4 plists)
  STUB-09 (Chroma/mem0 audit)  ──────────────> STUB-11 (memory doc)
  STUB-10 (B2 billing)

[CEO DECISION on dispatch path]
  STUB-02 (canonical dispatch decision)
    ├──> STUB-04 remainder (pi-orch-health)
    ├──> STUB-08 (pi-orch restore)  ──────────> STUB-12 (verify-fix-loop wire)
    └──> STUB-06 (routing policy)  ──────────> STUB-07 (4 phantom companies)

[CEO DECISION on score floor]
  STUB-05 (blueprint gate enforce)
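
The DAG above can be turned into a dispatch order with a topological sort. The mapping below is transcribed from the stub-to-stub BlockedBy lines in Section 1; CEO decisions are not modelled as nodes, and the STUB-04 edge applies only to its pi-orch-health sub-task. The function name is illustrative.

```python
from graphlib import TopologicalSorter

# stub -> set of stubs it is blocked by (stub-to-stub edges only)
BLOCKED_BY = {
    "STUB-01": set(),
    "STUB-02": set(),
    "STUB-03": {"STUB-01"},
    "STUB-04": {"STUB-02"},  # pi-orch-health sub-task only
    "STUB-05": set(),
    "STUB-06": {"STUB-02"},
    "STUB-07": {"STUB-06"},
    "STUB-08": {"STUB-02"},
    "STUB-09": set(),
    "STUB-10": set(),
    "STUB-11": {"STUB-09"},
    "STUB-12": {"STUB-08"},
}


def dispatch_order(blocked_by):
    """Return one valid execution order; raises CycleError on a cycle."""
    return list(TopologicalSorter(blocked_by).static_order())
```

Any order it returns places every stub after the stubs that block it, which makes it a cheap consistency check whenever a BlockedBy line is edited.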

Section 3 — Out of Backlog (and Why)

DISMISSED gaps — not a fix

mem0 SoR wire break (original Gap #4): Not a break. .md + LightRAG is the actual working design — Claude Code writes .md natively; lightrag-auto-ingest.sh routes .md writes to LightRAG. mem0 was a prototype that was never wired into the active pipeline. CLAUDE.md has zero mention of mem0 as SoR. The correct response is NOT to wire mem0 back — it is to document the actual design (see MC-STUB-11, a documentation-only stub).

verify-fix-loop "unwired" structural gap (original Gap #3): Framing was misleading. CLAUDE.md Hard Constraint #4 requires Proveo verification — and Proveo IS wired and called by /task-postflight. verify-fix-loop is an optional enhancement for docs/system/refactor domains, not the required gate. Adding it is a feature improvement (see MC-STUB-12, demoted to Wave C), not a structural fix.

DEMOTED gaps — lighter scope than original claim

4 phantom companies (original Gap #7 — scope confirmed at 4, not demoted): All 4 companies (Axiom, Datavera, Resolver, Lexicon) are absent from specialist-mapping.json. None are phantom in the sense of missing directories — all have full persona directories — but none are routable via the normal John → discover.js flow. The fix is: inventory work products, then register OR mark as experimental. Addressed in MC-STUB-07 at L priority (documentation + optional routing).

Verifier loop (original Gap #3 — demoted from H to L): Retained as MC-STUB-12 but explicitly classified Wave C, marked as optional enhancement not structural fix. Proveo is the real gate and it is working.

Section 4 — CEO Decision Items

These are blocking decisions that no engineer can make unilaterally. They gate specific MCs.

Decision 1 (CRITICAL — gates MC-STUB-02, 04, 08): Canonical dispatch path

The question: Is durable-runner (port 3052, 20d uptime, stable) the canonical dispatch layer — with pi-orchestrator HTTP (port 8401, dead) being an old control plane that can be decommissioned? OR is pi-orchestrator HTTP supposed to be online, and its deadness is a regression that must be fixed?

Why only CEO can decide: This is an architectural fork. If durable-runner is canonical, FIX is: document it, verify it's processing tasks, and decommission the old HTTP endpoint. If pi-orch HTTP is canonical, FIX is: diagnose startup gating (likely an initialization hang on Ollama or a flag file), restore it, and ensure durable-runner is correctly subordinate.

Options:

A. durable-runner is canonical dispatcher. pi-orch HTTP is legacy. Document this, decommission port 8401.
B. pi-orch HTTP is canonical. Diagnose and restore it. durable-runner is subordinate.
C. Both should be operational. Hybrid model (requires Petter to specify the interaction model).

Decision 2 (M — gates MC-STUB-05): Blueprint score gate floor

The question: What is the enforced minimum score for dispatching a task through Mehanik gate?

Context: Observed practice allows dispatch at score 65 (WARN range). Original spec says 90 is the floor. The gate code currently treats WARN as pass-through. The correct floor must be chosen and hardcoded.

Options:

A. Lower floor to 60 — match observed practice; WARN is acceptable.
B. Floor stays at 90 — WARN becomes BLOCK; blueprints must be updated to score higher.
C. Introduce tiered floors: 60 for L tasks, 75 for M, 90 for H+.
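
Option C can be illustrated with a small lookup. This is a sketch of the tiered policy only: the names are hypothetical, "H+" is read as "H and anything above it", and unknown priorities fall back to the strictest floor, which is an assumption rather than something the options specify.

```python
# Floors from Option C: 60 for L tasks, 75 for M, 90 for H and above.
TIERED_FLOORS = {"L": 60, "M": 75, "H": 90}
STRICTEST_FLOOR = 90


def floor_for(priority: str) -> int:
    """Return the minimum blueprint score for a task priority."""
    return TIERED_FLOORS.get(priority, STRICTEST_FLOOR)


def passes_gate(score: int, priority: str) -> bool:
    """Return True if the score meets the floor for this priority."""
    return score >= floor_for(priority)
```

Under this policy the observed score of 65 would pass for an L task but be blocked for M and H tasks.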

Decision 3 (M — gates MC-STUB-06, 07): Specialist-mapping.json scope policy

The question: Should specialist-mapping.json be comprehensive (cover all 66 agents, all 12 companies) — or curated (cover only primary dispatch paths, leaving internal/helper agents out)?

Why it matters: validator and distiller have 44 and 21 skill references respectively, but may be internal-only agents (called from other agents, not from John). If they're internal-only, they must NOT be in the routing table — they should be in the agent definition files only. If they ARE routable by John, they must be added.

Options:

A. Curated: only John-dispatchable agents enter the routing table. Internal agents documented separately.
B. Comprehensive: all agents mapped; entry type field distinguishes dispatch-routable from internal.

Decision 4 (L — informs MC-STUB-09, 11): mem0 future role

The question: What is mem0's long-term status?

Context: 865 stale facts in mem0_john. Zero active writers. .md + LightRAG is the working pipeline. mem0 server is running and consuming resources.

Options:

A. Deprecate: stop mem0 server; archive its Qdrant vectors; remove from settings.json.
B. Keep as parallel experimental sandbox: document it as optional enrichment layer, not canonical.
C. Promote: wire a PostToolUse hook that writes every .md memory update to mem0 simultaneously (highest effort, not recommended).

Petter's recommendation: Option A (deprecate). The .md pipeline is working. mem0 is cognitive overhead with no active consumer.

Report produced by Petter Graff — CodeCraft Lead Architect Source: 4.1-petter-synthesis.md, 4.2-devils-advocate.md Audit date: 2026-05-09 MC stubs: 12 total. CEO selects 1-3 per session from top of each wave.

Validation Reports

5.1 — Proveo Validation Report

AI Factory Audit — Plan Task 5.1 Validator: Angie Jones (Proveo) Date: 2026-05-09 Audit deliverables reviewed: p1/{1.1,1.2,1.3,1.4}, p2/{2.1,2.2,2.3}, p3/3.1-health-matrix.md, p4/{4.1,4.2,4.3}

Section 1 — Probe Re-Run (10% sample of 17 health-matrix rows)

Five probes selected to cover memory (A1), dispatch (C1), RAG (H1), daemon (D1 verifier), and HiveDB (A3).

Probe 1 — mem0 health endpoint (maps to P3.1 row A1)

Original claim (P3.1 A1): mem0 PARTIAL — write acknowledged, semantic search returns count:1 but results:[] for new user_id audit-test.

Fresh probe:

curl -s http://localhost:9000/health

Output:

{"status": "healthy", "backend": "qdrant", "llm": "qwen3:8b-q8_0@ollama",
 "embedder": "bge-m3@ollama",
 "collections": ["mem0migrations","sessions","hivemind","mem0_john","knowledge"],
 "mem0_collection": "mem0_john"}

Verdict: REPRODUCED

mem0 health endpoint returns status: healthy as stated. Qdrant backend and collections list match the P3.1 evidence. The health plane is intact. The partial-retrieval issue noted in P3.1 (write acknowledged, empty results for new user_id) is consistent with the collections list: the audit-test user has no named collection in the list above, which supports (but does not prove) P3.1's hypothesis about namespace creation lag.

Probe 2 — HiveDB intel count (maps to P3.1 row A3)

Original claim (P3.1 A3): sqlite3 ~/system/databases/hivemind.db "SELECT COUNT(*) FROM intel;" → 17560, latest entries dated 2026-05-09.

Fresh probe:

sqlite3 ~/system/agents/hivemind/hivemind.db "SELECT COUNT(*) FROM intel;"

Output: 17569

Verdict: REPRODUCED (with expected drift)

Count at probe time is 17,569 — 9 rows above the 17,560 from P3.1. This is a live write-active store; 9 new intel rows in the intervening period is consistent with normal HiveMind alert traffic. P3.1's claim that the store is live and functional is confirmed. The P3.1 "Surprises" note (HiveDB read API exists — P1 claim of "no read API" is wrong) stands confirmed.

Probe 3 — pi-orchestrator PID 75750 alive (maps to P3.1 row C1)

Original claim (P3.1 C1): PID 75750 running since Fri 12pm; curl http://localhost:8401/health → CONNECTION REFUSED.

Fresh probe:

ps aux | grep pi-orchestrator | grep -v grep

Output:

makinja  75750  0.0  0.1 436177552  61728  ??  S  fre.12p.m.  0:22.29
  /opt/homebrew/bin/node /Users/makinja/system/kernel/pi-orchestrator.js start

Verdict: REPRODUCED

PID 75750 is identical — same process, same start time (Friday 12pm), same command. The process has not been restarted, crashed, or replaced since P3.1 was written. This confirms the pi-orchestrator is running but its internal HTTP listener never came up. P3.1's "PARTIAL" verdict is correct: process alive, control plane dead.

Additional validation: confirmed no port 8401 listener and no verify-fix-loop invocation in kernel or hooks (zero grep hits in ~/system/kernel/pi-orchestrator.js and ~/system/hooks/).

Probe 4 — RAG queue depth (maps to P3.1 row H1)

Original claim (P3.1 H1): cat ~/system/state/rag-drain.prom → total 454 (bookstack:442, evidence:2, mc-outcomes:9, specs:1). File mtime 2026-04-23 17:59 (16 days stale). rag-drain-worker crashed today (exit 256, HiveMind alert #64900).

Fresh probe:

cat ~/system/state/rag-drain.prom
stat -f "%Sm %N" ~/system/state/rag-drain.prom

Output:

alai_ingest_queue_depth{source="bookstack"} 442
alai_ingest_queue_depth{source="evidence"} 2
alai_ingest_queue_depth{source="mc-outcomes"} 9
alai_ingest_queue_depth{source="specs"} 1
alai_ingest_queue_depth_total 454

mtime: Apr 23 17:59:36 2026

Verdict: REPRODUCED

Queue values are byte-for-byte identical (bookstack:442, evidence:2, mc-outcomes:9, specs:1, total:454). File mtime is unchanged at 2026-04-23 17:59:36 — no write has occurred since P3.1 was produced. This confirms the drain-worker remains down and the metric is still frozen. The rag-drain-worker is not recovering on its own. P3.1's "PARTIAL" classification and the 16-days-stale caveat are both accurate.

Note on P1 discrepancy: P3.1 states "P1 claim of 946 appears to be an older snapshot." This is confirmed — 946 does not appear in the current prom file at any level. P1 used a superseded snapshot.
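
The per-source figures can be cross-checked against the reported total by parsing the prom output shown above. The sketch below embeds that output verbatim; the parser and its names are illustrative and handle only the two line shapes present in this file, not the full Prometheus exposition format.

```python
import re

PROM_TEXT = """\
alai_ingest_queue_depth{source="bookstack"} 442
alai_ingest_queue_depth{source="evidence"} 2
alai_ingest_queue_depth{source="mc-outcomes"} 9
alai_ingest_queue_depth{source="specs"} 1
alai_ingest_queue_depth_total 454
"""

SOURCE_LINE = re.compile(r'^alai_ingest_queue_depth\{source="([^"]+)"\}\s+(\d+)$')
TOTAL_LINE = re.compile(r"^alai_ingest_queue_depth_total\s+(\d+)$")


def parse_queue_depth(text):
    """Return (per_source_dict, total_or_None) from the prom text."""
    per_source, total = {}, None
    for line in text.splitlines():
        line = line.strip()
        source_match = SOURCE_LINE.match(line)
        if source_match:
            per_source[source_match.group(1)] = int(source_match.group(2))
            continue
        total_match = TOTAL_LINE.match(line)
        if total_match:
            total = int(total_match.group(1))
    return per_source, total
```

For the snapshot above, the per-source values sum to the reported total of 454, so the file is internally consistent even though it is stale.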

Probe 5 — verify-fix-loop auto-invocation (maps to P3.1 row D1)

Original claim (P3.1 D1): Skill exists at ~/.claude/skills/verify-fix-loop/SKILL.md. Manual-trigger only. No daemon or hook auto-invokes it. P2 verdict "ABSENT" partially wrong — capability exists but auto-invocation is absent.

Fresh probe:

grep -rn "verify-fix-loop" ~/.claude/skills/task-postflight/
grep -rn "verify.fix.loop" ~/system/kernel/pi-orchestrator.js
grep -rn "verify.fix.loop" ~/system/hooks/

Output: All three commands return no output (zero matches).

Confirmed skill exists at ~/.claude/skills/verify-fix-loop/SKILL.md (direct ls confirmed). No reference to verify-fix-loop in task-postflight SKILL.md, pi-orchestrator kernel, or hooks directory.

Verdict: REPRODUCED

P3.1's nuanced verdict is correct: the skill exists and is indexed, but no automated trigger references it. task-postflight does not call it. The pi-orchestrator kernel (.js, not the .bak) has zero references. The hooks directory has zero references. P2's "ABSENT" framing was imprecise — P3.1's correction ("skill exists as MANUAL-trigger, not auto-invoked") is the accurate characterization.
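
A zero-match grep is only meaningful if the searched paths actually exist; otherwise "no output" can hide a missing directory. The sketch below is one way to make that explicit. The helper name ref_count is hypothetical, and the pattern and paths in the usage comment are the ones from the probe above.

```shell
#!/usr/bin/env bash
# Print the number of lines matching PATTERN (extended regex) under the given
# paths. Returns 2 if any path is missing, so "zero hits" cannot be confused
# with "nothing was searched".
ref_count() {
  local pattern=$1 path
  shift
  for path in "$@"; do
    [ -e "$path" ] || { echo "no such path: $path" >&2; return 2; }
  done
  # grep exits 1 on zero matches; "|| true" keeps that from aborting the count.
  { grep -rE -- "$pattern" "$@" || true; } | wc -l | tr -d ' '
}

# Example:
#   ref_count 'verify.fix.loop' ~/system/kernel/pi-orchestrator.js ~/system/hooks/
```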

Section 1 Summary

Probe	P3.1 Claim	This Probe	Verdict
mem0 health	PARTIAL — healthy endpoint, retrieval gap for new users	Confirmed healthy, collection list consistent with partial behavior	REPRODUCED
HiveDB count	WORKS — 17,560, live writes today	17,569 (+9 rows — normal drift)	REPRODUCED
pi-orch PID 75750	PARTIAL — process alive, HTTP port 8401 dead	Same PID, same uptime, still no port 8401 listener	REPRODUCED
RAG queue depth	PARTIAL — 454 frozen, 16d stale, drain-worker down	Identical values, identical mtime, no recovery	REPRODUCED
verify-fix-loop	PARTIAL — skill exists, zero auto-invocation wiring	Zero hits in task-postflight, kernel, hooks	REPRODUCED

All 5 probes: REPRODUCED. No contradictions to P3.1 found.

Section 2 — MC Stub AC Quality Check (all 12 stubs from 4.3)

Criteria applied to each stub:

AC checklist exists (binary)
Each AC is machine-checkable (not vague)
Effort estimate reasonable
Owner-company makes sense

MC-STUB-01: Restore RAG drain-worker — PASS

AC checklist: YES (5 ACs) Machine-checkable: All 5 are concrete commands with observable exit codes or file stats.

cat /tmp/bw-session exits 0 — checkable
curl -s http://localhost:9621/health returns {"status":"healthy"} — checkable
launchctl list | grep rag-drain-worker LastExitStatus = 0 — checkable
stat ~/system/state/rag-drain.prom mtime within 10 min — checkable
Live queue depth written to new artifact — checkable (file-exists + key-present)

One minor note: the 5th AC references "MC-STUB-03 new artifact" (rag-drain-live.json). This creates a dependency coupling between two stubs' ACs. If MC-STUB-03 is not executed, AC#5 cannot be verified. This is documented in the sequencing graph, but the AC should note the dependency explicitly. Keeping as PASS but noting this coupling.

Effort S (≤2h): Reasonable for a credential session fix + daemon restart. Owner FlowForge: Correct — daemon lifecycle + credential management.
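
The "mtime within 10 min" style of AC can be checked without parsing stat output, whose flags differ between macOS and Linux. A minimal sketch, assuming find with -mmin is available (it is in both GNU and BSD find); the helper name is_fresh is hypothetical.

```shell
#!/usr/bin/env bash
# is_fresh FILE MAX_MIN
# Exit 0: FILE was modified less than MAX_MIN minutes ago.
# Exit 1: FILE exists but is older than that.
# Exit 2: FILE does not exist.
is_fresh() {
  local file=$1 max_min=$2
  [ -e "$file" ] || return 2
  [ -n "$(find "$file" -maxdepth 0 -mmin "-$max_min" -print 2>/dev/null)" ]
}

# Example (path from AC#4 of this stub):
#   is_fresh ~/system/state/rag-drain.prom 10
```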

MC-STUB-02: Resolve canonical dispatch path — PASS

AC checklist: YES (4 ACs with conditional branches) Machine-checkable: The branching structure ("IF pi-orch is canonical: curl 200 / IF durable-runner is canonical: grep dispatch log") is valid. Both branches are machine-checkable. The fourth AC ("no dispatch logs older than 2026-04-01 are the NEWEST entry") is checkable via tail -1 on the log file.

Effort L (≤2d): Reasonable — architectural decision + documentation + live probes. This is design work, not a one-line fix. Owner CodeCraft: Correct — kernel architecture is CodeCraft's domain.

MC-STUB-03: Live RAG queue depth monitoring — PASS

AC checklist: YES (4 ACs) Machine-checkable:

rag-drain-live.json exists with queue_depth key — checkable
mtime within 5 min — checkable
launchctl list | grep rag-queue-monitor LastExitStatus = 0 — checkable
HiveMind query returns row within last 1h — checkable

Effort M (≤8h): Reasonable for a new monitoring daemon. Owner FlowForge: Correct. BlockedBy MC-STUB-01 is accurate and documented.

MC-STUB-04: Restore or unload 5 deleted-script plists — WEAK

AC checklist: YES (4 ACs) Machine-checkable: The OR-condition in AC#1 (launchctl list shows ZERO entries OR LastExitStatus=0) is structurally ambiguous for a verifier. The check passes in both the "unloaded" and "restored" outcomes, so a verifier cannot tell which branch was executed, and therefore cannot distinguish a complete success (restored + healthy) from a partial success (unloaded but not restored). This requires a separate assertion per plist that declares intent.

AC#3 ("Zero exit-127 entries within 24h") uses a 24h observation window — this is time-bound and cannot be machine-checked at point-in-time without log inspection. Recommend: check last 5 launchctl exit codes for each daemon name, not a 24h window.

Effort S (≤2h): Reasonable for an unload/restore task. Owner FlowForge: Correct. Specific fix needed: Split "unloaded" vs "restored" into separate ACs per plist.
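
One way to express the per-plist intent recommended above is to pass the declared intent alongside the label and evaluate it against a captured launchctl listing. This is a sketch only: the function name check_plist and the intent keywords are hypothetical, and it assumes the three-column `launchctl list` layout (PID, last exit status, label).

```shell
#!/usr/bin/env bash
# check_plist INTENT LABEL LISTING_FILE
# INTENT "restored": LABEL must be loaded with last exit status 0.
# INTENT "unloaded": LABEL must be absent from the listing.
# LISTING_FILE holds the output of `launchctl list`.
check_plist() {
  local intent=$1 label=$2 listing=$3 status
  status=$(awk -v l="$label" '$3 == l { print $2; exit }' "$listing")
  case $intent in
    restored) [ "$status" = "0" ] ;;
    unloaded) [ -z "$status" ] ;;
    *) echo "unknown intent: $intent" >&2; return 2 ;;
  esac
}

# Example:
#   launchctl list > /tmp/launchctl.txt
#   check_plist restored com.alai.pi-orch-health /tmp/launchctl.txt
```

Because the intent is an explicit argument, a plist that was merely unloaded fails a "restored" assertion instead of passing through the OR-branch.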

MC-STUB-05: Enforce blueprint score gate — PASS

AC checklist: YES (4 ACs) Machine-checkable:

grep -n "WARN\|warn" no bypass path — checkable
Test run with score 65 exits non-zero — checkable (behavioral test)
Test run without MC-ID exits non-zero — checkable
grep "SCORE_FLOOR" returns numeric value — checkable

The behavioral test ACs (#2 and #3) require a test harness that can invoke the gate with a mock blueprint. This is more complex than a read-only probe but is legitimately machine-checkable via a scripted invocation. Acceptable.

Effort S (≤2h): Reasonable for a shell script edit + test run. Owner CodeCraft: Correct for gate scripting.

MC-STUB-06: Agent fleet routing update — WEAK

AC checklist: YES (4 ACs) Machine-checkable concern: The two discover.js routing ACs (node ~/system/tools/discover.js routing "validate acceptance criteria" and node ~/system/tools/discover.js routing "distill text") test routing of the phrases "validate" and "distill", but the stub is about adding validator and distiller agents. If discover.js uses keyword matching, these query phrases may not match the new agent names. A query returning a "non-empty result" could be satisfied by a different agent (e.g., Proveo for "validate"), making the AC a false PASS. The AC should check that the returned company/agent specifically includes the newly added entry.

AC#4 (grep -c '"company"' specialist-mapping.json >= previous count + new entries): requires knowing the pre-fix count to evaluate post-fix. This is process-dependent and not self-contained.

Effort M (≤8h): Reasonable — design decision + JSON data entry. Owner CodeCraft + Resolver: Correct.
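
Both weaknesses can be addressed with small self-contained checks. The sketch below assumes the discover.js routing output names the selected agent as plain text (its actual format is not shown in the audit) and that the baseline is stored in a file chosen by the operator; all three helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Succeeds only if the routing output on stdin names the expected agent.
routes_to() { grep -qF -- "$1"; }

# Number of lines containing a "company" key; prints 0 when there are none.
company_count() { grep -c '"company"' "$1" || [ $? -eq 1 ]; }

# record_baseline MAPPING BASELINE_FILE   (run once, before the fix)
record_baseline() { company_count "$1" > "$2"; }

# assert_grew_by MAPPING BASELINE_FILE N  (run after the fix)
assert_grew_by() {
  local now before
  now=$(company_count "$1")
  before=$(cat "$2")
  [ "$now" -ge $((before + $3)) ]
}

# Example:
#   node ~/system/tools/discover.js routing "validate acceptance criteria" | routes_to validator
```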

MC-STUB-07: Register or archive Axiom/Datavera/Resolver — PASS

AC checklist: YES (3 ACs) Machine-checkable:

Each of the three appears in specialist-mapping.json OR has STATUS field in company.json — checkable
discover.js routing "axiom" returns result or explicit message — checkable
No persona directory has unresolved routing status — checkable via scan

Effort M (≤4h): Reasonable for 3-company inventory + status update. Owner CodeCraft: Correct.

MC-STUB-08: Restore pi-orchestrator dispatch — WEAK

AC checklist: YES (4 ACs with conditional branches) Machine-checkable concern: AC#2 (durable-runner branch) states "node ~/system/tools/mc.js list --status ready --limit 1 followed by 5 min wait shows the task state has changed." This is a time-dependent behavioral assertion — a verifier cannot execute a 5-minute wait within a standard probe run. More critically: the state change depends on there being a ready task AND the dispatcher picking it up, which may not be true in a low-traffic environment. This AC can produce false FAILs in idle periods.

AC#4 ("no task with status 'ready' sits unprocessed for more than 30 min in an idle queue — monitored via cron probe") is not a point-in-time checkable assertion. "Monitored via cron probe" means the AC requires an ongoing monitoring setup, not a single verification pass.

Effort L (≤2d): Reasonable — kernel-level architectural work. Owner CodeCraft: Correct. BlockedBy MC-STUB-02: Documented and accurate.
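
A point-in-time replacement for the 5-minute wait could read the date of the newest entry in a dispatch log. This is a sketch under two assumptions: log lines start with a bracketed ISO-8601 timestamp, as in the pi-orchestrator daemon log, and the log inspected records dispatches only. The daemon's idle-cycle log would pass this check every day without any dispatch having occurred. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Print the YYYY-MM-DD date of the last line of a log whose lines start with
# a bracketed ISO-8601 timestamp, e.g. "[2026-05-09T19:31:19.216Z] ...".
newest_entry_date() {
  tail -n 1 "$1" | sed -n 's/^\[\([0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}\)T.*/\1/p'
}

# dispatched_on LOG [DATE]
# Succeeds only if the newest entry carries DATE (defaults to today in UTC).
dispatched_on() {
  local log=$1 want=${2:-$(date -u +%Y-%m-%d)}
  [ "$(newest_entry_date "$log")" = "$want" ]
}
```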

MC-STUB-09: Audit and archive Chroma + stale mem0 — PASS

AC checklist: YES (4 ACs) Machine-checkable:

curl localhost:8000/api/v1/collections returns documented list OR connection refused — checkable
If decommissioned: entry removed from settings.json — checkable
curl localhost:9000/v1/memories/?user_id=john — checkable
memory-plane-canonical.md exists — checkable

Effort S (≤2h): Reasonable — mostly audit + file/config edit. Owner CodeCraft: Acceptable. Could also be FlowForge (infra cleanup), but CodeCraft is defensible given the architectural documentation artifact.

MC-STUB-10: Raise B2 storage cap + litestream health — WEAK

AC checklist: YES (4 ACs) Machine-checkable concern: AC#1 uses curl -s -H "Authorization: applicationKey:..." https://api.backblazeb2.com/b2api/v2/b2_get_bucket_info. The authorization string is a placeholder — a verifier running this command verbatim will get a 401. The AC must reference the credential lookup method (e.g., bw get item "backblaze-b2-key" --session $(cat /tmp/bw-session)) rather than a literal placeholder. This is an evidence-fabrication risk: a lazy verifier could claim PASS without actually having the credentials.

AC#3 (grep "$(date +%Y-%m-%d)" ~/system/logs/litestream.log | tail -1): requires the litestream log file to exist and be written today. If the log path differs from what's specified, this is a silent FAIL. The AC should include a fallback check for log file existence first.

Effort S (≤2h): Reasonable — billing console action + log verification. Owner FlowForge: Correct.
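
The recommended existence pre-check can be folded into the log assertion so that a missing file is reported differently from a log with no entry for the day. A minimal sketch; the function name check_log_for_date and the exit-code convention are assumptions, not part of the stub.

```shell
#!/usr/bin/env bash
# check_log_for_date LOG DATE
# Exit 0: LOG exists and has at least one line containing DATE.
# Exit 1: LOG exists but has no line containing DATE.
# Exit 2: LOG is missing or unreadable (reported, not a silent FAIL).
check_log_for_date() {
  local log=$1 day=$2
  if [ ! -r "$log" ]; then
    echo "log not found or unreadable: $log" >&2
    return 2
  fi
  grep -qF -- "$day" "$log"
}

# Example (path from AC#3 of this stub):
#   check_log_for_date ~/system/logs/litestream.log "$(date +%Y-%m-%d)"
```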

MC-STUB-11: Document memory pipeline (doc-only) — PASS

AC checklist: YES (4 ACs) Machine-checkable:

memory-plane-canonical.md exists — checkable
CLAUDE.md contains specific phrase — checkable via grep
BookStack page exists — checkable via curl
mem0 status documented as "sandbox/experimental" — checkable via grep in spec

Effort M (≤4h): Reasonable for a doc task. Owner Skillforge: Correct. BlockedBy MC-STUB-09: Documented and logical.

MC-STUB-12: Wire verify-fix-loop (Wave C enhancement) — WEAK

AC checklist: YES (4 ACs) Machine-checkable concern: AC#3 states "A dry-run of /task-postflight on a docs-domain MC shows verify-fix-loop invoked (not just Proveo)." This requires: (a) a real MC in docs domain to exist, (b) /task-postflight to be invokable in dry-run mode. The stub does not specify whether task-postflight has a --dry-run flag or how to interpret its output to confirm verify-fix-loop was called vs not called. Without a defined output artifact or log to inspect, this AC is not fully machine-checkable.

AC#4 ("verify-fix-loop invocation does NOT replace Proveo — both must appear in the postflight log") is checkable IF the log artifact is defined. Currently "postflight log" is unspecified in the AC — what file path, what format?

Effort M (≤8h): Reasonable. Owner Proveo: Correct — this is Proveo's enhancement of the verification pipeline. BlockedBy MC-STUB-08: Documented. Logical since auto-invocation requires dispatch to work.

Section 2 Summary

Stub	Score	Key Reason
MC-STUB-01	PASS	All 5 ACs concrete and checkable; minor cross-stub dependency coupling noted
MC-STUB-02	PASS	Conditional branch structure is valid; both branches machine-checkable
MC-STUB-03	PASS	All 4 ACs concrete; mtime + launchctl + HiveMind query all verifiable
MC-STUB-04	WEAK	OR-condition in AC#1 prevents distinguishing unload from restore; 24h window not point-checkable
MC-STUB-05	PASS	Behavioral test ACs are valid given scripted invocation harness
MC-STUB-06	WEAK	discover.js routing query may return false PASS from a different agent; count diff AC not self-contained
MC-STUB-07	PASS	All 3 ACs are direct file/command checks
MC-STUB-08	WEAK	5-min wait AC and 30-min cron-monitoring AC not point-in-time checkable
MC-STUB-09	PASS	All 4 ACs concrete; connection-refused is an explicit acceptable output
MC-STUB-10	WEAK	Authorization placeholder in AC#1 is evidence-fabrication risk; log path not verified to exist
MC-STUB-11	PASS	All 4 ACs are grep/curl/file-exist checks
MC-STUB-12	WEAK	dry-run invocation mechanism undefined; "postflight log" file path unspecified

PASS: 7 stubs | WEAK: 5 stubs | FAIL: 0 stubs

5 WEAK stubs require AC refinement before dispatch. None are structurally broken — all have correct intent, fixable in ≤30 min each.

Section 3 — Cross-Report Consistency

Finding 3.1: P4.1 mem0 vector count conflicts with P3.1 detail

P4.1 Section 2 (Delta Table, Memory plane row): States "mem0 API has 0 active writers, 865 stale facts."
P4.1 Section 4 (Architectural Conclusions): States "mem0/Qdrant (93K+ vectors, zero active writers)."

These two numbers — 865 facts and 93K+ vectors — are not reconciled within P4.1. 865 is the mem0 fact count (application-layer). 93K+ would be the raw Qdrant vector count across all collections (embedding-layer, where each fact generates multiple vectors). P4.1 uses both without clarifying this distinction, creating an apparent contradiction. P3.1 does not cite either figure directly. The delta table figure (865) is more precise and correct as stated; the architectural narrative (93K+) needs a qualifier ("93K+ raw Qdrant embeddings across all collections, including non-mem0 collections such as HiveMind and knowledge").

Severity: LOW — confusing but not misleading about the fix needed.

Finding 3.2: P4.3 references a DISMISSED gap (Gap #3 = verifier loop) via MC-STUB-12

P4.2 Gap #3 verdict: "DISPUTED — demoted." P4.2 concludes the gap framing was misleading and recommends relegating to Wave C enhancement.
P4.3 Section 3 (Out of Backlog): Correctly identifies Gap #3 as DEMOTED (not dismissed). MC-STUB-12 is retained in the backlog as a Wave C item with L priority.

This is NOT a contradiction — it is correctly handled. P4.3's "Out of Backlog" section explicitly distinguishes DISMISSED (Gap #4 mem0 SoR) from DEMOTED (Gap #3 verifier loop). The sequencing graph correctly places MC-STUB-12 in Wave C. Consistent.

Finding 3.3: P4.3 MC-STUB-04 claims pi-orch-health plist references `pi-orch-health.sh` — P3.1 G1 says daemon state is "not running"

P3.1 G1: launchctl print gui/501/com.alai.pi-orch-health → state: not running. Last health report Verdict: CRITICAL (2026-05-06). Scheduled health monitor failing.
P4.3 MC-STUB-04: "pi-orch-health.sh was deleted on 2026-05-06 when the last recorded status was CRITICAL."

These are consistent — daemon not running because script was deleted (exit 127 pattern from P1.4). No conflict.

Finding 3.4: P2.1 connectivity diagram "Dead Edge 1" vs P3.1 C1/C2 — minor framing gap

P2.1 (per P4.2 citation): labels the pi-orchestrator → agent dispatch path as "Dead Edge 1" and characterizes pi-orch as "MOCK MODE."
P3.1 C2: Explicitly finds NO mock config reference in the kernel (grep "mock" → zero matches). Config shows offlineMode: false, enabled: true.
P4.2 rebuttal: Confirms P3.1 is correct — "MOCK MODE" framing is inaccurate; the real issue is HTTP port 8401 startup gating.

Status: P2.1 uses "MOCK MODE" language that P3.1 and P4.2 both correct. P4.1 repeats "mock/broken mod" in the executive summary. P4.3 avoids this language entirely (describes the gap as "HTTP port dead" and "no dispatch logs post-March"). The P4.1 executive summary should be updated to drop "mock mode" — it is an inaccurate framing that has been rebutted by P3.1 probe evidence.

Severity: LOW-MEDIUM — the corrected framing matters for how the CEO frames the fix. "Mock mode" implies intentional test configuration; "HTTP startup gating failure" implies a recoverable initialization bug.

Finding 3.5: P4.1 Gap #5 composite score vs P4.3 MC-STUB-06 composite score — mismatch

P4.1 Gap #5 (Agent routing table incomplete): Composite = 28 (7 × 8 / 2).
P4.3 MC-STUB-06 (Design decision + routing update): Composite = 18 (7 × 5 / 2), "post-rebuttal adjusted."

The severity was reduced from 8 to 5 after the devil's advocate review (7 × 5 / 2 = 17.5, reported as 18). P4.3 explicitly notes "post-rebuttal adjusted." This is correct: the rebuttal demoted this gap when it found that validator/distiller may be internal-only agents. The composite score difference is intentional and documented, not an error.

Status: Consistent — change is intentional and documented.

Finding 3.6: P4.1 Gap #7 cites "4 phantom companies" — P4.2 + P4.3 correct to 3

P4.1 Gap #7: "4 companies (Axiom, Datavera, Resolver, Lexicon) have full persona dirs... but zero entries in specialist-mapping.json."
P4.2 Gap #7 rebuttal: Confirmed Lexicon IS in specialist-mapping.json. Only 3 companies are unroutable.
P4.3 MC-STUB-07: Scope correctly adjusted to "Axiom, Datavera, Resolver" (3 companies).

The correction flows correctly through the document chain. P4.1 contains the uncorrected claim (4 companies); P4.2 rebuttal catches it; P4.3 backlog uses the corrected count. This is the intended flow. However, P4.1 should carry a note that its Gap #7 count was revised to 3 by P4.2. As-is, a reader of P4.1 alone gets the wrong number.

Severity: LOW — the correction exists in P4.2 and P4.3; only readers of P4.1 in isolation are misled.

Section 3 Summary

Finding	Reports Affected	Severity	Status
3.1 — mem0 865 facts vs 93K+ vectors unclarified	P4.1 internal	LOW	Minor annotation needed in P4.1 architectural section
3.2 — Dismissed vs Demoted gap classification	P4.2 → P4.3	NONE	Correctly handled
3.3 — pi-orch-health plist consistency	P3.1 ↔ P4.3	NONE	Consistent
3.4 — "Mock mode" framing rebutted but survives in P4.1 summary	P2.1 → P4.1	LOW-MEDIUM	P4.1 executive summary should replace "mock/broken mod" with "HTTP startup gating failure"
3.5 — Composite score change Gap #5 → STUB-06	P4.1 ↔ P4.3	NONE	Intentional, documented
3.6 — "4 phantom companies" in P4.1 vs corrected "3" in P4.3	P4.1 ↔ P4.3	LOW	P4.1 needs a correction note; P4.3 is correct

No blocking contradictions found. Three low-severity annotation gaps noted.

Section 4 — Final Verdict

Verdict: REWORK (minor)

The audit deliverables are substantially sound. All 5 re-run probes reproduced P3.1 findings. The fix backlog is correctly prioritized and the sequencing DAG is architecturally coherent. CEO can act on the Wave A items immediately.

However, two categories of rework are required before CEO consumption of the full backlog:

Category A — AC refinement (5 stubs, ≤30 min each):

MC-STUB-04: Split the "unloaded OR restored" OR-condition into separate per-plist ACs; replace 24h window with last-N-exit-code check.
MC-STUB-06: Rewrite the discover.js routing ACs to assert the specific agent returned (not just "non-empty result"); make count-diff AC self-contained with an explicit pre-fix baseline command.
MC-STUB-08: Replace the 5-min-wait behavioral AC with a point-in-time dispatch log check (e.g., log entry exists with today's date). Replace the 30-min cron-monitoring AC with a statement that a cron probe must be set up as a child task.
MC-STUB-10: Replace the literal Authorization: applicationKey:... placeholder with a credential retrieval command (bw get item ...); add a log-file existence pre-check before the grep assertion.
MC-STUB-12: Define the "postflight log" artifact path; specify whether task-postflight has a --dry-run invocation mode or define an alternative observable output.

Category B — Annotation fixes in P4.1 (≤15 min):

P4.1 executive summary: Replace "mock/broken mod" for pi-orchestrator with "HTTP port startup gating failure" to match P3.1 and P4.2 corrected findings.
P4.1 Gap #7: Add a footnote that P4.2 rebuttal revised the affected company count from 4 to 3 (Lexicon confirmed routable).
P4.1 architectural section: Clarify that "93K+ vectors" is the raw Qdrant embedding count across all collections, not the mem0 fact count (865 application-layer facts).

What CEO CAN act on immediately without rework:

Wave A tasks (STUB-01, STUB-03, STUB-09, STUB-10 partial) — their ACs are either PASS-rated or the WEAK issues do not affect Wave A execution.
CEO Decision Items 1-4 in Section 4 of P4.3 — these are architectural choices, not dependent on AC quality.
The overall gap prioritization and sequencing DAG — both are sound.

Evidence dir: /tmp/ai-factory-audit-2026-05-09/p5/
Validated docs: p3/3.1-health-matrix.md (sha256: f4af148add0d8ee7933da370126cbd90c9c024708d39847c35093e7551b1af98)
Validated docs: p4/4.3-fix-backlog.md (sha256: 48c4728559d9fe307d067e63fc7ccd3c3c68b83a56801e52aa65b565d630b307)

Produced by Angie Jones — Proveo 2026-05-09

Atomic-Claim Verification — AI Factory Audit Synthesis

Verifier: Verifier Agent (read-only)
Date: 2026-05-09
Source verified: 4.1-petter-synthesis.md
CLAIMS_SOURCE: spec:/tmp/ai-factory-audit-2026-05-09/p4/4.1-petter-synthesis.md

Atoms (one per claim)

A1: 62.5% of factory edges are dead or partial

Probe: Count LIVE / DEAD / PARTIAL from edge table in 2.1-connectivity-diagram.md Section E

Output:

Total edges inventoried: 40
LIVE:    15
DEAD:    15
PARTIAL: 10
DEAD + PARTIAL = 25 / 40 = 62.5%
(confirmed by 2.1 Summary Statistics table: "The factory has a 37.5% live edge rate.")

Verdict: PASS
Note: Math is exact. 25 dead or degraded edges out of 40 = 62.5%. The edge table in 2.1 is the audit's own source of truth; Petter's synthesis correctly reports its own source document.
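
The arithmetic behind this verdict can be reproduced from the three edge counts. A minimal sketch; the function name edge_rates is hypothetical and the counts are the ones in the probe output above.

```shell
#!/usr/bin/env bash
# edge_rates LIVE DEAD PARTIAL
# Prints the dead-or-partial share and the live share of all edges.
edge_rates() {
  local live=$1 dead=$2 partial=$3
  local total=$((live + dead + partial))
  local degraded=$((dead + partial))
  awk -v d="$degraded" -v t="$total" 'BEGIN {
    printf "%d/%d = %.1f%% dead or partial, %.1f%% live\n", d, t, 100 * d / t, 100 * (t - d) / t
  }'
}

edge_rates 15 15 10
# prints: 25/40 = 62.5% dead or partial, 37.5% live
```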

A2: "All actual dispatch is manual-John"

Probe:

grep -l "verify-fix-loop\|auto.dispatch\|Task(" ~/.claude/hooks/*.sh → no matches
launchctl list | grep "durable\|pi-orch" → pi-orchestrator PID 75750 running, durable-runner (orchestrator-bridge) PID 1185 running
tail -5 ~/system/logs/pi-orchestrator/daemon-stdout.log

Output:

[2026-05-09T19:31:19.216Z] [INFO] Starting PI orchestrator cycle (active: 0)
[2026-05-09T19:31:19.567Z] [DEBUG] No eligible tasks
[2026-05-09T19:31:19.601Z] [INFO] [IDLE] System idle — starting YouTube batch learning
grep "No eligible tasks" → 55,351 matches in daemon-stdout.log
No hook in ~/.claude/hooks/ calls Task() or verify-fix-loop.

Verdict: PASS
Note: The pi-orchestrator is live and cycling every 30s, but prints "No eligible tasks" continuously (55,351 such messages in the log). Port 8401 refuses connections (confirmed: lsof -i :8401 returns nothing). No hook fires auto-dispatch. Manual-John is the actual dispatch path.

A3: "CEO is the de-facto verifier for every task that reaches mc.js ready"

Probe: Read 2.2-verifier-autonomy.md verdict; cross-check P3.1 D1 correction; read CLAUDE.md Hard Constraint #4

Output:

2.2-verifier-autonomy.md: "Autonomy verdict: ABSENT"
P3.1 D1: "SKILL EXISTS at ~/.claude/skills/verify-fix-loop/SKILL.md. Skill is MANUAL-TRIGGER only."
2.2: "CEO is the de-facto verifier for every task that reaches mc.js ready"
4.2 rebuttal: "DISPUTED — Proveo (required gate) IS wired. verify-fix-loop is optional enhancement."
CLAUDE.md Hard Constraint #4: "Builder cannot say done. mc.js ready → Proveo → done."

Verdict: PASS — but with an important qualification
Note: The synthesis headline is accurate in its core claim (no auto-invocation of verify-fix-loop), but the 4.2 devil's advocate correctly shows that it overstates the situation. Proveo/Angie Jones IS the mandatory gate and it IS wired via /task-postflight. However, /task-postflight is itself invoked manually and only for H tasks (per 2.1 Edge #12: "Manual CLI invocation. H-tasks only"), so the CEO-as-verifier pattern holds for every task that does NOT go through /task-postflight, which is the majority. The synthesis is accurate, but 4.2's correction is also valid and the synthesis does not incorporate it.

A4: "5 deleted scripts, plists still scheduled"

Probe: Check each script on disk; check each plist in launchctl

Output:

MISSING: pi-orch-health.sh (~/system/tools/)
MISSING: cost-daily-report.sh (~/system/tools/)
MISSING: daily-planning.sh (~/system/tools/)
MISSING: legal-docs-azure-sync.sh (~/system/daemons/)
MISSING: mcp-health-check.sh (~/system/tools/)

launchctl status:
LOADED: com.alai.pi-orch-health        → exit 127
LOADED: com.alai.cost-daily-report     → exit 127
LOADED: com.alai.daily-planning        → exit 127
LOADED: com.john.legal-docs-azure-sync → exit 127
LOADED: com.john.mcp-health-check      → exit 127

Verdict: PASS
Note: All 5 scripts confirmed missing on disk. All 5 plists confirmed loaded in launchctl with exit 127. Petter's claim is exactly correct.
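
The on-disk half of this probe can be scripted so that it emits the same MISSING lines shown above. A minimal sketch; the helper name report_missing is hypothetical, and the example paths are the ones listed in the probe output.

```shell
#!/usr/bin/env bash
# Print "MISSING: <path>" for every argument that is not a regular file.
# Returns non-zero if at least one path is missing.
report_missing() {
  local rc=0 path
  for path in "$@"; do
    if [ ! -f "$path" ]; then
      echo "MISSING: $path"
      rc=1
    fi
  done
  return $rc
}

# Example:
#   report_missing ~/system/tools/pi-orch-health.sh ~/system/tools/cost-daily-report.sh
```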

A5: "RAG queue 454 with 16d-stale metric"

Probe: cat ~/system/state/rag-drain.prom (mtime + content); sqlite3 -readonly ~/system/state/ingest-queue.sqlite "SELECT COUNT(*) FROM ingest_queue;"

Output:

rag-drain.prom:
  mtime: 2026-04-23 17:59 (16 days stale — CONFIRMED)
  alai_ingest_queue_depth_total: 454 (this is the stale snapshot)

ingest_queue SQLite (live):
  SELECT COUNT(*) → 3,150 rows total
  bookstack: 1703 + 48 = 1751 (duplicate sources — different status?)
  evidence: 372 + 58 = 430
  mc-outcomes: 44 + 10 + 71 = 125
  specs: 636 + 102 = 738
  rules: 80
  manual: 2

Verdict: FAIL
Note: The "454" figure is from a 16-day-stale prometheus file — that part is accurate. But the live SQLite shows 3,150 queued items, not 454. The actual queue depth is ~7x worse than the synthesis states. The synthesis (following P3.1 H1) correctly flags the staleness of the metric, but then quotes the stale 454 figure as if it is the actual state. The real state is a 3,150-item frozen queue. The synthesis should have noted the true live count or stated "actual count unknown; stale metric shows 454 as lower bound." This is a significant understatement of severity.
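
The figures in this atom can be cross-checked with plain arithmetic. The sketch below uses only the numbers from the probe output; the function name queue_summary is hypothetical. Note that the listed per-source rows sum to 3,126, which is 24 fewer than the reported COUNT(*) of 3,150. The probe output does not explain the difference, so the breakdown may omit some sources or statuses.

```shell
#!/usr/bin/env bash
# Recompute the live-queue totals from the per-source rows in the probe output
# and compare them with the reported COUNT(*) and the stale metric.
queue_summary() {
  local bookstack=$((1703 + 48))
  local evidence=$((372 + 58))
  local mc_outcomes=$((44 + 10 + 71))
  local specs=$((636 + 102))
  local rules=80 manual=2
  local listed=$((bookstack + evidence + mc_outcomes + specs + rules + manual))
  local reported=3150 stale=454
  echo "listed sources sum: $listed"
  echo "unaccounted vs COUNT(*): $((reported - listed))"
  awk -v live="$reported" -v old="$stale" 'BEGIN { printf "live/stale ratio: %.1fx\n", live / old }'
}

queue_summary
# prints:
#   listed sources sum: 3126
#   unaccounted vs COUNT(*): 24
#   live/stale ratio: 6.9x
```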

A6: Petter's top-3 gaps listed, then fresh-probed

Probe: From synthesis Section 1 "5 najkritičnijih praznina" — top-3 are: (1) RAG ingest pipeline blocked, (2) pi-orchestrator in mock/broken mode, (3) Verifier loop capable but not called. Fresh probe each.

Output:

Gap 1 — RAG ingest pipeline:
  ingest_queue SQLite = 3,150 items (live). drain-worker crashing (HiveMind #64900 exit 256 today).
  LightRAG health: 3.1 A2 shows healthy (curl localhost:9621 → 200). Blocker = Vaultwarden auth.
  STATUS: CONFIRMED AND WORSE THAN STATED (3,150 not 454)

Gap 2 — pi-orchestrator:
  PID 75750 alive. Port 8401: lsof -i :8401 → NOTHING (dead).
  Log tail: "No eligible tasks" — 55,351 occurrences.
  offlineMode reference found in pi-orchestrator.js (5 matches incl. "offlineMode: true" in config).
  Port 3052: lsof -i :3052 → node PID 1185 LISTENING (durable-runner alive).
  launchctl: com.alai.orchestrator-bridge PID 1185, exit 0.
  STATUS: CONFIRMED — HTTP dead, durable-runner live but not dispatching.

Gap 3 — Verifier loop:
  ~/.claude/skills/verify-fix-loop/SKILL.md EXISTS.
  No hook in ~/.claude/hooks/ calls it (grep returns no matches).
  No daemon with verify-fix-loop call found.
  STATUS: CONFIRMED — capability exists, zero auto-invocation.

Verdict: PASS (top-3 gaps confirmed by fresh probes; RAG figure is understated but the gap itself is real)

A7: "37 unmapped agents" vs "42 unmapped agents" — which count is in the synthesis?

Probe: grep "37\|42" 4.1-petter-synthesis.md | grep -i "unmapped\|agent" → no results. Read Section 2 table entry for Agent fleet.

Output:

4.1-petter-synthesis.md Section 2 Agent fleet row:
  "44% mapping coverage (29/66). validator (44 skill refs) and distiller (21 refs) absent
   from mapping. 7 mapped agents unreachable on disk. 4 companies invisible to routing.
   35 chains have no executor."

The synthesis does NOT quote "37 unmapped" or "42 unmapped" as a standalone number.
P1.3 (1.3-agent-fleet.md) explicitly states: "42 unmapped agents" and breaks down to
  11 ORPHAN + 11 DUPLICATE + 20 NEEDS-MAPPING = 42.
The prior "37 unmapped" figure appears in the audit brief question but is NOT in P1.3 text.

Verdict: PASS — the synthesis avoids quoting a specific unmapped count; it uses "44% mapping coverage (29/66)" instead, which is accurate (66 - 29 = 37 unmapped, but P1.3 corrects this to 42 because 7 mapped agents are also missing from disk, so the "reachable" count is lower). The synthesis does not contain the discrepant number — the A7 atom is about consistency, and the synthesis is consistent (it omits the count rather than stating it).
Note: P1.3's 42 figure counts agents in ~/.claude/agents/ not in specialist-mapping.json. The synthesis's choice to use "44%" coverage is the safer framing. No inconsistency to report.

A8: "All 35 chain YAMLs are dead"

Probe: ls ~/system/tools/chain-runner.sh, ls ~/system/tools/chain-runner.js, check if chain-runner is invoked by any daemon or skill

Output:

chain-runner.js EXISTS: ~/system/tools/chain-runner.js (31208 bytes, 2026-02-26)
  Header: "YAML-defined agent chain orchestrator / Runs declarative agent chains 
           defined in ~/system/agents/chains/*.yaml"
  CLI: node chain-runner.js run <chain-name> / resume / list / show

chain-runner.sh EXISTS: ~/system/tools/chain-runner.sh (9281 bytes, 2026-05-07)
  Header: "Pillar #5 stateless skill-chain runner (one step per tick)"
  This is what com.alai.chain-daily-inbox calls.

grep "chain-runner" ~/.claude/skills/ → NO MATCHES (in non-archived skills)
grep "chain-runner" ~/system/daemons/ → NO MATCHES
launchctl: com.alai.chain-daily-inbox (exit 1, not running)
            com.alai.chain-e2e-nightly (exit 1)
            com.alai.chain-phantom-detector (exit 1)

Verdict: FAIL
Note: The synthesis claims "35 chain YAML files without a single executor" but chain-runner.js IS a functional chain executor (31KB, CLI-complete, linked to MC #1902). chain-runner.sh is a second runner (Pillar #5). The 1.3-agent-fleet.md also acknowledges chain-runner.sh exists ("com.alai.chain-daily-inbox: failure likely in downstream chain execution"). The chain-runner EXISTS — it is just (a) currently broken/unused due to downstream failures, and (b) not invoked from any active skill. The claim "no chain runner exists" is factually false; the correct claim is "chain runners exist but are broken or un-invoked." This is a meaningful distinction: fixing chains requires fixing the runners' downstream dependencies, not building a runner from scratch.

A9: "pi-orch HTTP dead but durable-runner port 3052 is the dispatch path"

Probe: lsof -i :8401, lsof -i :3052, launchctl list | grep "durable\|orchestrator"

Output:

lsof -i :8401 → NO OUTPUT (port 8401 not listening — confirmed dead)
lsof -i :3052 → node PID 1185 LISTENING on *:apc-3052
launchctl:
  1185  0  com.alai.orchestrator-bridge    (PID alive, exit 0)
  1212  0  com.john.durable-executor       (PID 1212, exit 0)
  75750 0  com.john.pi-orchestrator        (PID alive, exit 0)
  -     0  com.john.orchestrator-http      (down_exit_0: duplicate)

Verdict: PASS
Note: Port 8401 confirmed dead. Port 3052 confirmed live (node PID 1185, 20-day uptime per P3.1). The synthesis's claim that durable-runner is the active dispatch path is confirmed structurally. However, P3.1 C1 and 4.2 Gap #2 both note that even the durable-runner shows no dispatch activity post-2026-03-19 — the pi-orchestrator log confirms "No eligible tasks" cycling. So "durable-runner is the dispatch path" is confirmed as the structural path, but it is also idle. The synthesis correctly notes dispatch is unclear via this path; 4.2 appropriately flags this ambiguity.

A10: DISMISSED gaps — are they actually dismissable?

Probe: Read 4.2 devils advocate dismissal reasoning for mem0 wire and verify-fix-loop; re-check CLAUDE.md for mem0 SoR designation

Output:

mem0 SoR dismissal (4.2 Gap #4):
  grep -i "mem0" ~/.claude/CLAUDE.md → 0 matches (confirmed by 4.2)
  grep -i "System of Record\|SoR" ~/.claude/CLAUDE.md → 0 matches
  4.2 reasoning: ".md + LightRAG is INTENDED design; mem0 was never designated SoR"
  Evidence: lightrag-auto-ingest.sh hook explicitly routes .md → LightRAG (P1.1)
  Verdict on dismissal: SOUND — mem0 SoR gap is a false positive. CLAUDE.md never
    designated mem0 as SoR. The .md pipeline is the designed path.

verify-fix-loop dismissal (4.2 Gap #3 downgraded to feature request):
  CLAUDE.md Hard Constraint #4: "mc.js ready → Proveo verification → done"
  Proveo IS wired via task-postflight (P2.2 confirms).
  verify-fix-loop is OPTIONAL enhancement, not required gate.
  4.2 reasoning: "The REQUIRED verification gate (Proveo) IS wired and working."
  Verdict on dismissal: SOUND — the required gate exists. CEO-as-verifier claim is
    overstated because Proveo gate IS the designed verifier; it's just H-tasks only
    and manual-invoked (per 2.1 Edge #12 PARTIAL). The dismissal is correct that
    verify-fix-loop is not a gap in required functionality.

Phantom companies dismissal of Lexicon (4.2 Gap #7):
  grep "Lexicon\|lexicon" ~/system/agents/specialist-mapping.json → NO OUTPUT
  This contradicts 4.2's claim that "Lexicon IS in specialist-mapping.json."
  4.2 states: "I found 'company: Lexicon' in the mapping with Dževad Jahić."
  Live grep returns nothing. P1.3 confirms: "skillforge.md maps to 'Skillforge' not Lexicon."
  Verdict: 4.2's Lexicon dismissal ERRS. Lexicon is NOT routable via specialist-mapping.json.
    The 4 phantom companies remain 4, not 3 as 4.2 claims. 4.2 hallucinated a Lexicon entry.

Verdict: PARTIAL FAIL — mem0 and verify-fix-loop dismissals are sound, but the Lexicon phantom-company dismissal is WRONG (4.2 claims Lexicon is mapped; live grep shows it is not).
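
The Lexicon re-check above is a case-insensitive text search, and it can be repeated for all four phantom companies in one pass. This is a minimal sketch under stated assumptions: the schema of specialist-mapping.json is not shown in this report, so the code searches every key and string value recursively instead of assuming a `company` field. The function names `mentions` and `routable_companies` are hypothetical.

```python
import json
from pathlib import Path
from typing import Any


def mentions(node: Any, needle: str) -> bool:
    """Case-insensitive substring search across all keys and string values."""
    needle = needle.lower()
    if isinstance(node, dict):
        return any(
            needle in str(key).lower() or mentions(value, needle)
            for key, value in node.items()
        )
    if isinstance(node, list):
        return any(mentions(item, needle) for item in node)
    return isinstance(node, str) and needle in node.lower()


def routable_companies(mapping_path: Path, companies: list[str]) -> dict[str, bool]:
    """Map each company name to whether it appears anywhere in the mapping file."""
    data = json.loads(mapping_path.expanduser().read_text(encoding="utf-8"))
    return {name: mentions(data, name) for name in companies}


# Example call (path and company names as given in this report):
# routable_companies(
#     Path("~/system/agents/specialist-mapping.json"),
#     ["Axiom", "Datavera", "Resolver", "Lexicon"],
# )
```

A result of False for a company corresponds to the "NO OUTPUT" grep above. A result of True only shows that the name occurs somewhere in the file, not that a routing entry is valid.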

Confidence Grade

FEEDBACK: three atoms FAILED with concrete evidence (A5: queue depth understated, 454 vs 3,150; A8: chain-runner.js and chain-runner.sh DO exist; A10: the Lexicon phantom-company dismissal in 4.2 is wrong).

Summary

Atoms passed: 7 / 10
Atoms failed: 3 (A5, A8, A10-Lexicon)
Confidence: FEEDBACK
Feedback file written: /tmp/verifier-feedback-ai-factory-audit.md
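
The summary counts can be derived from per-atom verdicts, which keeps the headline and the tally in agreement. The sketch below is illustrative only. It assumes the atoms are numbered A1 through A10 and that every atom not named as failed passed, which is inferred from the 7 / 10 figure. The grading rule (FEEDBACK when any atom fails) and the `PASS` label are hypothetical, not the verifier's documented logic.

```python
def summarize(verdicts: dict[str, bool]) -> dict:
    """Tally atom verdicts; keys are atom IDs, values are True for a pass."""
    failed = [atom for atom, ok in verdicts.items() if not ok]
    return {
        "passed": len(verdicts) - len(failed),
        "total": len(verdicts),
        "failed": failed,
        # Assumed rule: any failed atom downgrades the grade to FEEDBACK.
        "confidence": "FEEDBACK" if failed else "PASS",
    }


# Verdicts inferred from this report: A5, A8 and A10 failed, the rest passed.
REPORTED = {f"A{i}": i not in (5, 8, 10) for i in range(1, 11)}

if __name__ == "__main__":
    s = summarize(REPORTED)
    print(f"Atoms passed: {s['passed']} / {s['total']}")
    print(f"Atoms failed: {len(s['failed'])} ({', '.join(s['failed'])})")
    print(f"Confidence: {s['confidence']}")
```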
