Verifier Autonomy Audit
AI Factory Audit — Plan Task 2.2: Verifier Autonomy
Date: 2026-05-09
Auditor: Martin Kleppmann (CodeCraft)
Classification: AUDIT-ONLY (read-only, no mutation, no live invocation)
VERDICT SUMMARY (up front)
Autonomy verdict: ABSENT
The /verify-fix-loop skill is fully specified and internally consistent, but it has zero wiring into any automated trigger path. CEO is the de-facto verifier for every task that reaches mc.js ready. The skill exists only as a manually-invoked slash command.
1. End-to-End Trace of /verify-fix-loop
Source: ~/.claude/skills/verify-fix-loop/SKILL.md
Flow map
Caller (John / human) invokes: /verify-fix-loop mc_id=<N> spec_path=<path>
│
▼
SKILL orchestrates in main conversation thread (not a sub-agent itself)
│
├─ mkdir -p /tmp/verify-fix-loop-<mc_id>/ (EVIDENCE_DIR)
│
▼
LOOP (max 3 iterations):
│
├─ Step A: Task(subagent_type=verifier OR general-purpose+persona)
│ prompt = verifier brief template (inline in SKILL.md)
│ verifier writes: EVIDENCE_DIR/verifier-loop<N>.md (mandatory)
│ /tmp/verifier-feedback-<mc_id>.md (if CONFIDENCE=FEEDBACK)
│
├─ Step B: Parse STATUS + CONFIDENCE from verifier output
│
├─ Step C: Branch
│ PERFECT / VERIFIED → write SUMMARY.md (SUCCESS), exit
│ PARTIAL → if high_stakes: ESCALATE; else: SUCCESS_WITH_NOTES, exit
│ FAILED → ESCALATE (harness broken)
│ FEEDBACK:
│ if high_stakes or budget exhausted → ESCALATE
│ else →
│
├─ Step D: Task(subagent_type=fix-builder OR general-purpose+persona)
│ reads /tmp/verifier-feedback-<mc_id>.md
│ applies prescribed edits to spec_path via Edit tool
│ returns APPLIED:<N> / PARTIAL:<N>/<M> / COULD_NOT_APPLY:<reason>
│
└─ LOOP_INDEX += 1 → back to Step A
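The three fix-builder return forms in Step D are regular enough to parse mechanically. The following is a minimal sketch, assuming the result arrives as a single line in exactly the form shown in the flow map; the function name `parseFixBuilderResult` and the `UNPARSEABLE` fallback are hypothetical and do not come from SKILL.md.

```javascript
// Hypothetical parser for the fix-builder return forms named in Step D.
// Assumption: the result is one line, e.g. "APPLIED:3" or "PARTIAL:2/5".
function parseFixBuilderResult(line) {
  const text = String(line).trim();
  let m;
  if ((m = /^APPLIED:(\d+)$/.exec(text))) {
    return { kind: "APPLIED", applied: Number(m[1]) };
  }
  if ((m = /^PARTIAL:(\d+)\/(\d+)$/.exec(text))) {
    return { kind: "PARTIAL", applied: Number(m[1]), total: Number(m[2]) };
  }
  if ((m = /^COULD_NOT_APPLY:(.*)$/s.exec(text))) {
    return { kind: "COULD_NOT_APPLY", reason: m[1].trim() };
  }
  // Anything else is surfaced rather than guessed at.
  return { kind: "UNPARSEABLE", raw: text };
}
```

Returning an explicit `UNPARSEABLE` kind, instead of throwing, lets the caller treat malformed output the same way as any other escalation case.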
Domain escalation policy
- docs, system, refactor, polish: loops up to MAX_LOOPS (default 3)
- security, finance, legal, deploy, infra, unknown: ESCALATE on first FEEDBACK (no autonomous correction)
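The policy amounts to an allowlist of domains that may self-correct, with everything else escalating. A minimal sketch follows; the names `AUTONOMOUS_DOMAINS` and `isHighStakes` are hypothetical, and treating any unlisted or missing domain the same as `unknown` is an assumption, chosen because it fails closed.

```javascript
// Domains allowed to loop autonomously, per the escalation policy above.
const AUTONOMOUS_DOMAINS = new Set(["docs", "system", "refactor", "polish"]);

// Assumption: a missing or unrecognised domain is treated like "unknown",
// which the policy lists as high-stakes.
function isHighStakes(domain) {
  const key = String(domain ?? "unknown").trim().toLowerCase();
  return !AUTONOMOUS_DOMAINS.has(key);
}

console.log(isHighStakes("docs"));     // false
console.log(isHighStakes("security")); // true
console.log(isHighStakes(undefined));  // true
```

Checking membership in the low-stakes set, not the high-stakes one, means a newly introduced domain name escalates by default until someone adds it deliberately.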
Loop budget
- Default MAX_LOOPS = 3
- Hard cost cap: $5 per skill invocation
- Per-loop cost estimate: $0.40–0.60 (Sonnet)
- Worst case: 3 × $0.60 = $1.80
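The budget figures can be checked with a few lines of arithmetic. This sketch works in integer cents to avoid floating-point rounding (in JavaScript, `3 * 0.6` is not exactly `1.8`); the constant and function names are hypothetical.

```javascript
// Figures from the loop budget above, expressed in cents.
const COST_CAP_CENTS = 500;                      // $5 hard cap per invocation
const PER_LOOP_CENTS = { low: 40, high: 60 };    // $0.40 to $0.60 per loop

// Worst-case spend for a given loop count, using the high estimate.
function worstCaseCents(maxLoops) {
  return maxLoops * PER_LOOP_CENTS.high;
}

// How many worst-case loops fit under the hard cap.
function loopsWithinCap() {
  return Math.floor(COST_CAP_CENTS / PER_LOOP_CENTS.high);
}

console.log((worstCaseCents(3) / 100).toFixed(2)); // "1.80"
console.log(loopsWithinCap());                     // 8
```

At the stated per-loop estimate, the $5 cap would only bind if MAX_LOOPS were raised above 8 or if actual per-loop cost ran well above $0.60, so with the default of 3 the loop count is the effective limit.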
Termination conditions
- CONFIDENCE in {PERFECT, VERIFIED} → SUCCESS
- CONFIDENCE == PARTIAL + not high_stakes → SUCCESS_WITH_NOTES
- Budget exhausted (LOOP_INDEX == MAX_LOOPS with FEEDBACK) → ESCALATE
- High-stakes domain with FEEDBACK on first iteration → ESCALATE
- Any FAILED confidence → ESCALATE (harness broken)
- fix-builder returns COULD_NOT_APPLY → ESCALATE
- MC status changes to done/cancelled mid-loop → ABORT silently
- Cost estimate exceeds $5 → ESCALATE before next iter
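Taken together, the termination conditions form a small decision table. The sketch below encodes them as one function; the names `decideNext` and `afterFixBuilder`, the action strings `DISPATCH_FIX_BUILDER` and `REVERIFY`, and the reading of LOOP_INDEX as 1-based are all assumptions made for illustration, not the skill's actual implementation.

```javascript
const COST_CAP_USD = 5;

// Hypothetical encoding of the termination conditions listed above.
// Assumption: loopIndex is 1-based, so loopIndex === maxLoops is the last pass.
function decideNext({
  confidence,
  highStakes,
  loopIndex,
  maxLoops = 3,
  mcStatus = "review",
  estimatedCostUsd = 0,
}) {
  // The MC item was closed or cancelled while the loop was running.
  if (mcStatus === "done" || mcStatus === "cancelled") return "ABORT";
  switch (confidence) {
    case "PERFECT":
    case "VERIFIED":
      return "SUCCESS";
    case "PARTIAL":
      return highStakes ? "ESCALATE" : "SUCCESS_WITH_NOTES";
    case "FEEDBACK":
      if (highStakes) return "ESCALATE";                      // no autonomous correction
      if (loopIndex >= maxLoops) return "ESCALATE";           // loop budget exhausted
      if (estimatedCostUsd > COST_CAP_USD) return "ESCALATE"; // cost cap before next iter
      return "DISPATCH_FIX_BUILDER";
    default:
      return "ESCALATE"; // FAILED, or a value this sketch does not recognise
  }
}

// After Step D: only COULD_NOT_APPLY is documented as terminal.
// Assumption: every other fix-builder result goes back to the verifier.
function afterFixBuilder(fixKind) {
  return fixKind === "COULD_NOT_APPLY" ? "ESCALATE" : "REVERIFY";
}
```

Placing the cost check inside the FEEDBACK branch reflects the wording "before next iter": a run that has already reached PERFECT or VERIFIED is not escalated merely because the estimate crossed the cap.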
Entry points (who can call this)
The SKILL.md lists trigger phrases: "verify-fix-loop", "auto-verify and fix", "verifier loop", "ne idi preko mene", "loop until pass". All trigger phrases are designed for human invocation in a conversation. No programmatic entry points exist.
2. Auto-Invocation Analysis — The Central CEO Question
pi-orchestrator.js
Grep result: ZERO matches for verify-fix-loop, verifier, fix-builder in ~/system/kernel/pi-orchestrator.js.
The orchestrator's post-completion flow (reportCompletion function, lines ~3781–3930) does:
- Hallucination detection (regex-based detectHallucination)
- Proof-of-work check (GOTCHA file or response length)
- qa-19 Check #20 (endpoint verification, if configured)
- Postflight marker write to ~/system/state/postflight-cleared-<id>.json
None of these steps call the verifier, fix-builder, or verify-fix-loop skill.
The "postflight" referenced in pi-orchestrator is a file marker write, NOT the /task-postflight skill.
task-postflight skill
Grep result: ZERO matches for verify-fix-loop, verifier, fix-builder in ~/.claude/skills/task-postflight/SKILL.md.
The /task-postflight skill dispatches Angie Jones (Proveo) for AC-checklist QA, not the atomic-claim verifier. These are parallel, non-overlapping verification patterns:
- Proveo = human-readable AC checklist with pass/fail verdicts per item
- Verifier = atomic claim decomposition with machine-verified proof citations
Hooks directory
Grep result: Only archive files matched. No active hook in ~/.claude/hooks/ references verify-fix-loop, verifier, or fix-builder.
Active hooks audited:
- liveness-claim-validator.sh: PostToolUse on Write/Edit; checks for bare liveness claims in memory/spec/agent files. Not related to verifier dispatch.
- mc-ready-gate.sh: wrapper for mc.js ready; runs ZAKON #30 direct-probe gate + evidence-contract-validator. Does NOT invoke verify-fix-loop.
- evidence-contract-validator.sh: validates verdict JSON schema + sha256 chain. Shell-based, no agent dispatch.
- cross-session-claim-gate.sh, session-task-lock-gate.sh, plan-completeness-gate.sh, pre-dispatch-gate.sh: none reference verifier.
Daemon fleet
Grep result: ZERO matches for verify-fix-loop, verifier, fix-builder in ~/system/daemons/.
LaunchAgents
Grep result: ZERO matches in ~/Library/LaunchAgents/.
VERDICT: ABSENT
The verify-fix-loop and its constituent agents (verifier, fix-builder) have zero automated entry points. The only invocation path is a human typing a trigger phrase in a Claude Code conversation. CEO is always in the loop because there is no loop without CEO.
3. Tool-Surface Security Check
Verifier (read-only)
Definition file: ~/.claude/agents/verifier.md
Declared tools: tools: Read, Grep, Glob, Bash
The tools: field includes Bash. This is the critical point.
The declared tool list omits Write, Edit, and Task, but that omission does not make the agent read-only at the API level: with Bash available, any file can still be written through the shell. Read-only behavior relies entirely on prompt-level enforcement, and verifier.md states this explicitly ("Enforcement is prompt-only — this rule is yours to honor. You are the gatekeeper.").
Permitted Bash commands (per prompt whitelist in verifier.md):
- cat, head, tail, wc, ls, file, stat
- diff, git read-only subcommands
- grep, rg, find (via tool preferred)
- jq, node -e (read-only expression)
- node ~/system/tools/mc.js show (read-only subcommands only — NEVER add|start|done|ready|update|pause|cancel)
- gh pr view, gh issue view, gh api -X GET
- sqlite3 -readonly, psql SELECT only
- curl -sI (HEAD), curl -s GET (never POST/PUT/DELETE)
- bash -n, shellcheck, node --check (dry-run linters)
Escape paths documented:
- The prompt says "NEVER run: rm, mv, cp (to non-/tmp/), chmod, chown, ln" and "Redirections that write outside /tmp/verifier-* or /tmp/<task_id>-evidence/: >, >>, tee to other paths".
- This is prompt-level enforcement only. A model that deviated from the prompt could still run bash -c "echo foo > ~/system/some-file.txt"; the agent framework does not block it at the API tool-call level.
- The tools: Bash declaration gives the agent full shell access; the prompt whitelist is self-enforced.
- Feedback file writes are permitted to /tmp/verifier-feedback-<TASK_ID>.md specifically.
Verdict on verifier tool isolation: Prompt-enforced, not API-enforced. Read-only is a behavioral constraint, not a structural constraint. The risk is manageable for a trusted model, but nothing outside the model bounds it.
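To illustrate what a check outside the model could look like, here is a deliberately naive sketch of a command screen based on the prompt whitelist. Everything in it is hypothetical: the function name `checkVerifierCommand`, the choice of which whitelisted commands to include, and the idea that something (for example a hook that sees Bash tool input before execution) would call it. It is not a shell parser and not a security boundary; `find -exec`, subcommand rules for git, curl, and mc.js, and quoted metacharacters are all unhandled.

```javascript
// Subset of the command names from the prompt whitelist, for illustration.
const ALLOWED_COMMANDS = new Set([
  "cat", "head", "tail", "wc", "ls", "file", "stat",
  "diff", "grep", "rg", "find", "jq", "shellcheck",
]);
// Write targets the prompt permits; /tmp/<task_id>-evidence/ is omitted for brevity.
const ALLOWED_WRITE_PREFIXES = ["/tmp/verifier-"];

function checkVerifierCommand(command) {
  // Refuse constructs this naive screen cannot reason about.
  if (/`|\$\(/.test(command)) {
    return { allowed: false, reason: "command substitution not analysed" };
  }
  // Every redirection target must sit under an allowed prefix.
  for (const m of command.matchAll(/>>?\s*([^\s;|&]+)/g)) {
    const target = m[1];
    if (!ALLOWED_WRITE_PREFIXES.some((p) => target.startsWith(p))) {
      return { allowed: false, reason: `redirect outside allowed paths: ${target}` };
    }
  }
  // Each pipeline or list segment must start with an allowlisted command.
  for (const segment of command.split(/\|\||&&|[|;]/)) {
    const name = segment.trim().split(/\s+/)[0];
    if (name && !ALLOWED_COMMANDS.has(name)) {
      return { allowed: false, reason: `command not allowlisted: ${name}` };
    }
  }
  return { allowed: true, reason: "ok" };
}

console.log(checkVerifierCommand('bash -c "echo foo > ~/system/some-file.txt"').allowed); // false
console.log(checkVerifierCommand("cat ~/.claude/agents/verifier.md | head -n 5").allowed); // true
```

The screen errs toward rejection (a quoted `|` inside a grep pattern is refused, for instance), which is the safer failure direction but would need a real tokenizer before it could be used without frustrating legitimate probes.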
Fix-builder (edit-only, scoped)
Definition file: ~/.claude/agents/fix-builder.md
Declared tools: tools: Read, Edit, Grep, Glob
The fix-builder tool list explicitly excludes:
- Write (no new file creation)
- Bash (no test runs, deploys, builds, git ops)
- Task (no further dispatch)
This is stronger isolation than the verifier: the tools: field at the agent definition level excludes Bash and Write. If the agent framework enforces declared tools as a whitelist, fix-builder genuinely cannot run shell commands or create new files. It can only read existing files (Read, Grep, Glob) and apply edits to existing files (Edit).
Gap: Fix-builder cannot create new files even when feedback prescribes it. The skill handles this: "If the feedback prescribes creating a new file, mark that fix as COULD_NOT_APPLY" — the loop escalates. This is a by-design limitation, not a bug.
Verdict on fix-builder tool isolation: Structurally scoped (Bash and Write excluded from tools declaration). This is the correct pattern. The verifier should be refactored to match this approach.
4. Synthetic Dry-Trace
Selected task: MC #99389 — "Refactor /mehanik skill to progressive-disclosure pattern" (status: review, owner: pi-orchestrator)
This task was marked mc.js ready (now review) after pi-orchestrator completed it.
What WOULD have happened if /verify-fix-loop were auto-invoked:
Step 0: trigger fired when pi-orchestrator called mc.js ready #99389
→ /verify-fix-loop mc_id=99389 spec_path=~/.claude/skills/mehanik/SKILL.md
domain=docs (inferred from skill file path)
max_loops=3
Step A (iter 1): dispatch verifier
- verifier reads ~/.claude/skills/mehanik/SKILL.md
- verifier reads MC #99389 ACs via mc.js show 99389
- verifier decomposes ACs into atomic claims:
(a) SKILL.md exists and is < N lines (tier-1 constraint)
(b) references/agent-brief.md exists
(c) references/failure-modes.md exists
(d) Skill tool callable post-refactor
- verifier probes each atom with Read/Glob/Bash
Step B: parse CONFIDENCE
If all files exist and SKILL.md is within limits → PERFECT → SUCCESS
If any reference file missing → FEEDBACK
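Step D (if FEEDBACK): dispatch fix-builder
- fix-builder reads /tmp/verifier-feedback-99389.md
- applies Edit to add missing sections or correct line counts in existing files
- a missing reference file cannot be created (fix-builder has no Write tool) → COULD_NOT_APPLY → ESCALATE
Step A (iter 2): re-verify → likely PERFECT → Step C: write SUMMARY.md → SUCCESS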
Actual closure path used for MC #99389:
The task is in review status. Looking at the review queue (25+ tasks in review), there is no evidence of verifier invocation. The closure path was: pi-orchestrator marked ready → task sits in review queue → CEO/John is the implicit reviewer. This is the CEO-as-verifier pattern the CEO wants to eliminate.
5. Comparison with Existing Patterns
liveness-claim-validator.sh
- Trigger: PostToolUse hook, fires on every Write/Edit/MultiEdit tool call
- Scope: Memory files, spec files, agent definition files matching 4 path patterns
- Mechanism: Shell script reads tool input JSON from stdin, scans written content for bare liveness claims, blocks write if violations found (exit 2)
- Auto-invoked: YES, unconditionally, at the Claude Code hook level
- Why verify-fix-loop is NOT similarly hooked: The liveness validator is a passive scan that reads content already being written. The verify-fix-loop requires active agent dispatch (spawning sub-agents), which cannot be done from a shell hook. Shell hooks can block tool calls; they cannot spawn conversational agents.
This is the fundamental architectural gap: hooks can intercept tool calls synchronously, but spinning up a verify-fix-loop requires an async agent conversation that the hook system cannot initiate.
evidence-verifier agent
File: ~/.claude/agents/evidence-verifier.md
Declared tools: (not in scope of this read — but confirmed the agent exists)
Auto-invoked: NO for the agent itself; YES for the related shell check. mc-ready-gate.sh calls evidence-contract-validator.sh at mc.js ready time. That script is pure shell: it validates JSON schema + file hashes deterministically (no LLM) and does NOT dispatch the evidence-verifier agent. The agent definition exists for manual invocation.
Pattern difference: The evidence-verifier pattern uses a shell script as the auto-invoke layer (deterministic, no LLM), with the agent definition as a fallback for edge cases. The verify-fix-loop requires LLM reasoning at every step, making shell-script auto-invocation insufficient.
6. Gap Analysis and Fix Proposal (Audit-Level Only)
Root cause of the gap
The verify-fix-loop was designed top-down as a skill (manual invocation). The liveness-claim-validator was designed bottom-up as a hook (automatic). There is no bridge layer that translates "mc.js ready event" → "spawn verify-fix-loop conversation".
The missing component is a postflight agent dispatcher: something that observes the ready event and spawns a verify-fix-loop session as a sub-agent task.
Minimum wiring needed
Option A: PostToolUse hook on mc.js ready
| Element | Detail |
|---|---|
| File to modify | ~/.claude/hooks/mc-ready-gate.sh (already fires on mc.js ready) |
| Addition location | After line 196 (all gates passed — currently execs mc.js directly) |
| Trigger | After mc.js ready succeeds, spawn verify-fix-loop as a background Task |
| Mechanism | mc-ready-gate.sh would write a trigger file to /tmp/vfl-trigger-<mc_id>.json containing mc_id + spec_path + domain; a daemon polls this file |
The problem: mc-ready-gate.sh is a synchronous shell script. It cannot spawn a conversational agent (Task dispatch requires a running Claude Code session). It can only write a file.
Option B: pi-orchestrator.js postflight hook (most natural wiring point)
| Element | Detail |
|---|---|
| File to modify | ~/system/kernel/pi-orchestrator.js |
| Addition location | Inside reportCompletion() function, after line ~3900 (after QA gate passes) |
| What to add | A call to write /tmp/vfl-trigger-<task_id>.json with task metadata |
| Trigger | The daemon below polls this and dispatches |
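The trigger file named in Options A and B could be as small as the three fields listed under Option A. A minimal sketch of building it follows; the function name `buildVflTrigger` and the input validation are hypothetical, and the function only returns the path and body so that the caller decides how to write them.

```javascript
// Hypothetical builder for the trigger file described in Options A and B.
// Field names (mc_id, spec_path, domain) follow the Option A table.
function buildVflTrigger({ mcId, specPath, domain = "unknown" }) {
  if (!Number.isInteger(mcId) || mcId <= 0) {
    throw new TypeError("mcId must be a positive integer");
  }
  if (typeof specPath !== "string" || specPath.length === 0) {
    throw new TypeError("specPath must be a non-empty string");
  }
  return {
    path: `/tmp/vfl-trigger-${mcId}.json`,
    body: JSON.stringify({ mc_id: mcId, spec_path: specPath, domain }, null, 2),
  };
}

const trigger = buildVflTrigger({
  mcId: 99389,
  specPath: "~/.claude/skills/mehanik/SKILL.md",
  domain: "docs",
});
console.log(trigger.path); // /tmp/vfl-trigger-99389.json
```

Because a daemon would poll for these files, a caller would probably want to write the body to a temporary name and rename it into place, so the poller never reads a half-written file. Defaulting `domain` to `unknown` keeps an under-specified trigger on the high-stakes path.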
Option C: /task-postflight skill modification (cleanest for H-tasks)
| Element | Detail |
|---|---|
| File to modify | ~/.claude/skills/task-postflight/SKILL.md |
| Addition location | After Section 2 (PROVEO VALIDATION DISPATCH), add Section 2b |
| What to add | Conditional: if Proveo returns PASS AND task domain is docs/system/refactor, dispatch /verify-fix-loop before writing the postflight marker |
| Trigger | Manual invocation of /task-postflight already exists for H/BLOCKER tasks |
| Advantage | Stays within the skill conversation context — Task dispatch works naturally here |
Recommended wiring (Option C + Option B trigger file):
- Immediate (no new infrastructure): Add a Section 2b to /task-postflight SKILL.md that dispatches /verify-fix-loop when Proveo passes and domain is non-high-stakes. This works today for all tasks that go through /task-postflight.
- Systematic (covers tasks that bypass /task-postflight): Add a trigger file write to pi-orchestrator.js reportCompletion(). A lightweight daemon polls /tmp/vfl-trigger-*.json files and, when a pi-orchestrator session is active, dispatches the verify-fix-loop skill via the existing Claude Code session.
Loop budget recommendation
- Keep MAX_LOOPS = 3 (matches SKILL.md default)
- For postflight auto-invocation, restrict to docs, system, refactor, polish domains only
- Hard cap: $5 per invocation (already in SKILL.md)
- Add timeout: 5 minutes wall-clock before auto-escalation to CEO
Escalation path when budget exhausted
- Write SUMMARY.md to EVIDENCE_DIR with full loop history
- Call node ~/system/tools/slack.js send alerts "[VFL-ESCALATED] MC #<id> — N/MAX loops used, last verdict: <CONFIDENCE>" (Slack, not CEO direct)
- Set task status to blocked via mc.js block with reason "verify-fix-loop budget exhausted — human review needed"
- John receives Slack alert and decides: (a) override + mark done, (b) dispatch additional builder, (c) extend budget via [CEO_APPROVED] token
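The Slack alert in the escalation path has a fixed shape, so it can be assembled from loop state. The sketch below builds only the argument list for slack.js; the function name `buildEscalationArgs` is hypothetical, and the `send alerts <message>` argument order is taken from the command quoted in the escalation steps, not from slack.js itself.

```javascript
// Hypothetical helper: builds the argument list for the escalation alert.
// The message template is copied from the escalation path above.
function buildEscalationArgs({ mcId, loopsUsed, maxLoops, confidence }) {
  const message =
    `[VFL-ESCALATED] MC #${mcId} — ${loopsUsed}/${maxLoops} loops used, ` +
    `last verdict: ${confidence}`;
  // Returned as separate arguments so no shell quoting is needed.
  return ["send", "alerts", message];
}

console.log(buildEscalationArgs({
  mcId: 99389,
  loopsUsed: 3,
  maxLoops: 3,
  confidence: "FEEDBACK",
})[2]);
// [VFL-ESCALATED] MC #99389 — 3/3 loops used, last verdict: FEEDBACK
```

Keeping the message as one array element means it can be handed to a process-spawning call that takes an argument array (such as Node's `child_process.execFile`) without the `#` or brackets being interpreted by a shell; the `~` in the script path would need expanding by the caller in that case.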
Open Questions
- Tool-level enforcement for verifier: Should the verifier's tools: field be changed from Read, Grep, Glob, Bash to Read, Grep, Glob (removing Bash) to achieve structural isolation matching fix-builder? This would break the verifier's ability to run curl -sI, git log, and sqlite3 -readonly probes, which are core to its value. The tradeoff is behavioral (current) vs structural enforcement.
- Conversation context for auto-dispatch: Spawning a verify-fix-loop Task requires an active Claude Code conversation. If pi-orchestrator fires after a conversation closes, there is no context to spawn into. Does the system need a persistent "factory session" that stays open to receive postflight dispatches?
- High-stakes domain detection: The SKILL.md defaults unknown domains to HIGH_STAKES (no autonomous correction). For auto-invocation, domain inference from spec path heuristics will frequently return unknown. Should the default be flipped to docs for auto-invoked postflight use cases?
- Proveo vs verifier: overlap management: /task-postflight already dispatches Proveo for AC-checklist QA. If verify-fix-loop is added as Section 2b, tasks will run both Proveo (AC checklist) AND verifier (atomic claims) sequentially. Is this the intended double-verification model, or should one replace the other for certain task types?
- mc.js ready event vs pi-orchestrator ready: Some tasks are marked ready by human John (node ~/system/tools/mc.js ready <id>), others by pi-orchestrator after build completion, and others by /task-postflight. The auto-invocation wiring point differs for each path. A comprehensive solution needs to intercept all three paths.
Evidence Metadata
| Item | Value |
|---|---|
| Files read | 8 |
| Grep/Bash tool calls | 12 |
| Live agent invocations | 0 |
| Mutations | 0 |
| Wall-clock (estimated) | ~18 min |
| Key source files | ~/.claude/skills/verify-fix-loop/SKILL.md, ~/.claude/agents/verifier.md, ~/.claude/agents/fix-builder.md, ~/.claude/skills/task-postflight/SKILL.md, ~/system/kernel/pi-orchestrator.js (lines 3730–3930), ~/.claude/hooks/mc-ready-gate.sh, ~/.claude/hooks/liveness-claim-validator.sh |