ADR-026 pi-orchestrator reactivation (supersedes ADR-025) — 2026-05-14
Why This Matters
On 2026-05-14 at 10:14:41, pi-orchestrator successfully picked up and claimed task #100591 — a real MC task — within 30 seconds of being restored. This proves the software works. ADR-025 had concluded pi-orch "never worked" and "ran in mock mode," but the real cause was a missing kernel file (deleted, only .bak files remained) and an unloaded plist. The decommission decision was based on a deployment failure, not a software failure. This ADR corrects that record and re-establishes pi-orchestrator as the canonical autonomous poll loop for ALAI's build dispatch surface.
ADR-026 — pi-orchestrator Reactivation as Canonical Autonomous Poll Loop
Date: 2026-05-14
Status: ACCEPTED
MC: #100597
Decided by: John (Petter Graff architecture review)
Supersedes: ADR-025 (pi-orchestrator Decommission, 2026-05-09)
Context
ADR-025 (2026-05-09) declared pi-orchestrator decommissioned with the following exact claims:
"pi-orchestrator ran in mock mode. It never dispatched a real task. Port 8401 was empty at every probe."
"pi-orch never worked. 50+ days dead, no real dispatch observed in logs. 'No eligible tasks' only."
"Note: pi-orch was in mock mode. Rollback restores the process, not real dispatch capability."
These claims were wrong. The root cause was structural, not behavioral: the kernel file ~/system/kernel/pi-orchestrator.js had been deleted (only .bak files remained on disk) and the plist com.john.pi-orchestrator was not loaded in launchd. A dead process with no kernel file and no plist will of course show no activity on port 8401 — that does not mean the software does not work.
Hivemind RCA (event 67100, 2026-05-14T10:15:58Z):
"pi-orchestrator.js was deleted (only .bak files in ~/system/kernel/). plist com.john.pi-orchestrator NOT loaded. Fix: restore bak-race-window-2026-05-08, copy .new plist to active, launchctl load. PID 57544 running. workers=0 in /stats = DAG artefact, not real worker count. MC #100597 closed."
Restoration (MC #100597, 2026-05-14):
- Kernel restored from `~/system/kernel/pi-orchestrator.js.bak-race-window-2026-05-08`.
- Plist `com.john.pi-orchestrator` loaded via `launchctl load`.
- Process came up: PID 57544.
- Within the first 30-second poll cycle, pi-orchestrator picked up task #100591 at 2026-05-14T10:14:41.072Z.
Force-close evidence at /tmp/evidence-100597/:
| File | Key fact |
|---|---|
| `verification.json` | verified:true, pid:57544, task_picked:"100591" |
| `daemon-stdout-tail.txt` | Full cycle log: task classified, routing token written, claim acquired |
| `launchctl-list.txt` | `com.john.pi-orchestrator` present and running |
| `stats.json` | status:ok, uptime:2078s, pipelines total:5 active:1 |
Daemon stdout excerpt (authoritative):
[2026-05-14T10:14:41.072Z] [INFO] Claude OAuth: OK (authenticated)
[2026-05-14T10:14:41.525Z] [DEBUG] Delegation filter: picked task #100591 (route=post-build)
[2026-05-14T10:14:41.541Z] [INFO] Found task #100591: Skillforge: RCA + runbook for pi-orch route restoration
[2026-05-14T10:14:59.007Z] [INFO] [orch] Blueprint available: flowforge-infra.yaml (FlowForge)
[2026-05-14T10:14:59.223Z] [INFO] Task #100591 claimed by pi-orchestrator (session=pi-orch-57544-1778753679888)
This is not mock mode. This is a real classification, a real routing-token write, and a real MC claim against a live task.
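The excerpt can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the daemon log keeps the `[timestamp] [LEVEL] message` shape shown above and that a claim is always logged as `Task #<id> claimed by pi-orchestrator`. The helper names `parse_line` and `find_claims` are hypothetical and not part of pi-orchestrator.

```python
import re
from datetime import datetime

# Line shape assumed from the excerpt: "[ISO timestamp] [LEVEL] message".
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] (?P<msg>.*)$")
# Claim wording assumed from the last line of the excerpt.
CLAIM_RE = re.compile(r"Task #(?P<task>\d+) claimed by pi-orchestrator")


def parse_line(line):
    """Return (timestamp, level, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m["ts"], "%Y-%m-%dT%H:%M:%S.%fZ")
    return ts, m["level"], m["msg"]


def find_claims(lines):
    """Return a list of (task_id, timestamp) for every claim line found."""
    claims = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip lines that are not in the expected log shape
        ts, _level, msg = parsed
        m = CLAIM_RE.search(msg)
        if m:
            claims.append((m["task"], ts))
    return claims
```

A non-empty result from `find_claims` on `daemon-stdout-tail.txt` is the kind of evidence that separates a real claim from a log that only ever says "No eligible tasks".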
Decision
pi-orchestrator is the canonical autonomous poll loop for ALAI's build dispatch surface.
ADR-025's decommission is revoked in full. The claims that pi-orch "never worked" and "ran in mock mode" are retracted — they described a broken deployment state, not the software itself.
Canonical topology
| Property | Value |
|---|---|
| Kernel file | ~/system/kernel/pi-orchestrator.js |
| Plist | com.john.pi-orchestrator |
| LaunchAgent path | ~/Library/LaunchAgents/com.john.pi-orchestrator.plist |
| HTTP port | 8401 |
| Poll interval | 30 s (pollIntervalMs: 30000 in config) |
| Config | ~/system/config/pi-orchestrator-config.json |
| Mandatory routing | Enabled — all build tasks touching ~/projects/* MUST route through pi-orchestrator |
| Anti-hallucination hook | ~/.claude/hooks/hallucination-detector.py injected into every agent context |
Relationship to durable-runner (port 3052)
ADR-025 attempted to collapse the system to a single surface (durable-runner only). That was correct as an architectural instinct — dual dispatch surfaces do add complexity. However, the two processes serve different roles:
- pi-orchestrator (8401): autonomous poll loop. Finds eligible tasks, classifies them, routes to the correct specialist tier (Ollama C1/C2, Claude Sonnet C3-C5), writes routing tokens, manages concurrency, enforces quality gates.
- durable-runner (3052): event-driven bridge. Receives `mc.js start` events and spawns agents on demand.
These are complementary, not duplicates. Both stay active. This is by design, not by accident.
Consequences
Immediate
- `com.john.pi-orchestrator` stays loaded. Do not unload it.
- `~/system/kernel/pi-orchestrator.js` is a critical asset. Do not delete it.
- `.bak` retention proved its worth: the entire restoration depended on `bak-race-window-2026-05-08`.
- Any audit or documentation referencing ADR-025 as authoritative MUST be re-evaluated against this ADR. ADR-025 is superseded.
Operational protections required
| Protection | Rationale |
|---|---|
| Fleet watchdog must assert `pi-orchestrator.js` present in `~/system/kernel/` | File deletion was the root cause of the 50-day outage. Watchdog would have caught this immediately. |
| `.bak` retention policy: keep at minimum the last `bak-race-window-*` snapshot | This specific backup was the only recovery path. Without it, 50+ days of config evolution would have been lost. |
| Plist presence check in daemon-fleet watchdog | `launchctl list \| grep pi-orchestrator` returning nothing must trigger an alert, not silence. |
| No agent may unload `com.john.pi-orchestrator` without an explicit CEO decision | The plist was unloaded as a side effect of ADR-025, which was itself based on a misdiagnosis. Unloading a core daemon must be a named, deliberate act. |
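The first three protections reduce to simple presence checks. The sketch below is illustrative only: the function names `check_kernel_assets` and `plist_loaded` are hypothetical, the fleet watchdog's real interface is not described in this ADR, and the plist check assumes the caller passes in the text output of `launchctl list`.

```python
from pathlib import Path


def check_kernel_assets(kernel_dir, kernel_name="pi-orchestrator.js"):
    """Return a list of alert strings; an empty list means both checks passed."""
    kernel_dir = Path(kernel_dir).expanduser()
    alerts = []
    # Protection 1: the kernel file itself must exist.
    if not (kernel_dir / kernel_name).is_file():
        alerts.append(f"missing kernel file: {kernel_name}")
    # Protection 2: at least one bak-race-window-* snapshot must be retained.
    snapshots = sorted(kernel_dir.glob(f"{kernel_name}.bak-race-window-*"))
    if not snapshots:
        alerts.append("no bak-race-window-* snapshot retained")
    return alerts


def plist_loaded(launchctl_list_output, label="com.john.pi-orchestrator"):
    """Protection 3: True if any line of `launchctl list` output names the label."""
    return any(label in line for line in launchctl_list_output.splitlines())
```

Returning alerts as data, rather than printing them, leaves the alerting channel to the caller, so an empty `launchctl list` match becomes an explicit alert rather than silence.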
Lesson: distinguish deployment failure from software failure
ADR-025 diagnosed a deployment failure (kernel file missing + plist unloaded) as a software failure ("never worked"). This is a class of error: inferring capability from a broken runtime state. Before declaring a daemon non-functional, the diagnostic checklist is:
1. Is the kernel/binary present on disk?
2. Is the plist loaded in launchd?
3. Is the process running (PID)?
4. Only then: is the process behaving correctly?
ADR-025 checked step 4 (port 8401 empty, logs show "No eligible tasks") without first verifying steps 1 and 2. That is the failure mode that produced the wrong conclusion.
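The ordering of the checklist is the whole point, so it helps to encode it. The following is a minimal sketch in which each step is a caller-supplied probe (a zero-argument function returning a boolean). The names `Diagnosis` and `diagnose` are hypothetical, and the probes themselves are assumptions left to the operator.

```python
from dataclasses import dataclass


@dataclass
class Diagnosis:
    failed_step: int  # 1-based step that failed; 0 means every probe passed
    kind: str         # "deployment failure", "software failure", or "healthy"
    detail: str


def diagnose(deployment_probes, behaves_correctly):
    """Run deployment probes in order and stop at the first failure.

    deployment_probes: list of (name, probe) pairs for steps 1..n
    behaves_correctly: probe for the final step; only called if all others pass
    """
    for step, (name, probe) in enumerate(deployment_probes, start=1):
        if not probe():
            return Diagnosis(step, "deployment failure", name)
    if not behaves_correctly():
        return Diagnosis(len(deployment_probes) + 1, "software failure",
                         "behavior check failed")
    return Diagnosis(0, "healthy", "")
```

With the kernel file missing, this returns a deployment failure at step 1 and never evaluates behavior, which is the conclusion ADR-025 should have reached.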
What Is NOT Changed
- `com.alai.orchestrator-bridge` (durable-runner, port 3052): remains active. Its role as event-driven spawn bridge is unchanged.
- `~/system/config/pi-orchestrator-config.json`: unchanged. Config was valid throughout; the problem was never configuration.
- The `.bak` kernel files in `~/system/kernel/`: preserved. See fleet watchdog protection above.
- ZAKON PI2 deploy verification: unaffected.
Rollback
If pi-orchestrator must be decommissioned again in the future, the following conditions must all be true before proceeding:
- A named CEO decision MC exists (not a John autonomous call).
- A functional alternative handles autonomous poll-loop dispatch.
- The kernel file is archived, not deleted.
- The plist is archived, not deleted.
- A named MC documents the restoration path.
A diagnosis of "port is empty" or "no tasks in logs" is NOT sufficient grounds for decommission without first verifying kernel file presence and plist load state.
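The five conditions can be treated as an all-or-nothing gate. The sketch below is one way to express that. The class and field names are hypothetical labels for the conditions listed above, and how each boolean gets established is left to the operator.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DecommissionPreconditions:
    ceo_decision_mc_exists: bool
    alternative_poll_loop_in_place: bool
    kernel_archived_not_deleted: bool
    plist_archived_not_deleted: bool
    restoration_path_mc_exists: bool


def unmet(pre):
    """Return the names of all conditions that are still false."""
    return [f.name for f in fields(pre) if not getattr(pre, f.name)]
```

Decommission may proceed only when `unmet(...)` returns an empty list. Any remaining name is a blocker, not a warning.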
See Also
- ADR-025: `~/system/specs/adr-025-pi-orch-decommission-2026-05-09.md` (superseded)
- MC #100597: pi-orchestrator restore
- MC #100591: first task dispatched post-restore (Skillforge RCA + runbook)
- Hivemind event: `~/system/agents/hivemind/events/1778753758640-67100.json`
- Evidence: `/tmp/evidence-100597/`
- Config: `~/system/config/pi-orchestrator-config.json`
- Fleet watchdog state: `~/system/state/daemon-fleet-status.json`