# ADR-026 pi-orchestrator reactivation (supersedes ADR-025) — 2026-05-14

## Why This Matters

On 2026-05-14 at 10:14:41 UTC, pi-orchestrator picked up task #100591, a real MC task, in its first 30-second poll cycle after being restored, and claimed it 18 seconds later at 10:14:59. This proves the software **works**. ADR-025 had concluded pi-orch "never worked" and "ran in mock mode," but the real cause was a deleted kernel file (only `.bak` files remained) and an unloaded plist. The decommission decision was based on a deployment failure, not a software failure. This ADR corrects that record and re-establishes pi-orchestrator as the canonical autonomous poll loop for ALAI's build dispatch surface.

---

# ADR-026 — pi-orchestrator Reactivation as Canonical Autonomous Poll Loop

**Date:** 2026-05-14  
**Status:** ACCEPTED  
**MC:** #100597  
**Decided by:** John (Petter Graff architecture review)  
**Supersedes:** ADR-025 (pi-orchestrator Decommission, 2026-05-09)

---

## Context

ADR-025 (2026-05-09) declared pi-orchestrator decommissioned with the following exact claims:

> "pi-orchestrator ran in mock mode. It never dispatched a real task. Port 8401 was empty at every probe."
> 
> "pi-orch never worked. 50+ days dead, no real dispatch observed in logs. 'No eligible tasks' only."
> 
> "Note: pi-orch was in mock mode. Rollback restores the process, not real dispatch capability."

These claims were **wrong**. The root cause was structural, not behavioral: the kernel file `~/system/kernel/pi-orchestrator.js` had been deleted (only `.bak` files remained on disk) and the plist `com.john.pi-orchestrator` was not loaded in launchd. A daemon with no kernel file and no loaded plist cannot run, so it will of course show no activity on port 8401. That says nothing about whether the software works.

**Hivemind RCA (event 67100, 2026-05-14T10:15:58Z):**

> "pi-orchestrator.js was deleted (only .bak files in ~/system/kernel/). plist com.john.pi-orchestrator NOT loaded. Fix: restore bak-race-window-2026-05-08, copy .new plist to active, launchctl load. PID 57544 running. workers=0 in /stats = DAG artefact, not real worker count. MC #100597 closed."

**Restoration (MC #100597, 2026-05-14):**

1. Kernel restored from `~/system/kernel/pi-orchestrator.js.bak-race-window-2026-05-08`.
2. The `.new` plist was copied into the active location and `com.john.pi-orchestrator` was loaded via `launchctl load`.
3. Process came up: PID 57544.
4. Within the first 30-second poll cycle, pi-orchestrator picked task #100591 at `2026-05-14T10:14:41.525Z` and claimed it at `2026-05-14T10:14:59.223Z`.

**Force-close evidence at `/tmp/evidence-100597/`:**

<table id="bkmrk-file-key-fact-verifi"><thead><tr><th>File</th><th>Key fact</th></tr></thead><tbody><tr><td>`verification.json`</td><td>`verified:true, pid:57544, task_picked:"100591"`</td></tr><tr><td>`daemon-stdout-tail.txt`</td><td>Full cycle log — task classified, routing token written, claim acquired</td></tr><tr><td>`launchctl-list.txt`</td><td>`com.john.pi-orchestrator` present and running</td></tr><tr><td>`stats.json`</td><td>`status:ok, uptime:2078s, pipelines total:5 active:1`</td></tr></tbody></table>

**Daemon stdout excerpt (authoritative):**

```
[2026-05-14T10:14:41.072Z] [INFO] Claude OAuth: OK (authenticated)
[2026-05-14T10:14:41.525Z] [DEBUG] Delegation filter: picked task #100591 (route=post-build)
[2026-05-14T10:14:41.541Z] [INFO] Found task #100591: Skillforge: RCA + runbook for pi-orch route restoration
[2026-05-14T10:14:59.007Z] [INFO] [orch] Blueprint available: flowforge-infra.yaml (FlowForge)
[2026-05-14T10:14:59.223Z] [INFO] Task #100591 claimed by pi-orchestrator (session=pi-orch-57544-1778753679888)
```

This is not mock mode. This is a real classification, a real routing-token write, and a real MC claim against a live task.
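
The timing in the excerpt can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the `[timestamp] [LEVEL] message` line layout seen in the excerpt holds for the whole log; the helper names (`parse_ts`, `first_event`) are hypothetical and not part of pi-orchestrator.

```python
import re
from datetime import datetime
from typing import Iterable, Optional

# Line layout assumed from the excerpt: [ISO timestamp] [LEVEL] message
LOG_LINE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] (?P<msg>.*)$")

EXCERPT = """\
[2026-05-14T10:14:41.072Z] [INFO] Claude OAuth: OK (authenticated)
[2026-05-14T10:14:41.525Z] [DEBUG] Delegation filter: picked task #100591 (route=post-build)
[2026-05-14T10:14:41.541Z] [INFO] Found task #100591: Skillforge: RCA + runbook for pi-orch route restoration
[2026-05-14T10:14:59.007Z] [INFO] [orch] Blueprint available: flowforge-infra.yaml (FlowForge)
[2026-05-14T10:14:59.223Z] [INFO] Task #100591 claimed by pi-orchestrator (session=pi-orch-57544-1778753679888)
"""


def parse_ts(ts: str) -> datetime:
    """Parse the millisecond UTC timestamps used in the daemon log."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def first_event(lines: Iterable[str], needle: str) -> Optional[datetime]:
    """Return the timestamp of the first log line whose message contains needle."""
    for line in lines:
        match = LOG_LINE.match(line)
        if match and needle in match.group("msg"):
            return parse_ts(match.group("ts"))
    return None


lines = EXCERPT.splitlines()
picked = first_event(lines, "picked task #100591")
claimed = first_event(lines, "Task #100591 claimed")
delay = (claimed - picked).total_seconds()
print(f"pick -> claim: {delay:.3f}s")
```

On the excerpt above, the pick-to-claim delay comes out at 17.698 seconds, so both events fall inside a single 30-second poll cycle.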

---

## Decision

**pi-orchestrator is the canonical autonomous poll loop for ALAI's build dispatch surface.**

ADR-025's decommission is revoked in full. The claims that pi-orch "never worked" and "ran in mock mode" are retracted — they described a broken deployment state, not the software itself.

### Canonical topology

<table id="bkmrk-property-value-kerne"><thead><tr><th>Property</th><th>Value</th></tr></thead><tbody><tr><td>Kernel file</td><td>`~/system/kernel/pi-orchestrator.js`</td></tr><tr><td>Plist</td><td>`com.john.pi-orchestrator`</td></tr><tr><td>LaunchAgent path</td><td>`~/Library/LaunchAgents/com.john.pi-orchestrator.plist`</td></tr><tr><td>HTTP port</td><td>8401</td></tr><tr><td>Poll interval</td><td>30 s (`pollIntervalMs: 30000` in config)</td></tr><tr><td>Config</td><td>`~/system/config/pi-orchestrator-config.json`</td></tr><tr><td>Mandatory routing</td><td>Enabled — all build tasks touching `~/projects/*` MUST route through pi-orchestrator</td></tr><tr><td>Anti-hallucination hook</td><td>`~/.claude/hooks/hallucination-detector.py` injected into every agent context</td></tr></tbody></table>

### Relationship to durable-runner (port 3052)

ADR-025 attempted to collapse the system to a single surface (durable-runner only). That was correct as an architectural instinct — dual dispatch surfaces do add complexity. However, the two processes serve **different roles**:

- **pi-orchestrator (8401):** autonomous poll loop. Finds eligible tasks, classifies them, routes to the correct specialist tier (Ollama C1/C2, Claude Sonnet C3-C5), writes routing tokens, manages concurrency, enforces quality gates.
- **durable-runner (3052):** event-driven bridge. Receives `mc.js start` events and spawns agents on demand.

These are complementary, not duplicates. Both stay active. This is by design, not an accident.

---

## Consequences

### Immediate

1. `com.john.pi-orchestrator` stays loaded. Do not unload it.
2. `~/system/kernel/pi-orchestrator.js` is a **critical asset**. Do not delete it. `.bak` retention proved its worth — the entire restoration depended on `bak-race-window-2026-05-08`.
3. Any audit or documentation referencing ADR-025 as authoritative MUST be re-evaluated against this ADR. ADR-025 is superseded.

### Operational protections required

<table id="bkmrk-protection-rationale"><thead><tr><th>Protection</th><th>Rationale</th></tr></thead><tbody><tr><td>Fleet watchdog must assert `pi-orchestrator.js` present in `~/system/kernel/`</td><td>File deletion was the root cause of the 50+ day outage. A watchdog would have caught this immediately.</td></tr><tr><td>`.bak` retention policy: keep at least the most recent `bak-race-window-*` snapshot</td><td>This specific backup was the only recovery path. Without it, 50+ days of config evolution would have been lost.</td></tr><tr><td>Plist presence check in daemon-fleet watchdog</td><td>`launchctl list | grep pi-orchestrator` returning nothing must trigger an alert, not silence.</td></tr><tr><td>No agent may unload `com.john.pi-orchestrator` without an explicit CEO decision</td><td>The plist was unloaded as a side effect of ADR-025, which was itself based on a misdiagnosis. Unloading a core daemon must be a named, deliberate act.</td></tr></tbody></table>
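
The first three protections could be expressed as watchdog assertions along the following lines. This is a sketch under stated assumptions, not the fleet watchdog's actual code: the function names are hypothetical, the backup naming is assumed to follow the `pi-orchestrator.js.bak-race-window-YYYY-MM-DD` pattern of the file used in the restoration, and the caller is assumed to pass in the captured text output of `launchctl list`.

```python
import re
from pathlib import Path
from typing import List, Optional

KERNEL_NAME = "pi-orchestrator.js"
LABEL = "com.john.pi-orchestrator"
# Assumed naming, based on the one backup named in this ADR.
BAK_PATTERN = re.compile(r"^pi-orchestrator\.js\.bak-race-window-(\d{4}-\d{2}-\d{2})$")


def latest_race_window_backup(kernel_dir: Path) -> Optional[Path]:
    """Return the newest bak-race-window snapshot, judged by its date suffix."""
    dated = []
    for entry in kernel_dir.iterdir():
        match = BAK_PATTERN.match(entry.name)
        if match:
            dated.append((match.group(1), entry))
    # ISO dates sort correctly as plain strings.
    return max(dated, key=lambda pair: pair[0])[1] if dated else None


def label_loaded(launchctl_list_output: str, label: str) -> bool:
    """True if the label appears in the last column of `launchctl list` output."""
    for line in launchctl_list_output.splitlines():
        columns = line.split()
        if columns and columns[-1] == label:
            return True
    return False


def watchdog_alerts(kernel_dir: Path, launchctl_list_output: str) -> List[str]:
    """Collect one alert per violated protection; an empty list means healthy."""
    alerts = []
    if not (kernel_dir / KERNEL_NAME).is_file():
        alerts.append(f"kernel file missing: {kernel_dir / KERNEL_NAME}")
    if latest_race_window_backup(kernel_dir) is None:
        alerts.append("no bak-race-window snapshot retained")
    if not label_loaded(launchctl_list_output, LABEL):
        alerts.append(f"plist not loaded: {LABEL}")
    return alerts
```

Matching the label against the last column, rather than a substring search over the whole output, avoids a false positive from any other job whose label merely contains `pi-orchestrator`.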

### Lesson: distinguish deployment failure from software failure

ADR-025 diagnosed a deployment failure (kernel file missing + plist unloaded) as a software failure ("never worked"). This is a class of error: inferring that software is incapable from a broken runtime state. Before declaring a daemon non-functional, run this diagnostic checklist in order:

1. Is the kernel/binary present on disk?
2. Is the plist loaded in launchd?
3. Is the process running (PID)?
4. Only then: is the process behaving correctly?

ADR-025 checked step 4 (port 8401 empty, logs show "No eligible tasks") without first verifying steps 1 and 2. That is the failure mode that produced the wrong conclusion.
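
The ordering is the point of the checklist, and it can be made explicit in code. The sketch below is illustrative only: `Observation` and `diagnose` are hypothetical names, and the observations are assumed to have been gathered beforehand by whatever tooling is at hand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    kernel_on_disk: bool                 # step 1
    plist_loaded: bool                   # step 2
    pid: Optional[int]                   # step 3, None if no process is running
    behaving_correctly: Optional[bool]   # step 4, None if not yet observed


def diagnose(obs: Observation) -> str:
    """Walk the checklist in order and stop at the first failing layer."""
    if not obs.kernel_on_disk:
        return "deployment failure: kernel file missing"
    if not obs.plist_loaded:
        return "deployment failure: plist not loaded"
    if obs.pid is None:
        return "deployment failure: process not running"
    # Behavior is only meaningful once steps 1 to 3 have passed.
    if obs.behaving_correctly is None:
        return "inconclusive: behavior not yet observed"
    if not obs.behaving_correctly:
        return "software failure: process runs but misbehaves"
    return "healthy"


# The state ADR-025 actually faced: nothing deployed, so behavior says nothing.
adr_025_state = Observation(
    kernel_on_disk=False, plist_loaded=False, pid=None, behaving_correctly=False
)
print(diagnose(adr_025_state))
```

Because the function returns at the first failing layer, a silent port can never be reported as a software failure while the kernel file or plist is missing.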

---

## What Is NOT Changed

- `com.alai.orchestrator-bridge` (durable-runner, port 3052) — remains active. Its role as event-driven spawn bridge is unchanged.
- `~/system/config/pi-orchestrator-config.json` — unchanged. Config was valid throughout; the problem was never configuration.
- The `.bak` kernel files in `~/system/kernel/` — preserved. See fleet watchdog protection above.
- ZAKON PI2 deploy verification — unaffected.

---

## Rollback

If pi-orchestrator must be decommissioned again in the future, the following conditions must all be true before proceeding:

1. A named CEO decision MC exists (not an autonomous call by John).
2. A functional alternative handles autonomous poll-loop dispatch.
3. The kernel file is archived, not deleted.
4. The plist is archived, not deleted.
5. A named MC documents the restoration path.

A diagnosis of "port is empty" or "no tasks in logs" is NOT sufficient grounds for decommission without first verifying kernel file presence and plist load state.
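
As a rough illustration of how these conditions could gate a decommission, the sketch below lists unmet conditions and moves files into an archive instead of deleting them. All names here (`DecommissionRequest`, `unmet_conditions`, `archive`) and the archive directory are hypothetical; this ADR does not define such a tool.

```python
import shutil
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class DecommissionRequest:
    ceo_decision_mc: bool        # condition 1
    alternative_poll_loop: bool  # condition 2
    kernel_archived: bool        # condition 3
    plist_archived: bool         # condition 4
    restoration_mc: bool         # condition 5


def unmet_conditions(req: DecommissionRequest) -> List[str]:
    """Return every rollback condition that is not yet satisfied."""
    checks = [
        (req.ceo_decision_mc, "no named CEO decision MC"),
        (req.alternative_poll_loop, "no functional alternative poll loop"),
        (req.kernel_archived, "kernel file not archived"),
        (req.plist_archived, "plist not archived"),
        (req.restoration_mc, "no MC documenting the restoration path"),
    ]
    return [message for ok, message in checks if not ok]


def archive(path: Path, archive_dir: Path) -> Path:
    """Move a file into archive_dir; never delete and never overwrite."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / path.name
    if target.exists():
        raise FileExistsError(f"refusing to overwrite {target}")
    shutil.move(str(path), str(target))
    return target
```

A decommission would proceed only when `unmet_conditions` returns an empty list, which keeps a single "port is empty" observation from ever being sufficient on its own.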

---

## See Also

- ADR-025: `~/system/specs/adr-025-pi-orch-decommission-2026-05-09.md` (superseded)
- MC #100597 — pi-orchestrator restore
- MC #100591 — first task dispatched post-restore (Skillforge RCA + runbook)
- Hivemind event: `~/system/agents/hivemind/events/1778753758640-67100.json`
- Evidence: `/tmp/evidence-100597/`
- Config: `~/system/config/pi-orchestrator-config.json`
- Fleet watchdog state: `~/system/state/daemon-fleet-status.json`