# Daemon Fleet — dr-sync & tldr-watch fix (MC #104330)

# MC #104330 — fleet-watchdog alert resolution

Alert: `[FLEET-WATCHDOG] 2026-06-25T06:36:57Z — CRITICAL: 2 daemons in failed state: com.john.dr-sync, com.john.tldr-watch`

## Root cause 1 — com.john.dr-sync (rsync exit 20)

The rsync exclude pattern `*.bak` only matches names that end in `.bak`, so it does not match backup files named `*.bak-<suffix>`.
As a result, an 18G stale backup, `mission-control.db.bak-pre-p2p-correction-20260529` (the live db is 35M), was
being rsynced to the mac-mini every 6h. The oversized transfer kept getting interrupted (exit 20),
so the `databases` target failed (8/9 success) and the daemon exited non-zero.

**Fix:** `~/system/daemons/dr-sync.sh` — added `--exclude=*.bak-*` and `--exclude=*.bak[0-9]*`.
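The pattern behaviour can be checked without running rsync. The sketch below is illustrative only: `is_excluded` is a hypothetical helper, not part of `dr-sync.sh`, and it relies on the shell's `case` globbing, which agrees with rsync for simple slash-free patterns like these. The filename `mission-control.db.bak` is an assumed example of a plain `.bak` backup.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the filename matches any of the glob patterns.
# The pattern is deliberately unquoted inside `case` so it is treated as a glob.
is_excluded() {
  _name=$1
  shift
  for _pat in "$@"; do
    case "$_name" in
      $_pat) return 0 ;;
    esac
  done
  return 1
}

# Old rule alone: the stale backup is NOT matched, so it would still be transferred.
if is_excluded "mission-control.db.bak-pre-p2p-correction-20260529" '*.bak'; then
  echo "old rule: excluded"
else
  echo "old rule: synced"
fi

# Old rule plus the two added patterns: backups are skipped, the live db is kept.
for f in mission-control.db \
         mission-control.db.bak \
         mission-control.db.bak-pre-p2p-correction-20260529; do
  if is_excluded "$f" '*.bak' '*.bak-*' '*.bak[0-9]*'; then
    echo "exclude $f"
  else
    echo "sync    $f"
  fi
done
```

Quoting the patterns on the command line (for example `--exclude='*.bak-*'`) keeps the shell from expanding them against the current directory before rsync sees them.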

**Proof:**

- Directory-mode dry-run: 18G file NOT in transfer list; live `.db` files still sync.
- launchd kickstart run: LastExitStatus = `0`.
- Log `2026-06-26 10:46:15`: `Total targets: 9 | Success: 9 | Failed: 0 | Duration: 17s` (was 358s).

## Root cause 2 — com.john.tldr-watch (exit 2)

Not a crash. tldr-watch is a health monitor that exits `2` by design when verdict=FAIL
(script lines 119-122), and it owns its own alert path (#exec Slack + HiveMind intel).
The fleet-watchdog only whitelisted exit `1` (launchd-encoded `256`), so tldr-watch's issue-found exit `2` (`512`)
was misclassified as a failed daemon.

**Fix:** `~/bin/daemon-fleet-watchdog.sh` — added `com.john.tldr-watch` to `EXIT1_NORMAL`
and extended allowed issue-found codes to `1/2/3` (+ launchd-encoded `256/512/768`).
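The relationship between the plain and launchd-encoded codes can be sketched as follows. This is a minimal illustration, not the actual watchdog code: `decode_exit` and `is_issue_found` are hypothetical names, and the sketch assumes that any status of 256 or more carries the exit code in its high byte (so `512` means exit `2`).

```shell
#!/bin/sh
# Hypothetical helper: normalise a reported status to a plain exit code.
# Assumption: values >= 256 are wait-status encoded (exit code << 8).
decode_exit() {
  _status=$1
  if [ "$_status" -ge 256 ]; then
    echo $(( _status >> 8 ))
  else
    echo "$_status"
  fi
}

# Hypothetical check: is this status one of the allowed issue-found codes (1, 2, 3)?
is_issue_found() {
  _code=$(decode_exit "$1")
  case "$_code" in
    1|2|3) return 0 ;;
    *)     return 1 ;;
  esac
}

# 512 decodes to exit 2, which is tldr-watch's verdict=FAIL code.
if is_issue_found 512; then
  echo "512: issue-found exit, not a failed daemon"
fi

# rsync's exit 20 is outside the allowed set, so it would still count as a failure.
if ! is_issue_found 20; then
  echo "20: treated as a failure"
fi
```

Decoding first and comparing against one short list keeps the allowed set in a single place, rather than listing every code twice.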

**Proof:** reclassification against live `daemon-fleet-status.json` → tldr-watch no longer critical.

## End-to-end verification (L2+)

fleet-watchdog run `2026-06-26T08:46:40Z`:

- `com.john.dr-sync: calendar_err_256 → calendar_ok`
- NO `CRITICAL: N daemons in failed state` line (present on every prior run)
- err count 4 → 2

## Follow-ups (separate, non-blocking)

1. **Disk hygiene:** 18G stale backup still on disk (96% full / 42G free). Recommend CEO-approved
   deletion of `mission-control.db.bak-pre-p2p-correction-20260529`. Not deleted unilaterally
   (irreversible, not self-created).
2. **TLDR pipeline dormant:** tldr-watch's FAIL is real: the actionizer produces 0 insights and 0 tasks
   daily, and db counts have been static at `620,8,612,8` since at least 06-23. Decide: revive or retire tldr-briefing/actionizer.