# Incident Postmortem — Bilko Deploy Fix 2026-04-22


**Date:** 2026-04-22  
**Severity:** High (CEO time wasted + security leak)  
**Status:** Resolved  
**Type:** Blameless Postmortem

## Summary

A 2-hour bug-fix sprint (MC tasks #8626, #8627, #8628) aimed at fixing three bugs in the Bilko demo resulted in zero live changes reaching the production demo URL (bilko-demo.alai.no). All code changes were pushed to the wrong branch (feat/intesa-bih-demo instead of main), the CI pipeline had been silently broken for 7 days, and client-specific content (the Intesa BiH pitch) leaked to the public demo URL.

## Timeline (UTC+1)

<table id="bkmrk-time-event-actor-202"><thead><tr><th>Time</th><th>Event</th><th>Actor</th></tr></thead><tbody><tr><td>2026-04-21 13:32</td><td>MC #8626 created (invoice template save button broken)</td><td>John</td></tr><tr><td>2026-04-21 13:33</td><td>MC #8627 created (invoice PDF download fails on unsaved invoice)</td><td>John</td></tr><tr><td>2026-04-21 13:33</td><td>MC #8628 created (settings logo upload missing)</td><td>John</td></tr><tr><td>2026-04-21 13:46</td><td>All 3 tasks marked ready_for_review (commits d408cc6 + 53fe1d6)</td><td>Brad Frost (Vizu)</td></tr><tr><td>2026-04-22 09:00</td><td>CEO: "Bilko demo is not updated, the bugs are still there"</td><td>Alem</td></tr><tr><td>2026-04-22 09:10</td><td>Discovery: All fixes pushed to feat/intesa-bih-demo (no CI on that branch)</td><td>John</td></tr><tr><td>2026-04-22 09:15</td><td>Verification via curl + git log: main unchanged, bilko-demo.alai.no serving old code</td><td>John</td></tr><tr><td>2026-04-22 09:36</td><td>MC #8678 created: /intesa-bridge leak discovered (HTTP 200 on public demo)</td><td>John</td></tr><tr><td>2026-04-22 10:00</td><td>CI investigation: Last 5 runs all failed (since 2026-04-15)</td><td>Kelsey (FlowForge)</td></tr><tr><td>2026-04-22 10:36</td><td>MC #8696 created: ZAKON PI2 Deploy Verification Protocol</td><td>John</td></tr><tr><td>2026-04-22 12:00</td><td>Manual deploy attempt: GitHub PAT missing workflow scope (can't trigger CI fix)</td><td>FlowForge</td></tr><tr><td>2026-04-22 12:50</td><td>Manual docker build + push (CEO hands off to FlowForge)</td><td>Alem + FlowForge</td></tr><tr><td>2026-04-22 21:41</td><td>MC #8730 done: fix-bugs-22apr deployed, all 4 evidence checks pass</td><td>FlowForge</td></tr><tr><td>2026-04-22 21:50</td><td>MC #8678 code fix pushed (66d2220): intesa routes deleted from main</td><td>Brad Frost</td></tr></tbody></table>

## Impact

### User-Facing

- **Bilko demo bugs:** Persisted for 1 extra day (low severity — internal demo, no external users)
- **Intesa content leak:** Unknown duration (potentially days) — BiH bank integration pitch content publicly accessible at /intesa-bridge on bilko-demo.alai.no

### Internal

- **CEO time lost:** ~2 hours (debugging + manual deploy)
- **Trust erosion:** CEO feedback that "validation doesn't work", after John marked the tasks done without verifying live state
- **CI health invisible:** 7 days of broken deploys undetected

## Root Causes (5 Failures)

### 1. Branch Assumption (No Pre-Flight Verification)

**What happened:** John inferred the target branch from memory (assuming feat/intesa-bih-demo based on the last session) and dispatched the builder without first running `curl -sI` and `git log` to verify which branch serves bilko-demo.alai.no.

**Why it matters:** Wrong branch = wrong deploy target. All fixes landed on isolated feature branch with no CI and no domain mapping.

**Prevention:** ZAKON PI2 Check 2 — 4 pre-flight commands mandatory BEFORE code changes.
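
The branch check can be sketched as follows. This is a minimal illustration, not the contents of ZAKON PI2 or DEPLOY-MAP.md: the `DEPLOY_MAP` dictionary and the `current_branch` and `verify_target` helpers are hypothetical names, and the only mapping assumed is the one this postmortem states (bilko-demo.alai.no is served from main).

```python
import subprocess

# Hypothetical host-to-branch map; only this one pairing is taken from the postmortem.
DEPLOY_MAP = {"bilko-demo.alai.no": "main"}


def current_branch() -> str:
    """Return the checked-out branch of the repo in the working directory."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def verify_target(host: str, branch: str, deploy_map: dict[str, str] = DEPLOY_MAP) -> None:
    """Raise if `branch` is not the branch mapped to `host`."""
    expected = deploy_map.get(host)
    if expected is None:
        raise LookupError(f"{host} has no entry in the deploy map")
    if branch != expected:
        raise RuntimeError(
            f"{host} is served from '{expected}', but work is on '{branch}'"
        )
```

Calling `verify_target("bilko-demo.alai.no", current_branch())` before dispatching a builder would have raised on feat/intesa-bih-demo instead of letting the fixes land there.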

### 2. CI Broken for 7 Days Undetected

**What happened:** GitHub Actions workflow failing since 2026-04-15. No one noticed because:

- No daily CI health check in boot.sh
- Manual deploys used as workaround without logging CI status
- `gh run list` not part of standard deploy checklist

**Root causes:**

1. GitHub Actions quota exhausted (monthly minutes limit)
2. `--no-traffic` flag on line 206 of gcp-deploy.yml prevents traffic promotion on existing services

**Prevention:** ZAKON PI2 Check 4 — `gh run list --limit 5` before any push. If 5/5 = failure, STOP and fix CI first.
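
The 5/5 rule is easy to automate. The sketch below assumes the runs are fetched with `gh run list --limit 5 --json conclusion`, which prints a JSON array; the `ci_is_blocked` function name is illustrative and not part of any existing hook.

```python
import json


def ci_is_blocked(gh_json: str, window: int = 5) -> bool:
    """Return True when the most recent `window` runs all concluded in failure.

    `gh_json` is expected to be the output of:
        gh run list --limit 5 --json conclusion
    """
    runs = json.loads(gh_json)[:window]
    # Fewer runs than the window, or any non-failure conclusion, does not block.
    return len(runs) == window and all(
        run.get("conclusion") == "failure" for run in runs
    )
```

A caller would stop the push and fix CI first whenever this returns `True`. Runs still in progress carry an empty conclusion, so they do not count as failures here.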

### 3. Intesa Content Leaked to Public URL

**What happened:** Commit 13c2efb merged `/intesa-bridge` and `/intesa-cockpit` routes to main branch. These were pitch-specific features for Dženana Hardaga (Intesa BiH IT director) and should have remained isolated on bilko-intesa-demo Cloud Run service.

**Why it matters:** Client-specific content (including BiH bank integration mockups) publicly visible on generic demo. Potential NDA violation + confusing UX for non-Intesa visitors.

**Prevention:**

- ZAKON PI2 Check 3 — Branch Purity CI check (`.github/workflows/branch-purity.yml`)
- Client prefix registry in `~/system/rules/client-prefix-registry.md`
- Automated check blocks PR merge if `intesa-*`, `corpint-*`, etc. routes detected on main
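
The core of such a purity check might look like the sketch below. It is an assumption about the logic, not a copy of `branch-purity.yml`: `CLIENT_PREFIXES` contains only the two prefixes named above (the real registry may list more), and `client_routes` is a hypothetical helper.

```python
# Only the two prefixes named in this postmortem; the real registry may hold more.
CLIENT_PREFIXES = ("intesa-", "corpint-")


def client_routes(paths: list[str], prefixes: tuple[str, ...] = CLIENT_PREFIXES) -> list[str]:
    """Return the paths that contain a segment starting with a client prefix."""
    flagged = []
    for path in paths:
        segments = [seg for seg in path.strip("/").split("/") if seg]
        if any(seg.startswith(prefixes) for seg in segments):
            flagged.append(path)
    return flagged
```

A CI job could feed it the route or file paths changed in a PR against main and fail the check when the returned list is non-empty.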

### 4. PAT Missing `workflow` Scope

**What happened:** GitHub Personal Access Token used for CI fixes lacked `workflow` scope. FlowForge couldn't push branch-purity.yml or fix gcp-deploy.yml via automation.

**Why it matters:** Blocked automated CI repair. Forced manual workarounds + CEO paste-copy anti-pattern.

**Prevention:** ZAKON PI2 Check 6: run `gh auth status --show-token` at session start and verify that the `repo`, `workflow`, and `write:packages` scopes are present.
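
A scope check can also be scripted. The sketch below assumes a classic PAT, whose granted scopes GitHub returns as a comma-separated list in the `X-OAuth-Scopes` response header of REST API calls. It also assumes the package-upload scope is the one GitHub spells `write:packages`. The `missing_scopes` helper is a hypothetical name.

```python
# Scopes this postmortem requires; `write:packages` is GitHub's spelling for classic PATs.
REQUIRED_SCOPES = {"repo", "workflow", "write:packages"}


def missing_scopes(scopes_header: str, required: set[str] = REQUIRED_SCOPES) -> set[str]:
    """Return the required scopes absent from a comma-separated scope list."""
    granted = {scope.strip() for scope in scopes_header.split(",") if scope.strip()}
    return required - granted
```

For the token used in this incident, `missing_scopes` would have reported at least `workflow`, which is enough to stop the session before any CI repair is attempted.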

### 5. Manual Paste-Copy Anti-Pattern

**What happened:** CEO built docker image locally, pasted output to John, who pasted to FlowForge agent. FlowForge took over from "image already built" state instead of owning full build→push→deploy flow.

**Why it matters:** Process fragmentation = more failure points. Agent can't verify build context, dockerfile, or .dockerignore changes if it didn't run the build.

**Prevention:** Always dispatch FlowForge BEFORE build step. Agent owns entire flow or none of it.

## What Went Well

- **Kelsey persona diagnosis:** FlowForge correctly identified --no-traffic flag as root cause within 10 minutes of investigation
- **ZAKON PI2 authored mid-incident:** Turned incident into system improvement without waiting for postmortem
- **.dockerignore fix:** Reduced build context from 4.1GB → 50MB (roughly 82× smaller, a ~98.8% reduction) during incident resolution
- **Evidence gate upheld:** MC #8730 not marked done until curl + Playwright + revision checks passed
- **Blameless culture:** No punishment for agents; root cause analysis focused on system gaps

## Action Items

<table id="bkmrk-action-owner-mc-task"><thead><tr><th>Action</th><th>Owner</th><th>MC Task</th><th>Deadline</th><th>Status</th></tr></thead><tbody><tr><td>Sync ZAKON PI2 to BookStack</td><td>pi-orchestrator</td><td>#8718</td><td>2026-04-23</td><td>PAUSED</td></tr><tr><td>Create DEPLOY-MAP.md in Bilko repo</td><td>Skillforge</td><td>#8715</td><td>2026-04-23</td><td>DONE</td></tr><tr><td>Bake PI2 checks into pi-orchestrator v2</td><td>pi-orchestrator</td><td>#8696 (item 3)</td><td>2026-04-29</td><td>IN PROGRESS</td></tr><tr><td>Add pre-deploy hook (~/.claude/hooks/pre-deploy-check.sh)</td><td>pi-orchestrator</td><td>#8696 (item 4)</td><td>2026-04-29</td><td>DONE</td></tr><tr><td>Patch mc.js done with evidence gate for H-priority deploy tasks</td><td>pi-orchestrator</td><td>#8696 (item 5)</td><td>2026-04-29</td><td>DONE</td></tr><tr><td>Create client-prefix-registry.md</td><td>pi-orchestrator</td><td>#8696 (item 7)</td><td>2026-04-29</td><td>DONE</td></tr><tr><td>Fix GitHub Actions quota (upgrade plan or optimize workflows)</td><td>John</td><td>TBD</td><td>2026-05-01</td><td>OPEN</td></tr><tr><td>Remove --no-traffic flag from gcp-deploy.yml for existing services</td><td>FlowForge</td><td>TBD</td><td>2026-04-30</td><td>OPEN</td></tr><tr><td>Upgrade GitHub PAT with workflow scope</td><td>John</td><td>TBD</td><td>2026-04-25</td><td>OPEN</td></tr><tr><td>Weekly CEO audit of mc.js --ceo-override usage</td><td>John</td><td>#8696 (item 8)</td><td>Ongoing</td><td>OPEN</td></tr></tbody></table>

## Lessons Learned

### For John (Orchestrator)

- **Never infer deploy target from memory.** Always run curl + git log + gh run list before dispatching builder.
- **CI health = system health.** Broken CI for 7 days = broken deployment capability. Monitor actively.
- **Claim verification:** "Task done" without live URL verification = hallucination. CEO was right: "validation doesn't work."

### For Builder Agents (Brad Frost, Vizu)

- **Ready for review ≠ deployed.** Code pushed to branch ≠ code live on target URL. Always verify deploy target match.
- **Client-specific routes:** If building intesa-\*, corpint-\*, etc. — verify target branch is NOT main before merging.

### For FlowForge (DevOps)

- **Own the full flow.** If dispatched for deploy, own build→push→deploy→verify. Don't take over mid-stream from CEO paste-copy.
- **--no-traffic flag:** Only use on first-ever deploy. Never on existing services (blocks traffic promotion).

### System-Level

- **ZAKON PI2 works.** All 5 root causes preventable with 6 hard checks. Enforce at agent level + hook level + MC gate level.
- **Evidence gates prevent false claims.** mc.js enforcement (item 5 of #8696) blocks "done" without verification.json.
- **Blameless postmortems → system rules.** This incident produced ZAKON PI2, DEPLOY-MAP.md standard, and client-prefix-registry. Net positive.
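
An evidence gate of this kind reduces to one predicate. The sketch below is not the mc.js implementation: the shape of `verification.json` shown in the docstring and the `evidence_passes` name are assumptions made for illustration.

```python
import json


def evidence_passes(verification_json: str) -> bool:
    """Return True only if at least one check is listed and every check passed.

    Assumed (hypothetical) shape:
        {"checks": [{"name": "curl", "passed": true}, ...]}
    """
    try:
        data = json.loads(verification_json)
    except json.JSONDecodeError:
        return False
    checks = data.get("checks") if isinstance(data, dict) else None
    if not isinstance(checks, list) or not checks:
        return False
    return all(isinstance(c, dict) and c.get("passed") is True for c in checks)
```

Treating a missing, empty, or malformed file as a failure keeps the default closed, so a task cannot be marked done simply because no evidence was produced.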

## Related Rules Created

- **ZAKON PI2:** `~/system/rules/zakon-pi2-deploy-verification.md` (BookStack sync tracked in MC #8718, currently paused)
- **Client Prefix Registry:** `~/system/rules/client-prefix-registry.md`
- **Pre-Deploy Hook:** `~/.claude/hooks/pre-deploy-check.sh`
- **Feedback Log:** `~/.claude/projects/-Users-makinja/memory/feedback_verify_deploy_target_before_code.md`

## Metrics

- **Incident duration:** 32 hours (2026-04-21 13:46 → 2026-04-22 21:41)
- **CEO time lost:** ~2 hours
- **Root causes identified:** 5
- **New rules created:** 4
- **MC tasks spawned:** 11 (parent #8696 + 7 subtasks + 3 original bugs)
- **Lines of ZAKON PI2:** 136
- **Evidence files generated:** 11 (verification.json + 4 PNG + 6 TXT)

## Follow-Up

**Next review:** 2026-04-29 (PI2 implementation deadline)  
**Owner:** John  
**Success criteria:** All 8 items in MC #8696 marked done + CI health green for 7 consecutive days

---

<small>Postmortem by ALAI Skillforge, 2026-04-22  
Credit: ALAI, 2026</small>