Lessons Learned — Accumulated Knowledge
2026-04-04: P0 Endpoint Hallucination — LightRAG /upload vs /documents/text
Problem: Builder agent hallucinated /upload endpoint for LightRAG when correct endpoint is /documents/text. Error passed through entire system without detection — code written, deployed, and failed in production demo.
Root Cause Analysis:
- Agent assumed endpoint name based on "sounds right" pattern matching
- No endpoint verification hook in place
- qa-19 quality gate lacked endpoint testing
- Tool-registry.db was inactive — no nightly endpoint audit
Impact: Demo-blocking bug in LumisCare, revealed systemic vulnerability to API hallucinations.
Solution (3-Part):
- P1: Anti-Hallucination Hook. hallucination-detector.py now has a KNOWN_API_ENDPOINTS dict and check_phantom_endpoints().
  - Blocks Write/Edit calls that contain known invalid endpoints
  - Example: /upload → use /documents/text
- P2: Nightly Audit Daemon. tool-sync-audit.js scans all tools for stale endpoints.
  - Tests each HTTP endpoint via HEAD (timeout 3s)
  - Logs to health-events.db
  - Alerts Slack if stale endpoints are found
  - LaunchAgent: com.john.tool-sync-audit (03:00 daily)
- P3: Quality Gate Check. qa-19.js now includes Check #20: Endpoint Verification.
  - Parses GOTCHA for HTTP endpoints
  - Tests each before task completion
  - Blocks mc.js done if endpoints fail
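A minimal sketch of what the P1 check could look like. The real hallucination-detector.py is not shown in these notes, so the dict layout, the matching rule, and the message format here are assumptions; only the names KNOWN_API_ENDPOINTS and check_phantom_endpoints and the /upload → /documents/text pair come from the incident above.

```python
import re

# Assumed layout: service name -> {known-invalid path: correct path}.
# Only the LightRAG pair below comes from the incident.
KNOWN_API_ENDPOINTS = {
    "lightrag": {"/upload": "/documents/text"},
}


def check_phantom_endpoints(content: str) -> list[str]:
    """Return one message per known-invalid endpoint found in `content`."""
    problems = []
    for service, invalid in KNOWN_API_ENDPOINTS.items():
        for bad, good in invalid.items():
            # Require the path to end there, so "/uploads" is not flagged.
            pattern = re.escape(bad) + r"(?![\w/-])"
            if re.search(pattern, content):
                problems.append(
                    f"{service}: {bad} is not a valid endpoint, use {good}"
                )
    return problems
```

A hook built on this would block the Write/Edit call whenever the returned list is non-empty. A deny-list only catches hallucinations that have already been seen once, which is why the audit and the quality gate exist as further layers.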
Rule (Rule 10, agent-anti-hallucination.md):
Before using any HTTP endpoint:
1. curl -s http://localhost:PORT/health
2. Check OpenAPI spec: curl -s http://localhost:PORT/openapi.json
3. Verify in KNOWN_API_ENDPOINTS (hallucination-detector.py)
NEVER assume endpoint exists because it "sounds right"
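Step 2 of the rule can be automated once the spec has been fetched. The sketch below assumes the service exposes a standard OpenAPI document whose top-level "paths" object maps each path to its HTTP methods; `endpoint_in_spec` and `sample_spec` are illustrative names, and the sample is a stand-in, not LightRAG's real spec.

```python
import json


def endpoint_in_spec(spec: dict, method: str, path: str) -> bool:
    """True if the OpenAPI document declares `method` on exactly `path`."""
    operations = spec.get("paths", {}).get(path, {})
    return method.lower() in operations


# Stand-in for the JSON returned by `curl -s http://localhost:PORT/openapi.json`.
sample_spec = json.loads(
    '{"paths": {"/documents/text": {"post": {}}, "/health": {"get": {}}}}'
)

print(endpoint_in_spec(sample_spec, "POST", "/documents/text"))
print(endpoint_in_spec(sample_spec, "POST", "/upload"))
```

The check is exact on both path and method, so a path that exists only for GET does not pass a POST lookup.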
Prevention for future:
- hallucination-detector.py checks all Write/Edit for phantom endpoints
- tool-sync-audit.js catches stale endpoints nightly (03:00)
- qa-19 Check #20 blocks tasks with dead endpoints
- enforcement.json enforces endpoint_check blocking
Lesson: API hallucinations follow a predictable pattern: an agent plus an endpoint name that sounds right produces confident, wrong code. The fix has three layers: hook prevention, nightly audit, and quality gate. The builder agent cannot verify itself, so verification must be external and automated.
2026-02-12: NEVER BUILD from a self-generated spec without CEO approval
Problem: On the DROP project, John wrote the UI/UX spec himself (competitor analysis, 3 design options) and immediately started building the full app: 97 files, 24K LOC, without a single approval of the spec from Alem. Result: code stuck on the wrong git branch, an empty drop-app/ dir, wasted tokens, and Alem did not know what had been built.
Root Cause: There was no approval gate between the Research/Spec phase and the Build phase. John treated a self-generated spec as an approved spec.
Rule (ZAKON, a binding law):
- Research → OK, work freely
- Spec/Proposal draft → OK, work freely
- BUILD → STOP. Explicit CEO approval of the spec BEFORE the first LOC.
- If the CEO has not reviewed the spec, the spec DOES NOT EXIST as a basis for build.
- Self-generated spec ≠ Approved spec. NEVER.
Recovery: fontelepay/ auto-backup branch merged to master. Code recovered.
Fix level: Rule (this file) + HiveMind (#76) + MEMORY.md. The ideal fix would be a hook that blocks build without an approved spec, but the approval gate is a human decision and hard to encode in a hook.
Lesson: AI can write a spec, but only a human can APPROVE a spec. Without approval, a build is a waste of resources.
2026-02-08: Next Steps MUST become MC tasks
Problem: The session log had "Next Steps" but I never created MC tasks for them. Result: 2 actions (Edita MC onboarding + Mini SSH update) were lost because nobody reads the session log automatically.
Root Cause: The session-save workflow writes next steps to markdown but has no step that turns them into MC tasks.
Fix: RULE. Before the end of a session, every "Next Step" from session state MUST become an mc.js add task. Session state is for context, MC is for action. If it is not in MC, it does not exist.
Lesson: Passive documentation (markdown) ≠ active tracking (MC). If something needs to get done, it must be a task.
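The missing conversion step could be sketched as below. The session-log layout is an assumption (a markdown heading named "Next Steps" followed by "- " bullets), and so is the command shape: the generated lines use only `mc.js add` plus a quoted title, with no other flags.

```python
import shlex


def next_steps_to_commands(session_markdown: str) -> list[str]:
    """Turn each bullet under a 'Next Steps' heading into an mc.js add command."""
    commands = []
    in_section = False
    for line in session_markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Any heading either opens or closes the Next Steps section.
            in_section = stripped.lstrip("# ").lower() == "next steps"
            continue
        if in_section and stripped.startswith("- "):
            title = stripped[2:].strip()
            commands.append("mc.js add " + shlex.quote(title))
    return commands
```

Printing the commands rather than running them keeps the sketch side-effect free. A session-save workflow would execute them, then confirm the tasks exist in MC.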
2026-02-04: Task Management + Problem Solving Enforcement
Problem: I skipped task tracking and the problem-solving process, delegated to an agent without proper requirements gathering, and the agent solved the wrong problem.
Root Cause Analysis:
- I did not add the task to tasks.db
- I did not follow the problem-solving.md process (steps 1-6)
- I spawned an agent with the FIRST solution (email infrastructure instead of a client communication system)
- The agent did the PLAN phase solo; it should have been John + client
- I did not finish the Next Steps from SESSION-STATE
Impact: Alem got the wrong solution, time was lost, and a "bigger problem" was created.
Solution Implemented:
- ✅ Created ~/system/tools/start-task.sh, a mandatory validation script
- ✅ Updated MEMORY.md with a CORE PROTOCOL section
- ✅ Documented in lessons-learned.md (here)
- ✅ Added a boot.sh reminder
Validation:
- start-task.sh blocks execution if steps 1-4 are not satisfied
- The checklist forces: task.db entry → problem solving (1-6) → company delegation check
- MEMORY.md is loaded at session start with the reminder
Prevention:
- I NEVER do anything without first running bash ~/system/tools/start-task.sh
- The script is deterministic, so I cannot skip it
- boot.sh shows the reminder at session start
Key Mantras:
- "First solution" ≠ "Best solution"
- Research BEFORE implementation (WebSearch, 2+ sources)
- ALWAYS 2-3 options, not just one
- PLAN phase = John + client, not the agent solo
2026-02-12: Sub-agent Validator Hallucination: "PASS" on the wrong format
Problem: John was changing the Claude Code hooks format in .claude/settings.json. He wrote matcher: {} (an object) instead of matcher: "*" (a regex string). He called a haiku sub-agent as the "tester" and the agent said PASS. Alem started Claude and got the same error.
Root Cause (2 levels):
- John did not read the documentation before changing the config format. He guessed the format from the error message.
- The sub-agent validated John's output instead of independently checking the spec. The Haiku agent has no knowledge of the new hooks format, so it hallucinated that the format was correct.
Impact: Alem hit the error twice, and trust in "tester" agents was lost.
Fix:
- Rule: NEVER change a config/schema format without reading the official documentation (WebFetch/WebSearch)
- Rule: A validator/tester sub-agent MUST be instructed to INDEPENDENTLY check the source of truth (docs URL, spec file), NOT to validate the caller's work
- Anti-pattern: "Check that I did it right" ≠ testing. Testing = independent verification against the spec.
Key Mantras:
- Docs first, code second
- Validator ≠ rubber stamp
- Haiku does not know what it does not know. Do not use it for format verification without docs references.
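The one fact this incident established (matcher must be a string, not an object) is cheap to check deterministically instead of asking a sub-agent. This sketch assumes the settings layout is a top-level "hooks" object mapping event names to lists of entries; that layout and the name `find_bad_matchers` are assumptions to confirm against the official hooks documentation, which remains the source of truth.

```python
def find_bad_matchers(settings: dict) -> list[str]:
    """Report every hook entry whose matcher is present but not a string."""
    errors = []
    for event, entries in settings.get("hooks", {}).items():
        for index, entry in enumerate(entries):
            matcher = entry.get("matcher")
            if matcher is not None and not isinstance(matcher, str):
                errors.append(
                    f"{event}[{index}].matcher must be a string, "
                    f"got {type(matcher).__name__}"
                )
    return errors


# The broken config from the incident, reduced to the relevant field.
print(find_bad_matchers({"hooks": {"PreToolUse": [{"matcher": {}}]}}))
```

A type check like this catches only the known mistake. It does not replace reading the docs before changing a format.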
2026-02-16: UI changes without first checking design references (Drop #979)
Problem: The landing page had a "Virtuelt kort" feature that contradicts Drop's PSD2 pass-through model (no cards, no wallet). When I fixed it to "Kontooversikt", I made the change WITHOUT first checking the Make export. Alem had to say explicitly: "Did you validate it? You have the visuals in MAKE, and that is how the UI should look."
Root Cause: Two failures:
- Nobody validated the original. "Virtuelt kort" got into the code without a check against the Make design, which has NO Cards screen.
- Fix without a reference. I fixed the content from memory instead of first reading the Make export and replicating EXACTLY what is there.
Impact: Luckily the output was correct (Make DID have BankAccounts, not Cards), but the process was wrong. Had Make shown something different, I would have deployed the wrong thing again.
Fix:
- Drop CLAUDE.md: added a "UI Source of Truth" section with the Make export path and the rule "BEFORE any UI change, read Make component"
- visual-verification.md: added step 0, "REFERENCE BEFORE CODE". Changing the UI first and checking the design afterwards is forbidden.
- HiveMind: logged for future context
Lesson: The order is always design → code → verification. Never code → (maybe) verification.
2026-02-16: ALWAYS use the official brand template for company documents
Problem: I created a PDF for the SpareBank 1 pitch and sent it to Alem. First I sent markdown instead of a PDF. Then I made a PDF with the wrong colors (#0B6E35 Drop green instead of #00E5A0 ALAI green) and the wrong cover design (light instead of dark navy), without using the official template. Alem: "Where did you find this template in ALAI? That is not the right one."
Root Cause: There is no rule that forces a check of the brand guidelines and templates BEFORE creating any company-branded document. John improvised the design instead of reading brand-guidelines.md and looking at the template images.
Rule (ZAKON, a binding law):
- For EVERY document with ALAI branding, FIRST read ~/ALAI/brand/brand-guidelines.md
- EVERY document must use the official colors: Primary Green #00E5A0, Dark Navy #0F172A
- EVERY PDF must visually match the templates in ~/ALAI/brand/templates/ (presentation.png for presentations, letter.png for letters, invoice.png for invoices)
- NEVER improvise the brand. If you do not know what it looks like, READ the template before you start.
- The GOTCHA C (Context) section for branded documents MUST contain "brand-guidelines.md read" and list the exact colors
Brand Quick Reference:
- Primary Green: #00E5A0 (NOT #0B6E35, which is Drop green)
- Dark Navy: #0F172A (cover background)
- Bright Green: #22C55E (accent)
- Font: Inter (Regular, Medium, SemiBold, Bold)
- Logo: ~/ALAI/brand/alai-logo-primary.png
- Templates: ~/ALAI/brand/templates/
- Footer: "ALAI Holding AS · Org.nr 932 516 136 · [email protected] · alai.no"
Fix level: Rule (this file) + HiveMind + MEMORY.md
Lesson: A branded document without brand guidelines looks amateurish. Always read the guidelines BEFORE designing, never after.
2026-02-16: A hooks: section in an agent .md OVERRIDES global hooks
Problem: The builder agent for task #1039 wrote code without a GOTCHA checklist. The validator confirmed: /tmp/gotcha-task-1039.md was NOT FOUND. gotcha-enforcer.py never blocked anything because it never ran.
Root Cause: The agent .md files (builder.md, frontend-builder.md, backend-builder.md, design-builder.md) had a hooks: section in their YAML frontmatter. When an agent defines hooks, they REPLACE the global hooks from settings.json; they are NOT merged. Result: ALL PreToolUse enforcement hooks (gotcha-enforcer, plan-enforcer, security-guard, hallucination-detector, pii-scanner) were bypassed.
Impact: 4 agents ran without any enforcement. Ironically, design-validator (the only hook in the agent .md files) was ALREADY registered globally in settings.json, so the local copies were duplicates whose only effect was to block the other hooks.
Fix:
- Removed the hooks: sections from all 4 agents (builder, frontend-builder, backend-builder, design-builder)
- All agents now inherit ALL global hooks from settings.json
- design-validator stays in the global PostToolUse (settings.json lines 142-147)
- Backup: ~/system/backups/setup-changelog/20260216-184634/
Rule (ZAKON, a binding law):
- NEVER add a hooks: section to agent .md files. Always use the global settings.json for hooks.
- If an agent needs a specific hook, add it to the global settings.json with an appropriate matcher
- An agent .md defines only name, model, tools. NEVER hooks.
Fix level: Deterministic (removing hooks: from agent .md) + Rule (this file) + HiveMind (#7191) + CHANGELOG
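To keep the rule from regressing, a small scan can flag any agent file that reintroduces a hooks: key. The sketch assumes the frontmatter is delimited by `---` lines and that hooks: would appear as a top-level key; `frontmatter_has_hooks` is an illustrative name, not an existing tool.

```python
def frontmatter_has_hooks(markdown_text: str) -> bool:
    """True if the YAML frontmatter block contains a top-level hooks: key."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return False  # no frontmatter at all
    for line in lines[1:]:
        if line.strip() == "---":
            return False  # frontmatter ended without a hooks key
        if line.startswith("hooks:"):
            return True
    return False


agent_file = "---\nname: builder\nhooks:\n  PreToolUse: []\n---\nAgent body\n"
print(frontmatter_has_hooks(agent_file))
```

Only the frontmatter is inspected, so the word "hooks:" in an agent's prose body does not trigger a false alarm.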
Vercel Deployment
- Do NOT use the legacy builds + routes in vercel.json. Use the modern approach: { "outputDirectory": "public" }
- The /api folder is detected automatically, so no build config is needed
- Environment variables must be on the RIGHT project. Check with vercel env ls
Resend Email
- A custom domain requires DNS verification in the Resend dashboard
- The API key must be on the same Vercel project as the API endpoint
- Testing: 404 = deploy config problem. 500 = API key/domain problem.
Telegram Bot Auth
- NEVER use direct API key for Telegram bot — use Claude CLI spawn (OAuth)
- API keys run out of credits, OAuth doesn't
- Always verify auth method when implementing bot changes
- Bot file: ~/system/comms/telegram-claude-bridge.js
- LaunchAgent: ~/Library/LaunchAgents/com.john.telegram-bot.plist
General
- Verify tool output format before chaining into another tool
- Don't assume APIs support batch operations — check first
- When a workflow fails mid-execution, preserve intermediate outputs before retrying
- Check that you are on the right project before adding env vars
- Test the endpoint after every deploy
Background Agents & Security Hooks
- Background agents (run_in_background: true) have no write permissions. The security hook blocks Write, Edit, and Bash.
- Use background agents ONLY for research, audit, and reading, never for writing files
- If a background agent needs to write something, return the result to the main session and write from there
- Learned: the EVApp background agent could not create files because the hook blocked it, so we had to do it manually from the main session
Testing
- "HTML exists" ≠ "It works"
- grep/curl is NOT a visual test
- Automated tests are a supplement, NOT a replacement for visual QA
2026-02-04: Problem-Solving Enforcement System
Problem: John was skipping the CORE PROTOCOL and going straight to implementation without analysis.
Root cause: The validation flag was static and never got reset.
Solution implemented:
- boot.sh deletes /tmp/claude-task-validated at the start of the session
- security-guard.py looks for problem-solving documentation in /tmp/claude-problem-solving.md
- The documentation must have 5 sections: PROBLEM, RESEARCH, OPCIJE, EVALUACIJA, ODLUKA (options, evaluation, decision)
- Bootstrap exception: Write is allowed ONLY to the problem-solving file
- When the documentation is complete → auto-validation → flag created
Workflow:
- New session → flag reset → Write/Edit/Bash blocked
- I document the process → hook checks → auto-validates
- Only then can I implement
Files changed:
- ~/system/boot.sh: added flag deletion
- ~/.claude/hooks/security-guard.py: added problem-solving validation
Lesson: Enforcement must be automatic and unavoidable. If it can be skipped, it will be skipped.
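The completeness check at the heart of this system can be sketched as follows. The five section names come from the notes above; how security-guard.py actually detects them is not documented here, so treating each section as a markdown heading is an assumption, as is the name `missing_sections`.

```python
REQUIRED_SECTIONS = ("PROBLEM", "RESEARCH", "OPCIJE", "EVALUACIJA", "ODLUKA")


def missing_sections(document: str) -> list[str]:
    """Return the required section names that have no heading in `document`."""
    headings = set()
    for line in document.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Normalise "## Problem:" to "PROBLEM" before comparing.
            headings.add(stripped.lstrip("#").strip().rstrip(":").upper())
    return [name for name in REQUIRED_SECTIONS if name not in headings]


draft = "## PROBLEM\ntext\n## RESEARCH\ntext\n"
print(missing_sections(draft))
```

An empty result is the condition under which the flag file would be created. Anything else keeps Write/Edit/Bash blocked.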
2026-02-04: Hooks Can Only Approve/Block, NOT Modify
Problem: agent-protocol-enforcer.py returned updatedInput, assuming Claude Code would use the modified prompt. Agents kept asking technical questions.
Root Cause: updatedInput is not supported in the Claude Code hooks API. Hooks can only:
- exit 0 → approve (allow tool call)
- exit 2 → block (reject tool call with stderr message)
Hooks are GATE control, not transformation.
Fix:
- The hook now BLOCKS Task calls without the CORE PROTOCOL marker
- John must explicitly add the protocol to every agent prompt
- Built-in types (Explore, Plan, Bash) are exempt; they have their own instructions
File: ~/.claude/hooks/agent-protocol-enforcer.py
Lesson: Do not assume a feature exists. Test that the hook REALLY works the way you think.
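A gate-only hook along these lines might look like the sketch below. The payload field names (tool_name, tool_input, subagent_type, prompt) and the literal marker text are assumptions to check against the hooks documentation and the real agent-protocol-enforcer.py; the exit codes and the exempt types come from the notes above.

```python
import json
import sys

MARKER = "CORE PROTOCOL"               # assumed marker text
EXEMPT = {"Explore", "Plan", "Bash"}   # built-in types named in the notes


def decide(payload: dict) -> tuple[int, str]:
    """Return (exit_code, stderr_message): 0 approves, 2 blocks."""
    if payload.get("tool_name") != "Task":
        return 0, ""
    tool_input = payload.get("tool_input", {})
    if tool_input.get("subagent_type") in EXEMPT:
        return 0, ""
    if MARKER in tool_input.get("prompt", ""):
        return 0, ""
    return 2, f"Blocked: agent prompt is missing the {MARKER} marker"


def main() -> int:
    # The hook payload is read as JSON from stdin (assumed input channel).
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code

# A real hook script would end with: sys.exit(main())
```

Keeping the decision in a pure function makes it possible to test the hook against sample payloads before trusting it in a live session.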
2026-02-04: DocuSeal — Paid Only
Problem: We used DocuSeal for digital signing of the NDA/contract with Wizard NUF. It did not work.
Root Cause: DocuSeal has no free plan; it requires a paid subscription for production use.
Impact: Wizard NUF onboarding was left without signed documents. The pipeline was tested, but phases 3 (NDA) and 5 (Contract) are incomplete.
Next: Task #52, find a digital-signature alternative that has a free tier or is self-hosted.
Lesson: Before integrating with a SaaS tool, check pricing and limits. "Free trial" ≠ "Free tier".
2026-02-17: Skipped /hop-build pipeline, output is no good (Drop #1309)
Problem: On task #1309 (Drop mobile production build), John skipped the /hop-build pipeline. Instead he manually spawned 3 builder agents in parallel, wrote a surface-level GOTCHA checklist just to get past the hook, and did not use validator agents. Result: Alem got an APK that "is no good". ZAKON #0 was broken AGAIN.
Root Cause (from analysis):
- No enforcement for /hop-build: gotcha-enforcer checks the GOTCHA checklist but does NOT check whether the hop-build PROCESS was used
- Skill invocation is voluntary: no hook detects "you should have used /hop-build but did not"
- Builder spawn without process state: orchestrator-delegation-enforcer allows a direct builder spawn and does not distinguish "via hop-build" from "manual"
- MEDIUM priority has no plan enforcer: plan-enforcer.py requires a plan JSON only for HIGH priority
Impact: 3 builder agents worked without a proper plan, validators, or a verification phase. The output was deployed to Expo without validation. Alem said explicitly: "what you gave me is no good" and "start over".
Fix (tiered):
- Hook (WARNING): gotcha-enforcer.py CHECK 5 warns when a MEDIUM+ task has no /tmp/hop-build-started-{id} marker
- Skill update: /hop-build Phase 1 now creates the marker file
- ZAKON #5: "Every implementation task MUST use /hop-build" (MEMORY.md)
- Lessons-learned: this entry
Why WARNING and not BLOCK: It is a new rule and needs a validation period. If the false positive rate proves low, it will be escalated to exit 2 (BLOCK).
Lesson: The GOTCHA checklist is "think before coding". /hop-build is "follow the coding PROCESS". One without the other is half-assed. Task #1309 proves it: thinking without process → shortcuts → broken output.
2026-02-04: Agents must know about the system
Problem: When agents get stuck, they ask instead of using the problem-solving process.
Root Cause: I gave agents no information ABOUT the system, only the task. They do not know that /tmp/claude-problem-solving.md exists.
Fix: Created ~/system/agents/BOOTSTRAP.md. Every agent prompt starts with "Read BOOTSTRAP.md".
Lesson: An agent without context about the system will work ad hoc. It must know HOW we solve problems, not just WHAT needs doing.
Lesson Learned: PI Orchestrator Task Routing Failures
Date: 2026-03-11
Context: World-Class Gap Analysis — 13 parallel tasks
Impact: 4+ hours delay, 3 rounds of manual re-dispatching
Root Causes Found
1. delegate_task → Event Bus drops tasks silently
- Dispatched 13 tasks via delegate_task, only 4-6 arrived as MC tasks
- No error returned — delegate_task says "Event emitted" but no guarantee of delivery
- Fix needed: Event bus must ACK with MC task ID, or delegate_task must verify creation
2. Owner mismatch: delegate_task assigns to "pi-orchestrator" but orchestrator queries --owner john
- pi-orchestrator.js line 1087: next-task --owner john
- delegate_task creates tasks with owner = "pi-orchestrator"
- Result: tasks invisible to orchestrator
- Fix needed: Either delegate_task should set owner=john, OR orchestrator should query both owners
3. mc.js start puts tasks in "in_progress" — orchestrator only picks up "open"
- When manually starting tasks with mc.js start, status becomes "in_progress"
- next-task only returns "open" status tasks
- Result: manually started tasks never get picked up
- Fix needed: Orchestrator should also consider "in_progress" tasks that have no active worker, OR document that mc.js start should NOT be used for orchestrator-managed tasks
4. Classifier sends research tasks to human-queue (complexity=5)
- Gap analysis research tasks classified as complexity=5 → auto-routed to human-queue
- These are research/analysis, not architecture decisions — complexity=4 is appropriate
- Fix needed: Classifier prompt should distinguish "deep research" from "architecture decision requiring human"
5. Classifier sends tasks to qwen3:8b which fails on complex analysis
- Some tasks misclassified as complexity=1/devops → qwen3:8b on forge → fails
- Fix needed: Minimum complexity floor for H-priority tasks (never < 3)
Correct Workflow (Until Fixed)
- Create tasks directly with mc.js add "title" --priority H --owner john
- Do NOT use mc.js start; let the orchestrator pick them up
- Do NOT rely on delegate_task for batch dispatching; verify MC task creation
- After delegate_task, always check mc.js list --owner john --status open to confirm
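The confirmation step can be made mechanical. The sketch below combines root causes 1 to 3: a dispatched task counts as delivered only if it exists, is owned by john, and is still open. The record shape (dicts with title, owner, status) is an assumption about what a parsed mc.js list output could look like, and `invisible_to_orchestrator` is an illustrative name.

```python
def invisible_to_orchestrator(dispatched, tasks, owner="john"):
    """Return dispatched titles the orchestrator would never pick up."""
    visible = {
        task["title"]
        for task in tasks
        if task["owner"] == owner and task["status"] == "open"
    }
    return [title for title in dispatched if title not in visible]


tasks = [
    {"title": "A", "owner": "john", "status": "open"},
    {"title": "B", "owner": "pi-orchestrator", "status": "open"},  # wrong owner
    {"title": "C", "owner": "john", "status": "in_progress"},      # started by hand
]
# "D" was dispatched but silently dropped by the event bus.
print(invisible_to_orchestrator(["A", "B", "C", "D"], tasks))
```

Running this after every batch dispatch turns a silent drop into a visible list of titles to re-create.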
Systemic Fix Required
- Event bus delivery guarantee (at-least-once with ACK)
- Owner alignment: delegate_task → owner=john
- Classifier: H-priority → minimum complexity=3
- Classifier: "research/analysis" domain → never human-queue
- mc.js: add a reopen command to reset in_progress → open
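The two classifier rules could be applied as a post-processing step on whatever the classifier returns. This is a sketch under assumptions: the function name, the domain labels, and the "agent-queue" name are hypothetical. Only the thresholds (floor of 3 for H priority, complexity 5 routing to human-queue, complexity 4 for research) come from the findings above.

```python
def route(priority: str, domain: str, complexity: int) -> tuple[int, str]:
    """Apply the floor and research rules, then pick a queue."""
    if priority == "H":
        # H-priority work never goes below complexity 3.
        complexity = max(complexity, 3)
    if domain in ("research", "analysis") and complexity >= 5:
        # Deep research is not an architecture decision: cap at 4.
        complexity = 4
    queue = "human-queue" if complexity >= 5 else "agent-queue"
    return complexity, queue


print(route("H", "devops", 1))
print(route("H", "research", 5))
print(route("M", "architecture", 5))
```

Enforcing the rules in code rather than in the classifier prompt means a misclassification cannot undercut them.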
CI/CD & Production Monitoring (2026-03-12)
Incident: getdrop.no served drop-app instead of landing page for 7 days. No one noticed except CEO.
Root Cause
- AWS App Runner silently claimed getdrop.no as custom domain during a deploy session
- No automated check verifies "what content does our domain actually serve?"
- No uptime/content monitoring on any production URL
- CEO is the monitoring system — not scalable
Lessons
- Every production URL must have a smoke test — not just health check, but CONTENT verification (expected title, expected response body)
- Domain ownership must be explicit and audited — document which service owns which domain. Alert on any change.
- Deploy pipelines must verify the DESTINATION, not just the build — ZAKON #10 says "verify on destination" but we only verify locally
- CI must GATE deploy — deploy should require CI pass. Currently deploy is independent of CI.
- Infrastructure changes (DNS, custom domains, TF apply) must go through PR review — never ad-hoc CLI commands
- One fix for ALL products, not per-product — every fix must be systemic, applied to Drop AND Tok AND Bilko AND Lobby AND Plock AND BasicFakta
Required Actions (systemic, all products)
- Uptime monitoring for ALL production URLs (UptimeRobot/Checkly)
- Smoke test cron: verify content, not just HTTP 200
- Deploy gate: CI pass required before deploy
- Post-deploy verification: health + content + screenshot
- Domain audit: document service→domain mapping, alert on changes
- Terraform plan in PR (never ad-hoc apply)
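A content-level smoke check can be as small as the sketch below. It takes an already-fetched status and body, so the HTTP client is left to the caller; the expected title per domain is something each product would have to define, and the titles in the example are placeholders.

```python
import re


def content_check(status: int, body: str, expected_title: str) -> list[str]:
    """Return a list of failures; an empty list means the page is the right one."""
    failures = []
    if status != 200:
        failures.append(f"expected HTTP 200, got {status}")
    match = re.search(r"<title[^>]*>(.*?)</title>", body, re.IGNORECASE | re.DOTALL)
    title = match.group(1).strip() if match else None
    if title != expected_title:
        failures.append(f"expected title {expected_title!r}, got {title!r}")
    return failures


# A healthy response from the wrong service still fails the check.
print(content_check(200, "<html><title>Drop App</title></html>", "Drop"))
```

This is the case the incident exposed: the wrong app answered with HTTP 200, so a status-only health check would have stayed green for all 7 days.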
2026-04-08: Testing Failure — Agents Write Tests That Cannot Fail (Drop)
Analysis by: Petter Graff + James Whittaker framework
Context: 10+ consecutive CEO test failures. CEO found bugs in 5 minutes that 1232 E2E tests missed.
Full analysis: ~/system/rules/lessons-learned-testing-2026-04-08.md
Root Cause
Test agents design tests to PASS, not to FIND BUGS. This is a design philosophy failure, not a quantity failure.
Measured failures in Drop E2E suite:
- 48 instances of test.skip(): tests that lie about coverage
- 100+ instances of .catch(() => false): failures silently swallowed
- 10 of 29 pages tested for basic load (19 pages never visited)
- 0% of tests use UI navigation (all use page.goto() direct URLs)
- All BankID tests mock the response, testing the mock, not the app
- Tests accept multiple outcomes: expect([200, 404]), so 404 is accepted as "ok"
The 5 Behavioral Differences (CEO vs Agent)
- CEO tests EXPECTATIONS. Agents test ASSERTIONS.
- CEO tests JOURNEYS. Agents test COMPONENTS.
- CEO tests WHAT EXISTS. Agents test WHAT THEY BUILT.
- CEO tests CURRENT STATE (full regression). Agents test WHAT CHANGED.
- CEO STOPS on ambiguity. Agents SKIP on ambiguity.
Anti-Patterns (BANNED in all E2E tests)
- test.skip() without a linked issue
- .catch(() => false) or .catch(() => {}) in test assertions
- expect([200, 404]).toContain(status): accepting failure as success
- page.goto() for navigation after initial load; must click UI elements
- page.route() mocking in E2E tests; test the real app
- Hardcoded page lists; must discover pages from find src/app -name "page.tsx"
- .first() on ambiguous locators; be specific
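Several of these bans are pattern-shaped and can be enforced by a scan over the spec files. The sketch below covers four of them with regular expressions. It is deliberately strict (it flags every test.skip(, without checking for a linked issue) and the patterns are approximations, not a parser.

```python
import re

BANNED = {
    "test.skip": re.compile(r"\btest\.skip\("),
    "swallowed catch": re.compile(r"\.catch\(\s*\(\)\s*=>\s*(false|\{\s*\})\s*\)"),
    "multi-status expect": re.compile(r"expect\(\s*\[[^\]]*\b404\b[^\]]*\]\s*\)"),
    "page.route mock": re.compile(r"\bpage\.route\("),
}


def scan(source: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every banned pattern found."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for name, pattern in BANNED.items():
            if pattern.search(line):
                hits.append((number, name))
    return hits


spec = """test.skip('x', async () => {});
const ok = await el.isVisible().catch(() => false);
expect([200, 404]).toContain(status);
await expect(page).toHaveTitle('Drop');
"""
print(scan(spec))
```

Wired into CI as a failing check, a scan like this stops the anti-patterns from re-entering the suite. The navigation and locator rules need review or a real AST pass, since a regex cannot tell an initial page.goto() from a later one.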
Required Patterns
- One expected outcome per assertion (not multiple acceptable)
- Click UI elements, don't goto URLs
- Test against deployed URL, not localhost
- Every page in the app must be visited (crawl-and-verify after each deploy)
- Financial data must be numerically verified (not string presence)
- Tests must FAIL when things break, never SKIP
Whittaker's 7 Tours (run after every deploy)
- Guidebook Tour — follow the primary user path by clicking, not goto
- Money Tour — verify every number on every screen (fee, rate, total, recipient)
- Landmark Tour — navigate ONLY via visible UI elements
- Intellectual Tour — test hardest features with complex inputs
- FedEx Tour — follow data creation to completion and verify it matches
- Garbage Collector Tour — visit ALL pages, including least-used ones
- Bad Neighborhood Tour — re-test every previous CEO-found bug scenario
Solution Implemented
- James Whittaker agent: ~/system/agents/identities/james-whittaker.md
- Drop E2E Whittaker tours: /Users/makinja/ALAI/products/Drop/tests/e2e/whittaker-tours.spec.ts
- 30 Scenario checklist: /Users/makinja/ALAI/products/Drop/tests/DROP-30-SCENARIOS.md
- Feedback rule: ~/.claude/projects/-Users-makinja/memory/feedback_testing_root_cause.md
Lesson: 1232 tests that skip on failure are worse than 10 tests that actually fail when things break. The required shift: tests must be designed to FIND bugs, not to PASS.
2026-06-12 — Generalizable process fixes (SnowIT-SEO OAuth session) — apply to ALL projects/clients
Memo: ~/.claude/projects/-Users-makinja/memory/feedback_generalizable_corrections_2026-06-12.md.
- B — Verify subagent claims by live outcome, not their word. A subagent reported OAuth "FULLY ACTIVE" while prod had a silent credential mismatch (new client_id + old leftover secret). Never relay an agent's "done/works" to CEO without an independent live check. Put in every dispatch brief.
- C — Verify external-platform assumptions at the vendor's own source before architecting. A whole plan branch rested on a false "scopes are restricted -> $15-30k CASA" assumption; Google's own console showed Sensitive/non-sensitive. Confirm scope tiers / quotas / API existence / pricing from the vendor, not memory or a prior agent.
- D — Cloud tenant isolation (now a ~/CLAUDE.md guardrail). Per-client/product cloud resources go in that tenant's own project/account, never a shared default.
- E — Validate the credential PAIR when changing an ID/secret. Probe the provider token endpoint with a dummy code: invalid_grant = valid pair; invalid_client = mismatch. Do before declaring an integration live; include in deploy-brief evidence.
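The interpretation step in rule E can be captured in a few lines. The sketch assumes the provider's token endpoint returns a standard OAuth 2.0 JSON error body with an "error" field; sending the probe request itself is left out, and `classify_probe` with its return labels is an illustrative name.

```python
import json


def classify_probe(response_body: str) -> str:
    """Interpret a token-endpoint reply to a probe sent with a dummy code."""
    try:
        error = json.loads(response_body).get("error")
    except (json.JSONDecodeError, AttributeError):
        return "inconclusive"  # not JSON, or not a JSON object
    if error == "invalid_grant":
        return "pair_valid"     # client authenticated, only the code was rejected
    if error == "invalid_client":
        return "pair_mismatch"  # ID and secret do not belong together
    return "inconclusive"


print(classify_probe('{"error": "invalid_client"}'))
```

The probe works because the provider authenticates the client before it looks at the code: a dummy code can only be rejected as invalid_grant once the ID/secret pair has been accepted. Any other reply is treated as inconclusive and never as a pass.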