Bilko Sentinel — Tier-0 Self-Healing Agent 2026-06-10

Status

LIVE and Proveo-verified as of 2026-06-10. MC #103337 covers both the AgentForge implementation and the Proveo independent verification. Parent MC #103328.

Related: Bilko Observability (GCP-native) 2026-06-10 — the GCP alert layer this agent reads from.

What It Is

Bilko Sentinel is a read-only ops agent that runs on ANVIL every 3 minutes. It follows a four-stage pipeline:

  1. Detect: queries the 10 conditions across the 8 GCP Cloud Monitoring alert policies via the Monitoring REST API (GET only), then evaluates the last 6 minutes of time-series data locally against each condition's threshold.
  2. Enrich: on a breach, fetches recent Cloud Run logs and the current revision/traffic split for the affected service.
  3. Diagnose: calls FORGE Ollama (qwen2.5:7b-instruct-q8_0 at 10.0.0.2:11434) with a structured JSON prompt (temperature 0.1) to produce a root-cause hypothesis and recommended action. If Ollama is unreachable, it falls back to a deterministic template per cause category.
  4. Propose: posts exactly one structured proposal per unique incident to Slack #ceo and email [email protected]. Deduplicates by incident key and does not re-notify the same breach for 24 hours.
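
The local threshold check in the Detect stage can be sketched roughly as follows. This is a minimal illustration, not the code in bilko-sentinel.js: the function name `evaluateCondition`, the point shape, the `GT`/`LT` comparison flag, and the choice to judge the window by its worst point are all assumptions.

```javascript
// Hypothetical sketch of a local threshold check over a 6-minute window.
// evaluateCondition and the point shape are illustrative assumptions.

const WINDOW_MS = 6 * 60 * 1000; // the 6-minute lookback described above

// points: [{ timestamp: ISO-8601 string, value: number }]
// comparison: 'GT' (breach when above threshold) or 'LT' (breach when below)
function evaluateCondition(points, threshold, comparison, now = Date.now()) {
  // Keep only points that fall inside the lookback window.
  const recent = points.filter((p) => {
    const t = Date.parse(p.timestamp);
    return t <= now && now - t <= WINDOW_MS;
  });
  if (recent.length === 0) {
    // Assumption: no data in the window is treated as "not breached".
    return { breached: false, value: null };
  }
  // Assumption: the worst point in the window decides the outcome.
  const values = recent.map((p) => p.value);
  const worst = comparison === 'LT' ? Math.min(...values) : Math.max(...values);
  const breached = comparison === 'LT' ? worst < threshold : worst > threshold;
  return { breached, value: worst };
}

const now = Date.parse('2026-06-10T04:24:00Z');
const sample = [
  { timestamp: '2026-06-10T04:10:00Z', value: 900 }, // older than 6 minutes, ignored
  { timestamp: '2026-06-10T04:20:00Z', value: 120 },
  { timestamp: '2026-06-10T04:23:00Z', value: 310 },
];
console.log(evaluateCondition(sample, 250, 'GT', now)); // { breached: true, value: 310 }
```

Passing `now` explicitly keeps the check deterministic, which makes it easy to replay a past window against a candidate threshold.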

It never changes anything. Proveo independently verified: zero mutating verbs, no GCP mutations of any kind (no run deploy, no set-iam-policy, no SQL writes, no secrets writes). The only HTTP POST in the script goes to the Ollama local inference endpoint, not to googleapis.com.
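
The Diagnose stage's single POST and its deterministic fallback might look like the sketch below. The endpoint host, model tag, and temperature come from the description above; the `/api/generate` path and the `format`, `stream`, and `options` fields are Ollama's documented generate API. Everything else (`diagnose`, `FALLBACKS`, the category names, the prompt text, the 30-second timeout) is a hypothetical stand-in for whatever the real script does.

```javascript
// Hypothetical sketch: ask Ollama for a diagnosis, fall back to a template on any failure.
// diagnose(), FALLBACKS, the categories, and the prompt wording are assumptions.

const OLLAMA_URL = 'http://10.0.0.2:11434/api/generate';
const MODEL = 'qwen2.5:7b-instruct-q8_0';

// Deterministic templates keyed by cause category (illustrative categories only).
const FALLBACKS = {
  latency: {
    hypothesis: 'Latency above threshold; cause undetermined (LLM unavailable).',
    action: 'Review recent revisions and Cloud Run logs.',
  },
  default: {
    hypothesis: 'Threshold breached; cause undetermined (LLM unavailable).',
    action: 'Inspect the alert in the GCP Console.',
  },
};

// fetchImpl is injectable so the fallback path can be exercised without a network.
async function diagnose(incident, fetchImpl = fetch) {
  const fallback = FALLBACKS[incident.category] || FALLBACKS.default;
  try {
    const res = await fetchImpl(OLLAMA_URL, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: MODEL,
        prompt: `Return JSON with keys "hypothesis" and "action" for this incident:\n${JSON.stringify(incident)}`,
        format: 'json', // ask Ollama to constrain output to valid JSON
        stream: false, // single response object instead of a token stream
        options: { temperature: 0.1 },
      }),
      signal: AbortSignal.timeout(30000), // assumed timeout
    });
    if (!res.ok) throw new Error(`Ollama HTTP ${res.status}`);
    const body = await res.json();
    const parsed = JSON.parse(body.response); // generated text is in the "response" field
    if (typeof parsed.hypothesis !== 'string' || typeof parsed.action !== 'string') {
      throw new Error('unexpected response shape');
    }
    return { hypothesis: parsed.hypothesis, action: parsed.action, source: 'ollama' };
  } catch (err) {
    // Unreachable host, timeout, bad status, or malformed JSON all land here.
    return { ...fallback, source: 'fallback' };
  }
}

// Simulate an unreachable Ollama by injecting a fetch that rejects.
const unreachable = async () => { throw new Error('ECONNREFUSED'); };
diagnose({ category: 'latency', value: 9.5 }, unreachable)
  .then((d) => console.log(d.source)); // fallback
```

Treating every failure mode the same way keeps the proposal path alive even when the model returns something unparseable, at the cost of a less specific hypothesis.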

Infrastructure

| Component | Location |
| --- | --- |
| Script | /Users/makinja/system/tools/bilko-sentinel.js |
| LaunchAgent plist | /Users/makinja/Library/LaunchAgents/com.alai.bilko-sentinel.plist |
| State file (dedup) | /Users/makinja/system/state/bilko-sentinel-state.json |
| Audit log | /Users/makinja/system/logs/bilko-sentinel-audit.jsonl |
| Run log | /Users/makinja/system/logs/bilko-sentinel.log |
| Host | ANVIL (makinja local Mac) |
| Schedule | 180-second interval, RunAtLoad=true |
| Node.js path | /opt/homebrew/bin/node |

Policies Monitored (8 policies, 10 conditions)

  1. Cloud SQL CPU utilization high (prod + stage)
  2. Container restart/crash on prod services
  3. HTTP 5xx rate high on bilko-api-demo
  4. HTTP 5xx rate high on bilko-web-demo
  5. Request latency P95 high on prod services (API + Web — 2 conditions)
  6. CIAM — High 429 rate on bilko-api-demo (legacy from MC #103245)
  7. Cloud SQL connections near max on bilko-demo-db
  8. Uptime check failed (app.bilko.cloud + app-api.bilko.cloud — 2 conditions)

Severity Scale

| Label | Meaning |
| --- | --- |
| P1-DOWN | Service is down or uptime check failing |
| P2-DEGRADED | Elevated error rate or restart loop |
| P3-WARN | Latency spike, DB pressure, CIAM abuse rate |
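
One way to read the scale above is as a lookup from condition kind to label. The sketch below is illustrative only: the kind names (`uptime`, `http5xx`, and so on) and the default for unknown kinds are assumptions, and the real script may classify conditions differently.

```javascript
// Hypothetical mapping from a condition kind to the severity labels above.
// Kind names and the unknown-kind default are assumptions for illustration.
const SEVERITY_BY_KIND = new Map([
  ['uptime', 'P1-DOWN'], // uptime check failing
  ['http5xx', 'P2-DEGRADED'], // elevated error rate
  ['restart', 'P2-DEGRADED'], // restart or crash loop
  ['latency', 'P3-WARN'], // latency spike
  ['dbPressure', 'P3-WARN'], // Cloud SQL CPU or connection pressure
  ['ciam429', 'P3-WARN'], // CIAM abuse rate
]);

function severityFor(kind) {
  // Assumption: unknown kinds default to the lowest severity.
  return SEVERITY_BY_KIND.get(kind) ?? 'P3-WARN';
}

console.log(severityFor('uptime')); // P1-DOWN
console.log(severityFor('latency')); // P3-WARN
```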

Notification Format

Every proposal contains:

  • Header: BILKO SENTINEL — PROPOSAL (Tier-0, no action taken)
  • Incident ID, severity, env, resource, condition name
  • Metric value vs threshold (exact numbers)
  • Root-cause hypothesis (Ollama-generated or deterministic fallback)
  • Proposed remediation steps (for human to execute)
  • GCP Console link for the alert incident
  • Detected timestamp
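
The fields listed above could be rendered into a single message along these lines. Only the header string is taken from this page; the function name `formatProposal`, the field names on the incident object, the line layout, and the placeholder console URL are hypothetical.

```javascript
// Hypothetical sketch: render a proposal containing the fields listed above.
// Field names and layout are assumptions; only the header text is from this page.
function formatProposal(i) {
  return [
    'BILKO SENTINEL — PROPOSAL (Tier-0, no action taken)',
    `Incident: ${i.id} | Severity: ${i.severity} | Env: ${i.env}`,
    `Resource: ${i.resource} | Condition: ${i.condition}`,
    `Metric: ${i.value}${i.unit} vs threshold ${i.threshold}${i.unit}`,
    `Hypothesis: ${i.hypothesis}`,
    'Proposed remediation (for a human to execute):',
    ...i.steps.map((step, n) => `  ${n + 1}. ${step}`),
    `Console: ${i.consoleUrl}`,
    `Detected: ${i.detectedAt}`,
  ].join('\n');
}

const message = formatProposal({
  id: 'bilko-23456789-87654321', // illustrative incident ID
  severity: 'P3-WARN',
  env: 'prod',
  resource: 'bilko-api-demo',
  condition: 'Request latency P95 high',
  value: 9.5,
  threshold: 2, // synthetic threshold, as in the verification test
  unit: 'ms',
  hypothesis: 'Illustrative hypothesis text.',
  steps: ['Check recent revisions.', 'Review Cloud Run logs.'],
  consoleUrl: 'https://example.invalid/incident', // placeholder, not a real console link
  detectedAt: '2026-06-10T04:24:00Z',
});
console.log(message);
```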

Dedup key format: bilko-{policyId[-8:]}-{condId[-8:]}. Once notified, silent for 24 hours on the same condition.
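
Reading `[-8:]` as "the last 8 characters", the key and the 24-hour silence window can be sketched as below. The function names, the in-memory `state` object, and the sample IDs are assumptions; the real script persists its state to the state file listed under Infrastructure, and its exact schema is not shown here.

```javascript
// Hypothetical sketch of the dedup key and the 24-hour suppression window.
// dedupKey, shouldNotify, and the state shape are illustrative assumptions.

const DEDUP_WINDOW_MS = 24 * 60 * 60 * 1000;

// bilko-{policyId[-8:]}-{condId[-8:]}: last 8 characters of each ID.
function dedupKey(policyId, condId) {
  return `bilko-${policyId.slice(-8)}-${condId.slice(-8)}`;
}

// state: { [key]: epoch milliseconds of the last notification } (assumed shape)
function shouldNotify(state, key, now = Date.now()) {
  const last = state[key];
  return last === undefined || now - last >= DEDUP_WINDOW_MS;
}

const state = {};
const key = dedupKey('1234567890123456789', '9876543210987654321'); // made-up IDs
console.log(key); // bilko-23456789-87654321

const t0 = Date.parse('2026-06-10T04:24:00Z');
console.log(shouldNotify(state, key, t0)); // true (first sighting)
state[key] = t0; // record the notification
console.log(shouldNotify(state, key, t0 + 3 * 60 * 1000)); // false (next 3-minute cycle)
console.log(shouldNotify(state, key, t0 + DEDUP_WINDOW_MS)); // true (24 hours later)
```

Because the agent exits between 180-second runs, the suppression only works if `state` is written to disk after each notification and read back at the start of the next cycle.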

Proveo Verification Summary

Proveo (MC #103337) independently verified all critical properties:

| Property | Method | Result |
| --- | --- | --- |
| Read-only guarantee | Exhaustive grep of all spawnSync calls and HTTP methods | CONFIRMED: zero mutating verbs |
| LaunchAgent loaded + healthy | launchctl list \| grep bilko-sentinel (LastExitStatus=0) | PASS |
| Detect → Propose → Slack delivery | Independent verifier script with synthetic threshold (2ms vs real 9.5ms P95) | PASS: Slack message confirmed in #ceo at 04:24 UTC |
| Detect → Propose → Email delivery | Same synthetic test | PASS: Message-ID confirmed in audit DB |
| Dedup across cycles | Real 2-cycle disk-persistence test (not code inspection only) | PASS: Cycle 2 silent, no second Slack message |
| Healthy = silent | Normal threshold against real metric value | PASS: zero messages sent |
| No GCP mutation | Cloud Run revision before/after comparison | PASS: bilko-api-demo-00167-h9v unchanged |

Gaps flagged by AgentForge and their current status: the email exit-code quirk is fixed in the script via a stdout check; the 2-cycle dedup test is now independently proven by Proveo; Ollama was not re-exercised in the Proveo test, so that path rests on the builder's synthtest, which confirmed it live.

Runbook

Pause sentinel

launchctl unload ~/Library/LaunchAgents/com.alai.bilko-sentinel.plist

Resume sentinel

launchctl load ~/Library/LaunchAgents/com.alai.bilko-sentinel.plist

Check last run status

launchctl list | grep bilko-sentinel
# PID="-" = not currently running (between intervals). LastExitStatus=0 = healthy.

tail -20 /Users/makinja/system/logs/bilko-sentinel.log

View audit trail

tail -f /Users/makinja/system/logs/bilko-sentinel-audit.jsonl
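
For anything beyond tailing, the audit log can be parsed line by line, since a .jsonl file holds one JSON object per line. The helper below is a minimal sketch: `readJsonl` is a hypothetical name, and it makes no assumption about which fields the audit entries contain.

```javascript
// Hypothetical helper: parse the last N entries of a JSON Lines file.
// It does not assume any particular field names in the audit entries.
const fs = require('node:fs');

function readJsonl(path, limit = 20) {
  const lines = fs
    .readFileSync(path, 'utf8')
    .split('\n')
    .filter((line) => line.trim() !== '');
  const entries = [];
  for (const line of lines.slice(-limit)) {
    try {
      entries.push(JSON.parse(line));
    } catch {
      // Skip a malformed or partially written line rather than abort.
    }
  }
  return entries;
}

const AUDIT_LOG = '/Users/makinja/system/logs/bilko-sentinel-audit.jsonl';
if (fs.existsSync(AUDIT_LOG)) {
  for (const entry of readJsonl(AUDIT_LOG, 5)) {
    console.log(entry);
  }
}
```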

Tune alert thresholds

Edit the ALERT_POLICIES array in /Users/makinja/system/tools/bilko-sentinel.js, then reload the agent:

launchctl unload ~/Library/LaunchAgents/com.alai.bilko-sentinel.plist
# edit the script
launchctl load ~/Library/LaunchAgents/com.alai.bilko-sentinel.plist

Tier Model and Safety Rationale

The tier model was defined after the 2026-06 IAM incident, in which an automated set-iam-policy call wiped project IAM. The lesson: any agent that can mutate production infra must earn trust via a demonstrated read-only track record first.

| Tier | Capability | Status | Safety gates |
| --- | --- | --- | --- |
| Tier 0 (current) | Detect + Diagnose + Propose. Read-only. Posts structured proposal to #ceo and [email protected]. Zero blast radius. | LIVE | No code path to write to GCP. Proveo-verified. |
| Tier 1 (future MC) | Bounded auto-remediation: Cloud Run revision rollback, instance scale adjustment, hung service restart. Circuit breaker (max N actions/hour). Full audit trail. Never touches DB schema, IAM, secrets, or financial data. Always announces before acting. | NOT BUILT, separate MC required | Explicit CEO approval token (/tmp/bilko-sentinel-tier1-approved) required before any mutation. Separate script (bilko-sentinel-tier1.js). Only after Tier-0 proves signal quality over weeks. |
| Tier 2 | Broader autonomy. Probably never for a prod-financial SaaS. | N/A | |
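
Tier 1 is not built, so the following is purely a design sketch of how its two gates, the approval token and the "max N actions/hour" circuit breaker, might combine. The token path is the one named in the table; the class and function names, the sliding-window approach, and the value of N are assumptions.

```javascript
// Hypothetical design sketch only: Tier 1 does not exist yet.
// CircuitBreaker, mayRemediate, and the sliding-window logic are assumptions.
const fs = require('node:fs');

const APPROVAL_TOKEN = '/tmp/bilko-sentinel-tier1-approved'; // path named in the table above
const HOUR_MS = 60 * 60 * 1000;

class CircuitBreaker {
  constructor(maxActionsPerHour) {
    this.max = maxActionsPerHour;
    this.timestamps = []; // times of actions taken within the last hour
  }

  // Returns true and records the action if the hourly budget allows it.
  tryAcquire(now = Date.now()) {
    this.timestamps = this.timestamps.filter((t) => now - t < HOUR_MS);
    if (this.timestamps.length >= this.max) return false;
    this.timestamps.push(now);
    return true;
  }
}

function mayRemediate(breaker, tokenPath = APPROVAL_TOKEN, now = Date.now()) {
  // Check approval first so a denied run does not consume breaker budget.
  if (!fs.existsSync(tokenPath)) return false;
  return breaker.tryAcquire(now);
}

const breaker = new CircuitBreaker(2); // N = 2 is an arbitrary example value
console.log(breaker.tryAcquire(0)); // true
console.log(breaker.tryAcquire(1000)); // true
console.log(breaker.tryAcquire(2000)); // false (budget exhausted)
console.log(breaker.tryAcquire(HOUR_MS + 1)); // true (first action has aged out)
```

A real Tier-1 script would also need to persist the breaker's timestamps between runs, for the same reason the dedup state is persisted: each LaunchAgent invocation starts with empty memory.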

The IAM incident reference is intentional: Tier-1 will be built with a hard whitelist of reversible Cloud Run and scaling operations only. No set-iam-policy, no SQL DDL, no secret rotation — ever.