AAOS — ALAI Agent Operating System
Executive Summary
AAOS is the enforcement runtime for the ALAI agent system. It turns optional protocols (RAG-first, GOTCHA, evidence tracking, quality gates) into mandatory runtime gates that every agent passes through on every lifecycle transition.
Core insight: Enforcement belongs at state transitions, not at every tool call. Per-tool-call enforcement caused 348 blocks per session, which made the system unusable. AAOS instead runs 4 gates at 4 lifecycle transitions, which has proven workable.
Spec file: ~/system/specs/aaos-architecture.md
Deployed: 2026-04-02
MC Task: #6921
Architecture Layers
Layer 5: INTERFACE — John (Orchestrator) | MC Dashboard | Slack | CLI
Layer 4: ORCHESTRATION — pi-orchestrator.js | team-coordinator.js | pipeline-engine.js
Layer 3: ENFORCEMENT — Spawn Gate | Exec Gate | Claim Gate | Close Gate
Layer 2: LIBRARY — Tool Registry | Skill Registry | RAG Index | Agent Registry | Context Assembler
Layer 1: COMPUTE — Ollama ANVIL (12 models) | Ollama FORGE (7 models) | Claude API | Local Tools
Layer 0: PERSISTENCE — SQLite (54 DBs) | Filesystem | HiveMind | Qdrant (vector search)
The 4 Enforcement Gates
| Gate | When | Checks | Implementation |
|---|---|---|---|
| SPAWN GATE | Agent creation | MC task exists & in_progress, GOTCHA written (H/M priority), team composition meets minimum, budget check | kernel/spawn-gate.js + pi-orchestrator Step 4.5 |
| EXEC GATE | During execution | WIP limit (max 3), tool whitelist, budget cap, timeout | Existing hooks (alai-hooks binary) |
| CLAIM GATE | Before "done" | All claims labeled L0-L4, no L0/L1 in final report, evidence artifacts exist | kernel/claim-gate.js |
| CLOSE GATE | Task completion | QA-19 score meets threshold, metrics recorded to agent_metrics, learning posted to HiveMind | mc.js done handler |
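The table above can be read as a dispatch from lifecycle transition to gate. The following is a minimal sketch of that idea, not the actual implementation: the transition keys, check names, and the shape of `ctx` are illustrative assumptions, and only one or two checks per gate are shown.

```javascript
// Hypothetical sketch: one gate per lifecycle transition.
// Transition keys, check names, and the ctx shape are assumptions for illustration.
const GATES = {
  spawn: [
    ['task in_progress', (ctx) => ctx.task?.status === 'in_progress'],
    ['budget available', (ctx) => ctx.budgetRemaining > 0],
  ],
  exec: [
    ['WIP limit (max 3)', (ctx) => ctx.wipCount <= 3],
  ],
  claim: [
    ['no L0/L1 claims', (ctx) => ctx.claims.every((c) => !['L0', 'L1'].includes(c.level))],
  ],
  close: [
    ['metrics recorded', (ctx) => ctx.metricsRecorded === true],
  ],
};

// Run every check registered for a transition and collect the names of failed checks.
function runGate(transition, ctx) {
  const checks = GATES[transition];
  if (!checks) throw new Error(`Unknown transition: ${transition}`);
  const failures = checks.filter(([, check]) => !check(ctx)).map(([name]) => name);
  return { transition, passed: failures.length === 0, failures };
}

// A task that is still 'open' fails the spawn gate even though budget remains.
const spawnResult = runGate('spawn', { task: { status: 'open' }, budgetRemaining: 10 });
console.log(spawnResult);
```

Because checks run only when a transition happens, an agent can make any number of tool calls between transitions without hitting a gate, which is the point of the core insight above.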
Trust Levels (ZAKON #21)
| Level | Meaning | Allowed |
|---|---|---|
| L0 | Unverified — agent says "done" with no evidence | ❌ Never to CEO |
| L1 | Self-Tested — agent ran its own tests | ❌ Never to CEO |
| L2 | Peer-Tested — validator or tester confirmed | ✅ Minimum for reports |
| L3 | Machine-Verified — exit codes, HTTP responses, DOM checks | ✅ Required for aggregate claims |
| L4 | Human-Verified — Alem confirmed | ✅ Gold standard |
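As a rough illustration of how the CLAIM GATE rules could be applied to these levels, the sketch below checks that every claim carries a label, that nothing below L2 appears in a final report, and that listed evidence files exist. The claim shape (`text`, `level`, `evidence`) and the function name are assumptions; the real logic in kernel/claim-gate.js is not shown in this document and may differ.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const LEVELS = ['L0', 'L1', 'L2', 'L3', 'L4'];
const MIN_REPORT_LEVEL = 'L2'; // "Minimum for reports" in the table above

// Hypothetical claim shape: { text, level, evidence: [filePath, ...] }
function checkClaims(claims) {
  const problems = [];
  for (const claim of claims) {
    const rank = LEVELS.indexOf(claim.level);
    if (rank === -1) {
      problems.push(`unlabeled claim: "${claim.text}"`);
    } else if (rank < LEVELS.indexOf(MIN_REPORT_LEVEL)) {
      problems.push(`${claim.level} claim not allowed in final report: "${claim.text}"`);
    }
    // Every evidence path that a claim lists must exist on disk.
    for (const file of claim.evidence ?? []) {
      if (!fs.existsSync(file)) problems.push(`missing evidence artifact: ${file}`);
    }
  }
  return { passed: problems.length === 0, problems };
}

// Usage with a throwaway evidence file in the OS temp directory.
const evidenceFile = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'evidence-')), 'exit-code.txt');
fs.writeFileSync(evidenceFile, '0\n');

const report = checkClaims([
  { text: 'API returns 200', level: 'L3', evidence: [evidenceFile] },
  { text: 'Looks fine to me', level: 'L0', evidence: [] },
]);
console.log(report);
```

This sketch only verifies artifacts that a claim lists; whether an L2+ claim with no listed evidence should also fail is a policy choice the table does not settle.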
Library-in-the-Middle
The Library is a Node.js module (kernel/library.js) that unifies access to all existing stores. Agents don't browse ~/system/ looking for files; they call the Context Assembler, which returns exactly what they need within a token budget.
API
const os = require('os');
const path = require('path');
// Node's require() does not expand '~', so resolve the home directory explicitly
const library = require(path.join(os.homedir(), 'system', 'kernel', 'library.js'));
// Assemble full context for an agent on a task
library.assemble(taskId, agentId);
// → { coreProtocol, agentPersona, projectContext, ragContext, skillSet, toolWhitelist, rules, tokenBudget }
// Individual registries
library.tools.search(query); // Search 1310 tools
library.tools.audit(toolName, agentId, taskId); // Record usage
library.skills.forAgent(agentId); // Cookbook-matched skills
library.context.rag(query, limit); // HiveMind semantic search
library.agents.roster(taskType, priority); // Recommended team composition
library.rules.forTask(taskType); // Relevant ZAKONs
Token Budgets
| Model | Max Context Tokens |
|---|---|
| Claude Opus | 32,000 |
| Claude Sonnet | 16,000 |
| Claude Haiku | 4,000 |
| Ollama 32B | 8,000 |
| Ollama 8B | 4,000 |
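One way a Context Assembler could respect these budgets is to walk the context sections in priority order and keep each one that still fits. The sketch below assumes a crude estimate of roughly 4 characters per token and uses hypothetical model keys; it is not how library.assemble() is known to work.

```javascript
// Budgets copied from the table above; the keys are illustrative labels, not real model IDs.
const TOKEN_BUDGETS = {
  'claude-opus': 32000,
  'claude-sonnet': 16000,
  'claude-haiku': 4000,
  'ollama-32b': 8000,
  'ollama-8b': 4000,
};

// Crude assumption: roughly 4 characters per token.
const estimateTokens = (text) => Math.ceil(text.length / 4);

// Walk sections in priority order, keeping each one that still fits in the budget.
function fitToBudget(sections, model) {
  const budget = TOKEN_BUDGETS[model];
  if (budget === undefined) throw new Error(`No budget for model: ${model}`);
  const kept = [];
  let used = 0;
  for (const section of sections) {
    const cost = estimateTokens(section.text);
    if (used + cost > budget) continue; // too large, try the next (lower-priority) section
    kept.push(section.name);
    used += cost;
  }
  return { kept, used, budget };
}

const fitted = fitToBudget(
  [
    { name: 'coreProtocol', text: 'x'.repeat(8000) }, // about 2000 tokens
    { name: 'ragContext', text: 'x'.repeat(12000) }, // about 3000 tokens
    { name: 'rules', text: 'x'.repeat(4000) }, // about 1000 tokens
  ],
  'claude-haiku'
);
console.log(fitted); // ragContext is dropped: 2000 + 3000 would exceed 4000
```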
Team Composition Rules
Config: ~/system/config/team-templates.json
| Task Type | Min Team | Required Roles |
|---|---|---|
| Trivial fix | 1 | Builder only |
| Feature (M priority) | 3 | Builder + Validator + Tester |
| Feature (H priority) | 5 | Builder + Validator + 2 Testers + Security |
| Architecture | 3 | Architect + Devil's Advocate + Validator |
| Deploy | 3 | Builder + DevOps + Validator |
| Financial | 3 | Builder + Finance + Validator |
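A team can be checked against these rules by counting members per role. The sketch below mirrors the table in an in-memory object; the key names and field names (`minTeam`, `roles`) are assumptions, since the actual layout of team-templates.json is not shown here.

```javascript
// Hypothetical in-memory shape; the real ~/system/config/team-templates.json may differ.
const TEAM_TEMPLATES = {
  'trivial-fix': { minTeam: 1, roles: { builder: 1 } },
  'feature-m': { minTeam: 3, roles: { builder: 1, validator: 1, tester: 1 } },
  'feature-h': { minTeam: 5, roles: { builder: 1, validator: 1, tester: 2, security: 1 } },
  architecture: { minTeam: 3, roles: { architect: 1, 'devils-advocate': 1, validator: 1 } },
  deploy: { minTeam: 3, roles: { builder: 1, devops: 1, validator: 1 } },
  financial: { minTeam: 3, roles: { builder: 1, finance: 1, validator: 1 } },
};

// Compare a proposed team against the template and list what is missing.
function validateTeam(taskType, team) {
  const template = TEAM_TEMPLATES[taskType];
  if (!template) throw new Error(`Unknown task type: ${taskType}`);
  const counts = {};
  for (const member of team) counts[member.role] = (counts[member.role] ?? 0) + 1;
  const missing = Object.entries(template.roles)
    .filter(([role, needed]) => (counts[role] ?? 0) < needed)
    .map(([role, needed]) => `${role} (${counts[role] ?? 0}/${needed})`);
  if (team.length < template.minTeam) {
    missing.push(`team size (${team.length}/${template.minTeam})`);
  }
  return { ok: missing.length === 0, missing };
}

// Agent IDs are illustrative and follow the file-name style used in this document.
const teamCheck = validateTeam('feature-h', [
  { agentId: 'hadi-hariri', role: 'builder' },
  { agentId: 'angie-jones', role: 'tester' },
  { agentId: 'james-bach', role: 'tester' },
]);
console.log(teamCheck);
```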
Specialist Agents
22 agents total in specialist-mapping.json. Key additions (2026-04-02):
Builders (Write/Edit access)
| Agent | Company | Domain | Expertise |
|---|---|---|---|
| Hadi Hariri | CodeCraft | Kotlin/Ktor | Kotlin, Ktor, coroutines, Gradle, JVM optimization |
| Lee Robinson | CodeCraft | Next.js 15 | App Router, React Server Components, Tailwind, Vercel |
Testers (READ-ONLY — no Write/Edit)
| Agent | Company | Focus | Style |
|---|---|---|---|
| Angie Jones | Proveo | Test automation | Frameworks, E2E, API contracts, regression |
| James Bach | Proveo | Exploratory testing | Skeptical, edge cases, "what would a real user do?" |
| Lisa Crispin | Proveo | Agile testing | Business rules, acceptance criteria, Given/When/Then |
| Dorota Huizinga | Proveo | Performance testing | Load testing, chaos engineering, p50/p95/p99 latencies |
Tester Assignment Rule
- H-priority: All 4 testers (minimum 3)
- M-priority: Angie Jones + 1 other (minimum 2)
- L-priority: Angie Jones (minimum 1)
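The rule above could be expressed as a small selection function. This sketch treats Angie Jones as required for M and L priority (one reading of the rule) and assumes agent IDs follow the file-name style used elsewhere in this document; the function name and return shape are hypothetical.

```javascript
// Tester IDs are assumed, based on the naming style of the agent definition files.
const ANCHOR = 'angie-jones';
const OTHERS = ['james-bach', 'lisa-crispin', 'dorota-huizinga'];

// Pick testers for a priority from the currently available pool.
function assignTesters(priority, available = [ANCHOR, ...OTHERS]) {
  const anchor = available.includes(ANCHOR) ? [ANCHOR] : [];
  const others = OTHERS.filter((id) => available.includes(id));
  const plans = {
    H: { testers: [...anchor, ...others], minimum: 3 }, // all available, at least 3
    M: { testers: [...anchor, ...others.slice(0, 1)], minimum: 2 }, // Angie + 1 other
    L: { testers: anchor, minimum: 1 }, // Angie only
  };
  const plan = plans[priority];
  if (!plan) throw new Error(`Unknown priority: ${priority}`);
  return { ...plan, ok: plan.testers.length >= plan.minimum };
}

console.log(assignTesters('H'));
console.log(assignTesters('M'));
console.log(assignTesters('L', ['james-bach'])); // ok: false, the anchor tester is unavailable
```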
Database Schema (New Tables)
All in ~/system/databases/mission-control.db
agent_metrics
CREATE TABLE agent_metrics (
id INTEGER PRIMARY KEY AUTOINCREMENT,
agent_id TEXT NOT NULL, -- e.g., 'bruce-momjian'
task_id INTEGER, -- MC task ID
qa_score REAL, -- QA-19 score (0-19)
token_count INTEGER, -- tokens consumed
duration_seconds INTEGER, -- wall clock time
escalated BOOLEAN DEFAULT 0, -- task escalated to higher model?
model_used TEXT, -- e.g., 'sonnet', 'qwen3:32b'
claim_count INTEGER DEFAULT 0,
evidence_count INTEGER DEFAULT 0,
defects_found INTEGER DEFAULT 0,
trust_level TEXT DEFAULT 'L0', -- L0-L4
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
team_composition
CREATE TABLE team_composition (
id INTEGER PRIMARY KEY AUTOINCREMENT,
task_id INTEGER NOT NULL,
role TEXT NOT NULL, -- builder, validator, tester, security
agent_id TEXT NOT NULL,
assigned_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
library_usage
CREATE TABLE library_usage (
id INTEGER PRIMARY KEY AUTOINCREMENT,
task_id INTEGER,
agent_id TEXT,
tool_name TEXT,
skill_name TEXT,
used_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
Pi-Orchestrator Integration
Wired 2026-04-02. Backup: pi-orchestrator.js.bak-aaos-20260402
- Imports (lines 66-72): library.js + spawn-gate.js with graceful degradation
- Spawn Gate (Step 4.5, line 3288): Advisory check before task claim; logs a warning if the gate fails, doesn't block pi-orch
- Library Context (lines 770-782): RAG preloading via library.assemble() injected into buildPrompt()
- Prompt Template (line 928): aaosContextBlock added between contextBlock and projectContextBlock
Graceful degradation: If AAOS modules fail to load, pi-orchestrator works exactly as before.
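The graceful-degradation and advisory behavior described above follows a common Node.js pattern: wrap the optional `require` in try/catch and treat a missing module as "feature off". The sketch below is illustrative only; the relative path and the `check` method name are assumptions, not the actual pi-orchestrator code.

```javascript
// Try to load an optional module; return null instead of throwing if it is missing or broken.
function loadOptional(modulePath) {
  try {
    return require(modulePath);
  } catch (err) {
    console.warn(`[aaos] ${modulePath} unavailable, continuing without it: ${err.message}`);
    return null;
  }
}

// Hypothetical path; the module will usually not exist where this sketch is run.
const spawnGate = loadOptional('./kernel/spawn-gate.js');

// Advisory use: log a failed gate but never block the caller.
// The check() method name is an assumption for illustration.
function advisorySpawnCheck(task) {
  if (!spawnGate || typeof spawnGate.check !== 'function') {
    return { checked: false, warning: null };
  }
  try {
    const result = spawnGate.check(task);
    const warning = result && result.passed === false
      ? `spawn gate failed for task ${task.id}`
      : null;
    if (warning) console.warn(`[aaos] ${warning} (advisory, not blocking)`);
    return { checked: true, warning };
  } catch (err) {
    console.warn(`[aaos] spawn gate errored: ${err.message}`);
    return { checked: false, warning: null };
  }
}

console.log(advisorySpawnCheck({ id: 1 }));
```

With this shape, the caller proceeds the same way whether the gate module loaded, failed to load, or threw at runtime.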
Infrastructure Status
| Component | Status | Details |
|---|---|---|
| Docker | ✅ UP | v29.2 |
| Qdrant | ✅ UP | 3 collections (sessions, knowledge, hivemind) on port 6333 |
| Ollama ANVIL | ✅ UP | 12 models on localhost:11434 |
| Ollama FORGE | ✅ UP | 7 models on 10.0.0.2:11434 |
| Tool Shed | ✅ UP | 240 tools on port 3050 |
| HiveMind | ✅ UP | 25,309 entries, keyword search working |
| Hooks Binary | ✅ UP | 15.7MB arm64, 4 blocking + 1 advisory gate |
Enforcement Configuration
File: ~/.claude/hooks/config/enforcement.json
| Hook | ZAKON | Mode |
|---|---|---|
| HopBuild | #5 | BLOCKING |
| RAG-First | #12 | BLOCKING |
| QA-19 | #14 | BLOCKING |
| Evidence | #21 | BLOCKING |
| Agent Testing | #20 | ADVISORY (promote to blocking after 2 weeks) |
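The difference between BLOCKING and ADVISORY can be sketched as a decision function over failed hooks. The object below mirrors the table, but its shape is an assumption: the real enforcement.json layout is not shown in this document, and the alai-hooks binary may evaluate modes differently.

```javascript
// Hypothetical shape mirroring the table above; not the real enforcement.json.
const ENFORCEMENT = {
  HopBuild: { zakon: 5, mode: 'BLOCKING' },
  'RAG-First': { zakon: 12, mode: 'BLOCKING' },
  'QA-19': { zakon: 14, mode: 'BLOCKING' },
  Evidence: { zakon: 21, mode: 'BLOCKING' },
  'Agent Testing': { zakon: 20, mode: 'ADVISORY' },
};

// Any BLOCKING failure stops the action; ADVISORY failures only produce warnings.
function decide(failedHooks) {
  const blocking = [];
  const warnings = [];
  for (const hook of failedHooks) {
    const rule = ENFORCEMENT[hook];
    if (!rule) throw new Error(`Unknown hook: ${hook}`);
    const label = `${hook} (ZAKON #${rule.zakon})`;
    (rule.mode === 'BLOCKING' ? blocking : warnings).push(label);
  }
  return { allowed: blocking.length === 0, blocking, warnings };
}

const advisoryOnly = decide(['Agent Testing']);
const mixed = decide(['Agent Testing', 'Evidence']);
console.log(advisoryOnly, mixed);
```

Under this model, promoting Agent Testing to blocking would be a one-field change to its `mode`.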
File Map
New Files (created 2026-04-02)
~/system/kernel/library.js — Library-in-the-Middle (283 lines)
~/system/kernel/spawn-gate.js — SPAWN GATE enforcement
~/system/kernel/claim-gate.js — CLAIM GATE enforcement
~/system/config/team-templates.json — Team composition rules (6 types)
~/system/specs/aaos-architecture.md — Full architecture spec (1060 lines)
~/system/agents/definitions/hadi-hariri.md + .yaml — Kotlin/Ktor specialist
~/system/agents/definitions/lee-robinson.md + .yaml — Next.js 15 specialist
~/system/agents/definitions/james-bach.md + .yaml — Exploratory tester
~/system/agents/definitions/lisa-crispin.md + .yaml — Agile tester
~/system/agents/definitions/dorota-huizinga.md + .yaml — Performance tester
~/system/agents/identities/{hadi,lee,james,lisa,dorota}-*.md — Full identities
Modified Files
~/system/tools/mc.js — CLOSE GATE metrics recording in done handler
~/system/kernel/pi-orchestrator.js — AAOS wiring (spawn-gate + library context)
~/system/agents/specialist-mapping.json — 5 new agents (total: 22)
~/system/databases/mission-control.db — 3 new tables
Metrics & Learning Loop
Every task completion records to agent_metrics:
- Agent ID, task ID, model used
- Duration (seconds from mc.js start to done)
- QA-19 score (if available)
- Evidence count (files in /tmp/evidence-{id}/)
- Trust level (L0-L4, based on evidence presence and force flag)
Every non-forced completion also posts a learning entry to HiveMind (knowledge type).
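A minimal sketch of assembling such a record follows, using column names from the agent_metrics schema. The function names and input shape are assumptions, and the trust-level derivation is deliberately left out because its exact mapping from evidence and the force flag is not specified here.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Count regular files in an evidence directory; a missing directory counts as zero.
function countEvidence(dir) {
  try {
    return fs.readdirSync(dir, { withFileTypes: true }).filter((e) => e.isFile()).length;
  } catch (err) {
    if (err.code === 'ENOENT') return 0;
    throw err;
  }
}

// Build an agent_metrics row; timestamps are milliseconds since the epoch.
function buildMetricsRow({ agentId, taskId, modelUsed, startedAt, finishedAt, qaScore = null, evidenceDir }) {
  return {
    agent_id: agentId,
    task_id: taskId,
    model_used: modelUsed,
    duration_seconds: Math.round((finishedAt - startedAt) / 1000),
    qa_score: qaScore, // null when no QA-19 score is available
    evidence_count: countEvidence(evidenceDir),
  };
}

// Usage with a throwaway directory standing in for /tmp/evidence-{id}/.
const evidenceDir = fs.mkdtempSync(path.join(os.tmpdir(), 'evidence-'));
fs.writeFileSync(path.join(evidenceDir, 'exit-code.txt'), '0\n');
fs.writeFileSync(path.join(evidenceDir, 'response.json'), '{"status":200}\n');

const row = buildMetricsRow({
  agentId: 'bruce-momjian',
  taskId: 6921,
  modelUsed: 'sonnet',
  startedAt: Date.parse('2026-04-02T10:00:00Z'),
  finishedAt: Date.parse('2026-04-02T10:01:30Z'),
  evidenceDir,
});
console.log(row);
```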
Success Criteria
- Zero agents complete a task without RAG preloading (measured by SPAWN GATE rejection count)
- Zero L0/L1 claims reach Alem (measured by CLAIM GATE + CEO-reported false claims)
- Every H-priority task has 3+ testers (measured by team_composition table)
- Agent quality improves over time (measured by avg QA-19 score per agent, monthly)
- Token efficiency improves (measured by qa_score / token_count ratio, monthly)
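The last two criteria could be tracked with a monthly aggregation over agent_metrics rows. The sketch below assumes the rows have already been read into plain objects and that created_at uses SQLite's default 'YYYY-MM-DD HH:MM:SS' format. It computes efficiency as total QA score per 1,000 tokens, which is one possible reading of the qa_score / token_count ratio; the sample rows are made up for illustration.

```javascript
// Group rows by agent and month, then compute the average QA-19 score and
// token efficiency (sum of qa_score per 1,000 tokens consumed).
function monthlyStats(rows) {
  const groups = new Map();
  for (const row of rows) {
    if (row.qa_score == null || !row.token_count) continue; // skip incomplete rows
    const key = `${row.agent_id}|${row.created_at.slice(0, 7)}`; // 'YYYY-MM'
    const g = groups.get(key) ?? { tasks: 0, qaSum: 0, tokenSum: 0 };
    g.tasks += 1;
    g.qaSum += row.qa_score;
    g.tokenSum += row.token_count;
    groups.set(key, g);
  }
  return [...groups].map(([key, g]) => {
    const [agentId, month] = key.split('|');
    return {
      agentId,
      month,
      tasks: g.tasks,
      avgQa: g.qaSum / g.tasks,
      qaPer1kTokens: (g.qaSum * 1000) / g.tokenSum,
    };
  });
}

// Made-up sample rows, for illustration only.
const stats = monthlyStats([
  { agent_id: 'bruce-momjian', qa_score: 16, token_count: 8000, created_at: '2026-04-02 10:00:00' },
  { agent_id: 'bruce-momjian', qa_score: 18, token_count: 12000, created_at: '2026-04-15 09:30:00' },
  { agent_id: 'bruce-momjian', qa_score: null, token_count: 5000, created_at: '2026-04-20 14:00:00' },
]);
console.log(stats);
```

Comparing `avgQa` and `qaPer1kTokens` across consecutive months for the same agent gives the trend that the criteria above call for.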