# Operations Guide

Last Verified: 2026-02-17 | Owner: John

# Operations Manual

- Version: 1.0
- Last Updated: 2026-01-28
- Owner: Alem Basic
- Prepared by: John (Director) + Nermin Šabić (DevOps) + Amina Hadžić (Head of Projects)

## Executive Summary

This is the operational playbook. Daily/weekly/monthly routines, KPIs, reporting structure, tools, systems, and disaster recovery. Everything needed to run the organization day-to-day.

Purpose: Ensure operations run smoothly whether Alem is available or not. John + team can operate independently using this manual.

## 1. Daily Operations

### 1.1 Daily Routine — John (Director)

Every Morning (Oslo Time Zone):

    08:00 — Wake up (boot)
    ├─ Read MEMORY.md (context refresh)
    ├─ Check john.db for overnight tasks
    ├─ Check Telegram for Alem messages
    └─ Check email (john@basicconsulting.no) for urgent matters

    08:30 — Prepare morning brief
    ├─ Tasks completed yesterday
    ├─ Planned for today
    ├─ Blockers
    ├─ Metrics snapshot (revenue, uptime, trading)
    └─ Send to Alem via Telegram (if significant updates)

    09:00 — Monitor task queue
    ├─ ~/clawd/tasks/pending/ → assign to agents
    ├─ ~/clawd/tasks/in-progress/ → check progress
    └─ ~/clawd/tasks/completed/ → archive

    09:15 — Daily standup (all team, 15 min)

    09:30-18:00 — Execution mode
    ├─ Delegate tasks to agents
    ├─ Monitor progress
    ├─ Escalate blockers within 4h
    ├─ Respond to Telegram within 5 min
    ├─ Check trading positions (every 3 hours: 12:00, 15:00, 18:00)
    └─ Log all decisions to john.db

    18:00 — Evening wrap-up
    ├─ Archive completed tasks
    ├─ Sync DB to GitHub (automatic, verify)
    ├─ Prepare tomorrow's priorities
    └─ Send summary to Alem (if needed)

24/7 Monitoring:

- Trading: Every 3 hours (cron job)
- Infrastructure: Continuous (Datadog + PagerDuty)
- Task queue: Every 30 seconds (john-daemon.sh)
- Telegram: Within 5 minutes

### 1.2 Daily Routine — Team

9:15 AM CET — Daily Standup (Mon-Fri, 15 min max)

Attendees: All team (Emir leads)

Format: Each person: 3 questions, 1 min each

- What did I do yesterday?
- What will I do today?
- Any blockers?

Emir's checklist:

- Update sprint board before standup
- Note blockers → escalate after standup
- Update burn-down chart

Throughout Day:

- Agents execute assigned tasks
- Update Jira/Linear as tasks progress
- Escalate blockers within 1 hour
- Code review within 24 hours
- Respond to Slack/messages within 4 hours (business hours)

End of Day:

- Update task status
- Log any decisions or learnings
- Prepare for tomorrow's standup

## 2. Weekly Operations

### 2.1 Weekly Cadence

| Day | Time | Event | Owner | Duration |
|---|---|---|---|---|
| Monday | 9:15 AM | Daily standup | Emir | 15 min |
| Monday | 10:00 AM | Sprint planning (every 2 weeks) | Emir + Amina | 2-3 hours |
| Tuesday | 9:15 AM | Daily standup | Emir | 15 min |
| Wednesday | 9:15 AM | Daily standup | Emir | 15 min |
| Wednesday | 14:00 CET | Backlog refinement | Emir + Lejla + Selma | 1 hour |
| Thursday | 9:15 AM | Daily standup | Emir | 15 min |
| Thursday | 14:00 CET | Architecture review (bi-weekly) | Lejla | 1-2 hours |
| Friday | 9:15 AM | Daily standup | Emir | 15 min |
| Friday | 15:00 CET | Sprint review + retro (every 2 weeks) | Emir + team | 2 hours |
| Friday | 16:00 CET | Weekly wrap-up (John → Alem) | John | 30 min |

### 2.2 Weekly Reporting (John → Alem)

Every Friday, 16:00 CET:

Weekly Summary (Telegram or Email):

    # Week of [Date]

    ## Shipped This Week
    - [Feature/bug 1]
    - [Feature/bug 2]

    ## Metrics
    - Revenue: €X (+/- Y% from last week)
    - Customers: X (+/- Y new/churn)
    - Uptime: 99.X%
    - Trading ROI: +/-X%

    ## Blockers / Risks
    - [Blocker 1 — mitigation plan]
    - [Risk 1 — action taken]

    ## Next Week Priorities
    - [Priority 1]
    - [Priority 2]

    ## Decisions Made (>€500)
    - [Decision 1 — €X — rationale]

Alem reviews and provides feedback.

## 3. Monthly Operations

### 3.1 Monthly Business Review (MBR)

Last Friday of Every Month, 14:00-16:00 CET

Attendees: Alem, John, Amina

Agenda:

1. Revenue & Growth (10 min)
   - MRR, new customers, churn, LTV/CAC
   - Forecast: next 3 months
2. Product & Development (10 min)
   - Features shipped this month
   - Sprint velocity trend
   - Tech debt status
3. Operations (10 min)
   - Uptime, incidents, MTTR
   - Support metrics (tickets, CSAT)
4. Trading (5 min)
   - Monthly P&L
   - ROI, Sharpe ratio
   - Portfolio allocation
5. Risks & Compliance (10 min)
   - Top 5 risks, status
   - Compliance updates (HIPAA, audits)
6. Team & People (5 min)
   - Agent utilization
   - Process improvements
   - Hiring needs (if any)
7. Next Month Priorities (10 min)
   - What are we focusing on?
   - Resource allocation

Output:

- Updated priorities for next month
- Budget adjustments (if needed)
- Action items

### 3.2 Monthly Reporting — Financial

By 5th of Each Month: John prepares financial report:

| Metric | Current Month | Last Month | Change |
|---|---|---|---|
| Revenue (Fast Constructions) | €X | €Y | +/- Z% |
| Expenses (total) | €X | €Y | +/- Z% |
| ├─ Infrastructure | €X | €Y | +/- Z% |
| ├─ SaaS tools | €X | €Y | +/- Z% |
| ├─ Marketing | €X | €Y | +/- Z% |
| ├─ Development (SnowIT payment) | €X | €Y | +/- Z% |
| ├─ Professional services | €X | €Y | +/- Z% |
| ├─ Insurance | €X | €Y | +/- Z% |
| Net Profit | €X | €Y | +/- Z% |
| Charity (50%) | €X (accrued) | — | — |
| Burn Rate | €X/month | — | — |
| Runway | X months | — | — |

Sent to: Alem + accountant (if hired)

### 3.3 Monthly Tasks Checklist

John's Monthly Checklist:

- Financial report prepared (by 5th)
- Monthly Business Review held (last Friday)
- Risk register reviewed (Dženan)
- Compliance checklist reviewed (Dženan)
- Infrastructure cost optimization (Nermin)
- Trading performance report (Nick)
- Customer feedback analysis (Selma)
- Sprint velocity analysis (Emir)
- Tech debt review (Lejla + Tarik)
- Backup verification (Nermin — test restore)
- Security scan (Tarik — OWASP ZAP)
- Vendor BAA review (Dženan)
- Database cleanup (old logs, expired data)
- Update MEMORY.md (add learnings from month)

## 4. Quarterly Operations

### 4.1 Quarterly Planning

Last Week of Quarter (March, June, September, December)

Attendees: Alem, John, Amina, Lejla, Selma

Agenda:

1. Review last quarter (goals, achievements, misses)
2. Market & competitive analysis (Selma)
3. Product roadmap update (Amina + Lejla)
4. Strategic priorities for next quarter (Alem)
5. Budget allocation
6. Hiring plan (if needed)
7. Risk review (Dženan)

Output:

- OKRs (Objectives & Key Results) for next quarter
- Budget approved
- Roadmap locked for next 3 months

### 4.2 Quarterly Tasks

- Tech Debt Sprint — one full sprint dedicated to refactoring, testing, documentation (Lejla)
- Compliance audit — internal HIPAA audit (Dženan + Tarik)
- Infrastructure review — cost, performance, scaling plan (Nermin)
- Customer satisfaction survey — NPS survey to all customers (Selma)
- Competitive analysis — review competitors, market trends (Selma)
- Patent progress review — if filing in progress (Dženan)
- Insurance review — renew or update policies (Dženan)
- Document review — update all org docs (MEMORY.md, ORGANIZATION.md, this doc)

## 5. Annual Operations

### 5.1 Annual Planning

December (for next year)

Agenda:

1. Review full year (revenue, customers, product, team)
2. Set annual vision and goals (Alem)
3. Product roadmap (12 months)
4. Budget (annual)
5. Hiring plan
6. Strategic initiatives (new products, markets, partnerships)
7. Charity commitment (allocate 50% of profit)

Output:

- Annual OKRs
- Annual budget
- 12-month roadmap

### 5.2 Annual Tasks

- Annual financial statements — P&L, balance sheet, cash flow (accountant)
- Tax filings — US (Fast Constructions), BiH (SnowIT), Norway (Alem personal)
- SOC 2 Type II audit — external audit (Dženan + auditor)
- HIPAA risk assessment — annual requirement (Dženan)
- Insurance renewal — cyber liability, E&O, general liability (Dženan)
- Charitable giving — donate 50% of net profit (Alem selects charities)
- Transparency report — publish charity donations on lumiscare.com/impact
- Team performance reviews — if real humans hired (Amina)
- Document archive — backup all docs, contracts, decisions to secure storage

## 6. Local Infrastructure (Mac Studio)

### 6.1 Services Overview

All services run locally on Mac Studio M3 Ultra (96GB RAM). Zero cloud dependency for operations.
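A plain TCP probe is one quick way to confirm that each local service is accepting connections. The sketch below is illustrative only and is not the project's health-check.js. It assumes every service listens on 127.0.0.1 at the port given in the services table of this section, and the names `SERVICES`, `probe`, and `check_all` are hypothetical.

```python
import socket

# Ports as listed in the services table of this section.
# The host (127.0.0.1) is an assumption.
SERVICES = {
    "Mattermost": 8065,
    "Planka": 3100,
    "Documenso": 3003,
    "BookStack": 6875,
    "MC Dashboard": 3030,
    "Ollama": 11434,
}


def probe(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def check_all(services=SERVICES, host="127.0.0.1"):
    """Map each service name to 'up' or 'down' based on a TCP probe."""
    return {
        name: ("up" if probe(port, host) else "down")
        for name, port in services.items()
    }


if __name__ == "__main__":
    for name, status in check_all().items():
        print(f"{name:<14} {status}")
```

An open port only shows that something is listening. It does not show that the application behind it is healthy, so a probe like this complements HTTP-level checks and does not replace them.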
| Service | Type | Port | Purpose |
|---|---|---|---|
| Mattermost | Docker | 8065 | Team chat (4 teams: basic, wizard, rendrom, riad) |
| Planka | Docker | 3100 | Kanban boards (boards.basicconsulting.no) |
| Documenso | Docker | 3003 | Document signing (sign.basicconsulting.no) |
| BookStack | Docker | 6875 | Wiki/documentation |
| MC Dashboard | Node.js | 3030 | Mission Control web UI (task management) |
| Ollama | Native | 11434 | Local AI (8b classify, 32b respond/code) |

External access: Cloudflare tunnels (mm/boards/sign.basicconsulting.no)

### 6.2 Daemons (LaunchAgents)

| Daemon | Interval | Purpose |
|---|---|---|
| com.john.ops-agent | 5 min | Autonomous ops — MM monitoring, health checks, auto-fix, task creation, intelligent responses |
| com.edita.autowork | 30 min | Background task worker (Claude haiku) |
| com.john.mc-dashboard | always | Mission Control web dashboard |
| com.john.mc-session-worker | on events | Session state extraction |

### 6.3 Ops Agent (Autonomous Operations)

Replaces manual MM monitoring. Runs 24/7, $0 cost (local Ollama AI).

What it does:

- Reads all MM messages from 4 teams
- Classifies via Ollama 8b: ROUTINE / TASK / INCIDENT
  - ROUTINE — logs, no action
  - TASK — creates MC task (BILLABLE if client team) + Planka card + MM reply
  - INCIDENT — HIGH priority task + escalation to John
- Runs health checks on all 20 services every cycle
- Auto-fixes known issues (restart, cleanup) with safety limits (max 3/hour)

Runbook: ~/system/context/docs/runbooks/ops-agent.md

### 6.4 Monitoring Stack

| Tool | What It Monitors | Action |
|---|---|---|
| health-check.js | Docker (8), HTTP (6), system (2), daemons (4) | Status report |
| auto-fix.js | Service failures | Automated restart/cleanup (max 3/hour) |
| ops-agent.js | MM messages + health | Classify, respond, create tasks, fix |
| smoke-test.js | Integration tests | Pre/post deployment verification |

Dashboard: http://localhost:3030 (MC — tasks, stats, mobile-friendly)

## 7. Key Performance Indicators (KPIs) {#kpis}

### 7.1 KPI Dashboard (Live, Updated Daily)

Owner: John (maintains dashboard)

Tool: Notion, Grafana, or Google Sheets

Metrics:

Business Metrics

| Metric | Current | Target | Trend | Owner |
|---|---|---|---|---|
| MRR (Monthly Recurring Revenue) | €X | 10%+ MoM growth | ↗️ | Selma |
| Customer Count | X | 10 (Month 6), 50 (Month 12) | ↗️ | Selma |
| Churn Rate | X% | < 5% monthly | ↘️ | Selma |
| Customer Acquisition Cost (CAC) | €X | < €500 | ↘️ | Selma |
| Lifetime Value (LTV) | €X | > €2,000 | ↗️ | Selma |
| LTV/CAC Ratio | X:1 | > 3:1 | ↗️ | Selma |

Technical Metrics

| Metric | Current | Target | Trend | Owner |
|---|---|---|---|---|
| Uptime | 99.X% | 99.9% (LumisCare) | ↗️ | Nermin |
| API Latency (p95) | Xms | < 500ms | ↘️ | Nermin |
| Page Load Time | Xs | < 2s | ↘️ | Frontend |
| Deployment Frequency | X/week | Daily (staging), weekly (prod) | ↗️ | Nermin |
| Mean Time to Recovery (MTTR) | Xh | < 4h | ↘️ | Nermin + Lejla |
| Bug Escape Rate | X% | < 5% | ↘️ | Tarik |
| Test Coverage | X% | ≥ 80% | ↗️ | Tarik |

Operational Metrics

| Metric | Current | Target | Trend | Owner |
|---|---|---|---|---|
| Sprint Velocity | X pts | Consistent ±10% | → | Emir |
| Task Completion Rate | X% | ≥ 95% | ↗️ | John |
| Support Response Time | Xmin | < 30min (Tier 1) | ↘️ | Selma |
| Customer Satisfaction (CSAT) | X/5 | ≥ 4.5/5 | ↗️ | Selma |
| Agent Utilization | X% | ≥ 70% billable | ↗️ | Amina |

Trading Metrics

| Metric | Current | Target | Trend | Owner |
|---|---|---|---|---|
| Monthly ROI | X% | ≥ 5% | ↗️ | Nick |
| Sharpe Ratio | X | > 1.5 | ↗️ | Nick |
| Portfolio Value | €X | €10,000 (current) | ↗️ | Nick |
| Stop-Loss Adherence | X% | 100% | → | Nick |

### 7.2 Monitoring & Alerting

Tools:

- Datadog: Infrastructure, APM, logs
- PagerDuty: On-call rotation, incident alerts
- Stripe Dashboard: Revenue, subscriptions
- Binance API: Trading positions, P&L
- john.db: All decisions, tasks, logs

Alert Configuration:

| Alert | Threshold | Action | Owner |
|---|---|---|---|
| Uptime < 99.9% | Downtime detected | PagerDuty → Nermin | Nermin |
| API latency > 1s (p95) | Sustained 5 min | Slack alert → Lejla | Lejla |
| Error rate > 1% | Sustained 5 min | PagerDuty → Nermin | Nermin |
| Database CPU > 80% | Sustained 10 min | Slack alert → Nermin | Nermin |
| Trading loss > 5% | Single position | Telegram → John → Nick | Nick |
| Customer churn > 10% | Monthly | Email → Selma, Amina | Selma |
| Support ticket SLA breach | > 4h unresolved | Slack alert → Selma | Selma |

On-Call Rotation:

- Primary: Nermin (DevOps)
- Secondary: Lejla (Tech Lead)
- Escalation: John → Alem

On-Call SLA:

- Acknowledge alert: 15 min
- Begin investigation: 30 min
- Escalate if not resolved: 2 hours (P2), 4 hours (P1)

## 8. Tools & Systems

### 8.1 Core Systems

| System | Purpose | Access | Owner |
|---|---|---|---|
| john.db (SQLite) | Source of truth, all decisions logged | John, Alem (query) | John |
| GitHub | Code repository, version control | All devs | Lejla |
| AWS | Infrastructure (ECS, RDS, S3, CloudFront) | Nermin, Lejla | Nermin |
| Stripe | Payment processing, subscriptions | Selma, Amina, John | Selma |
| Binance | Trading | Nick, John | Nick |
| Jira / Linear | Task tracking, sprint management | All team | Emir |
| Datadog | Monitoring, APM, logs | Nermin, Lejla, John | Nermin |
| PagerDuty | On-call, incident alerts | Nermin, Lejla | Nermin |
| Intercom / Crisp | Customer support chat | Selma, Tarik | Selma |
| Telegram (@johnbasicas_bot) | Quick communication (Alem ↔ John) | Alem, John | John |

### 8.2 Tool Stack (Detailed)

Development

| Tool | Purpose | Cost | Owner |
|---|---|---|---|
| GitHub | Code repo, PRs, CI/CD | Free (public repos) | Lejla |
| GitHub Actions | CI/CD pipelines | Included | Nermin |
| VSCode / Cursor | IDE | Free | All devs |
| Playwright | E2E testing | Free | Tarik |
| Jest | Unit testing | Free | Tarik |
| ESLint / Prettier | Linting, formatting | Free | Lejla |

Infrastructure

| Tool | Purpose | Cost | Owner |
|---|---|---|---|
| AWS ECS/EKS | Container orchestration | ~€500-2,000/mo | Nermin |
| AWS RDS (PostgreSQL) | Database | ~€100-500/mo | Nermin |
| AWS S3 | File storage | ~€50-200/mo | Nermin |
| CloudFront | CDN | ~€50-200/mo | Nermin |
| Route53 | DNS | ~€10/mo | Nermin |
| Datadog | Monitoring, APM | ~€100-500/mo | Nermin |
| PagerDuty | On-call | ~€50-100/mo | Nermin |

Business & Operations

| Tool | Purpose | Cost | Owner |
|---|---|---|---|
| Stripe | Payments | 2.9% + €0.30/transaction | Selma |
| Intercom / Crisp | Support chat | ~€50-200/mo | Selma |
| Jira / Linear | Project management | ~€50-100/mo | Emir |
| Notion | Documentation, wiki | Free tier or ~€10/mo | John |
| Apollo.io | Sales outreach | ~€100/mo | Selma |
| LinkedIn Sales Navigator | Sales prospecting | ~€80/mo | Selma |

Communication

| Tool | Purpose | Cost | Owner |
|---|---|---|---|
| Telegram | Alem ↔ John quick chat | Free | John |
| Email (one.com) | External communication | ~€10/mo | John |
| Slack (future) | Team collaboration | Free tier or ~€50/mo | John |

Total Estimated Tool Cost: €1,000-3,000/month (scales with usage)

## 9. Disaster Recovery & Business Continuity

### 9.1 Backup Strategy

What We Back Up:

| Data | Backup Method | Frequency | Retention | Location |
|---|---|---|---|---|
| john.db (SQLite) | GitHub sync | Hourly | Indefinitely | github.com/johnatbasicas/clawd |
| Source code | GitHub | Every commit | Indefinitely | github.com/johnatbasicas |
| Database (PostgreSQL) | AWS RDS automated backups | Daily | 30 days | AWS S3 |
| File storage (S3) | Cross-region replication | Real-time | 90 days | AWS S3 (us-west-2) |
| Audit logs | S3 + Glacier | Daily | 6 years (HIPAA) | AWS Glacier |
| Configuration files | GitHub | Every change | Indefinitely | GitHub (encrypted secrets) |

Backup Verification: Nermin tests restore monthly.

### 9.2 Disaster Scenarios & Response

Scenario 1: AWS Region Failure

Impact: LumisCare production down

Response:

1. Nermin deploys to secondary region (us-west-2) — automated failover
2. DNS updated to point to secondary (Route53 health checks)
3. Restore database from latest backup
4. Verify functionality
5. Notify customers (status page)

RTO (Recovery Time Objective): 2 hours

RPO (Recovery Point Objective): 24 hours (last daily backup)

Scenario 2: Data Breach (HIPAA)

Impact: PHI exposed

Response:

1. Dženan activates breach response plan (see GOVERNANCE.md, section 8.3)
2. Contain breach (Nermin)
3. Assess impact (Lejla + Dženan)
4. Notify customers (within 60 days per HIPAA)
5. Notify HHS (if >500 individuals)
6. Remediate + post-mortem

Timeline: Immediate containment, notification within 60 days

Scenario 3: Key Person Unavailable (Alem, John, Lejla, Nermin)

Alem unavailable:

- John continues operations (delegated authority)
- Strategic decisions deferred or escalated to Asmir (SnowIT partner)
- If prolonged: Appoint interim CEO or sell

John unavailable:

- Task queue still processed (daemon runs 24/7)
- Amina coordinates team
- Alem steps in for strategic decisions
- John can be "rebooted" from MEMORY.md + ORGANIZATION.md

Lejla unavailable:

- API Developer + Frontend Specialist continue development
- Architecture decisions deferred or escalated to Amina → Alem
- Code reviews by 2 other senior devs (if available)

Nermin unavailable:

- Infrastructure on auto-pilot (monitoring, auto-scaling)
- Lejla handles incidents (secondary on-call)
- Engage external DevOps contractor if prolonged

Scenario 4: GitHub Outage

Impact: Can't access code, can't deploy

Response:

1. Local copies of code on all dev machines
2. Deploy from last known good state (Docker images cached)
3. Wait for GitHub to restore (historically < 4 hours)

RTO: 4 hours

RPO: 0 (local copies)

Scenario 5: Stripe Outage

Impact: Can't process payments

Response:

1. Monitor Stripe status page
2. Notify customers (if prolonged)
3. Wait for Stripe to restore
4. No immediate action (payments retry automatically)

RTO: Dependent on Stripe

RPO: 0 (Stripe handles retries)

### 9.3 Runbooks

Location: ~/clawd/org/runbooks/

Runbooks Maintained by Nermin:

- runbook-db-restore.md — How to restore PostgreSQL from backup
- runbook-deploy-rollback.md — How to rollback production deployment
- runbook-region-failover.md — How to failover to secondary AWS region
- runbook-security-incident.md — How to respond to security breach
- runbook-scaling.md — How to manually scale infrastructure
- runbook-certificate-renewal.md — How to renew TLS certificates

Each runbook includes:

- When to use it
- Step-by-step instructions
- Commands to run
- Expected output
- Rollback procedure
- Contact information (who to escalate to)

Review Cadence: Quarterly (test at least one runbook per quarter)

## 10. Communication Channels & Etiquette

See PROCESSES.md, Section 4 for full communication protocols.

Quick Reference:

| Channel | Use Case | Response SLA |
|---|---|---|
| Telegram | Urgent, quick decisions | 5-15 min |
| CLI (Claude Code) | Deep work, architecture | Real-time |
| Email | External, formal | 4-24 hours |
| Slack (future) | Team collaboration | 1-4 hours |
| Jira/Linear | Task tracking | Daily check |
| GitHub | Code, PRs | 24 hours |
| Standup | Daily status | 9:15 AM CET |

## 11. Document Control

| Version | Date | Changes | Author |
|---|---|---|---|
| 1.0 | 2026-01-28 | Initial document | John + Nermin + Amina |

Next Review: 2026-04-01 (quarterly)

Owner: Alem Basic

Maintained By: John (Director) + Nermin Šabić (DevOps) + Amina Hadžić (Head of Projects)

End of Operations Manual

Run the organization like a machine. Predictable, reliable, scalable. Daily, weekly, monthly, quarterly routines. KPIs tracked. Backups verified. Disasters planned for. Just execute.