Operational Runbook
Project: Drop
Version: 0.1.0
Date: 2026-02-23
Author: Platform Architect (AI)
Status: In Review
Reviewers: Alem Bašić (CEO)
Document History
| Version | Date | Author | Changes |
|---|---|---|---|
| 0.1 | 2026-02-23 | Platform Architect (AI) | Initial draft covering day-to-day Drop operations |
1. Overview
This runbook covers day-to-day operations of Drop's production environment. Drop runs on AWS App Runner (eu-west-1) with RDS PostgreSQL.
Primary operations contact: Alem Bašić — [email protected] / +47 40 47 42 51
AI Operations: John (AI Director) — Slack #drop-alerts
2. Quick Reference
Production Infrastructure
| Component | Identifier |
|---|---|
| App Runner service | arn:aws:apprunner:eu-west-1:324480209768:service/drop-web/8e45b0d335304487a1880f4e32d6aeec |
| App Runner URL | https://9ef3szvvsb.eu-west-1.awsapprunner.com |
| RDS instance | drop-db |
| RDS endpoint | drop-db.czu2qe4quy4v.eu-west-1.rds.amazonaws.com:5432 |
| ECR repository | 324480209768.dkr.ecr.eu-west-1.amazonaws.com/drop-web |
| Staging | https://drop-staging.fly.dev |
| Status page | https://drop-status.betteruptime.com |
| Slack alerts | #drop-ops on alai-talk.slack.com |
Quick Health Check
# Application health (production)
curl -s https://getdrop.no/api/health | jq
# App Runner status
aws apprunner describe-service \
--service-arn arn:aws:apprunner:eu-west-1:324480209768:service/drop-web/8e45b0d335304487a1880f4e32d6aeec \
--query 'Service.Status' --output text --region eu-west-1
# RDS status
aws rds describe-db-instances \
--db-instance-identifier drop-db \
--query 'DBInstances[0].DBInstanceStatus' --output text --region eu-west-1
# Live App Runner logs
aws logs tail /aws/apprunner/drop-web/8e45b0d335304487a1880f4e32d6aeec/application \
--follow --region eu-west-1
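The three status checks above can be folded into one pass/fail summary. The following is a minimal sketch, not existing Drop tooling: `report_check` and `quick_health_check` are hypothetical helper names, and treating HTTP 200 as the healthy response of `/api/health` is an assumption.

```shell
# Sketch only: report_check and quick_health_check are hypothetical helpers.

# usage: report_check NAME EXPECTED ACTUAL
# Prints one line per component and returns 1 on mismatch.
report_check() {
  if [ "$3" = "$2" ]; then
    printf 'OK   %s\n' "$1"
  else
    printf 'FAIL %s (expected %s, got %s)\n' "$1" "$2" "${3:-no response}"
    return 1
  fi
}

# Runs the quick-reference checks and returns non-zero if any of them fails.
quick_health_check() {
  local arn='arn:aws:apprunner:eu-west-1:324480209768:service/drop-web/8e45b0d335304487a1880f4e32d6aeec'
  local failed=0

  report_check app-runner RUNNING "$(aws apprunner describe-service \
    --service-arn "$arn" --query 'Service.Status' --output text \
    --region eu-west-1 2>/dev/null)" || failed=1

  report_check rds available "$(aws rds describe-db-instances \
    --db-instance-identifier drop-db \
    --query 'DBInstances[0].DBInstanceStatus' --output text \
    --region eu-west-1 2>/dev/null)" || failed=1

  # Assumption: a healthy application answers /api/health with HTTP 200.
  report_check api-health 200 "$(curl -s -o /dev/null -w '%{http_code}' \
    https://getdrop.no/api/health)" || failed=1

  return "$failed"
}
```

Running `quick_health_check` after sourcing the file prints one line per component; a non-zero exit code means at least one component needs a closer look.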
3. Routine Operations
3.1 Daily Checks
- BetterStack: all 3 monitors green (health, landing, US east)
- Slack #drop-ops: no unresolved critical alerts from the last 24h
- App Runner service status: RUNNING
- RDS snapshot from last night: exists and is < 24h old
# Verify last RDS snapshot
aws rds describe-db-snapshots \
--db-instance-identifier drop-db --region eu-west-1 \
--query 'DBSnapshots[?SnapshotType==`automated`]|sort_by(@,&SnapshotCreateTime)[-1].{id:DBSnapshotIdentifier,time:SnapshotCreateTime}' \
--output table
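The "< 24h old" condition can be checked mechanically rather than by eye. The sketch below assumes GNU `date` (as used elsewhere in this runbook); `snapshot_age_hours` and `snapshot_is_fresh` are hypothetical helper names.

```shell
# Sketch only: hypothetical helpers for the daily snapshot check.

# usage: snapshot_age_hours TIMESTAMP [NOW_EPOCH]
# Prints the whole number of hours between TIMESTAMP and now.
snapshot_age_hours() {
  local created_epoch now_epoch
  created_epoch=$(date -u -d "$1" +%s) || return 2
  now_epoch="${2:-$(date -u +%s)}"
  echo $(( (now_epoch - created_epoch) / 3600 ))
}

# usage: snapshot_is_fresh TIMESTAMP [NOW_EPOCH]
# Succeeds when the snapshot is less than 24 hours old.
snapshot_is_fresh() {
  local age
  age=$(snapshot_age_hours "$1" "${2:-}") || return 2
  [ "$age" -lt 24 ]
}
```

To use it, capture the timestamp by changing the query above to select only `SnapshotCreateTime` with `--output text`, then pass the result to `snapshot_is_fresh`.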
3.2 Weekly Checks
- Review CloudWatch logs for recurring error patterns
- Check RDS free storage space (alert if < 2GB)
- Review AML alerts table for any open cases
- Review pending KYC applicants (stuck in pending status > 24h)
- Check ECR: clean up untagged images manually if the lifecycle policy hasn't run
# Check RDS storage
aws cloudwatch get-metric-statistics \
--namespace AWS/RDS \
--metric-name FreeStorageSpace \
--dimensions Name=DBInstanceIdentifier,Value=drop-db \
--start-time $(date -u -d '1 hour ago' --iso-8601=seconds) \
--end-time $(date -u --iso-8601=seconds) \
--period 3600 \
--statistics Average \
--region eu-west-1
# Check pending KYC (connect to RDS first via bastion or VPN)
psql -h drop-db.czu2qe4quy4v.eu-west-1.rds.amazonaws.com -U dropuser -d dropapp \
-c "SELECT id, email, kyc_status, created_at FROM users WHERE kyc_status = 'pending' ORDER BY created_at ASC;"
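CloudWatch reports `FreeStorageSpace` in bytes, so the 2GB threshold from the weekly checklist needs a unit conversion. A minimal sketch, assuming the threshold means 2 GiB; `free_storage_low` and `bytes_to_gb` are hypothetical helper names.

```shell
# Sketch only: hypothetical helpers for the weekly storage check.

# usage: free_storage_low BYTES [THRESHOLD_GB]
# Succeeds (exit 0) when free storage is below the threshold (default 2).
# awk is used because the CLI may print the average as a float.
free_storage_low() {
  awk -v bytes="$1" -v gb="${2:-2}" \
    'BEGIN { exit !(bytes + 0 < gb * 1024 * 1024 * 1024) }'
}

# usage: bytes_to_gb BYTES
# Prints the value in GiB with one decimal place.
bytes_to_gb() {
  awk -v bytes="$1" 'BEGIN { printf "%.1f\n", bytes / (1024 * 1024 * 1024) }'
}
```

Appending `--query 'Datapoints[0].Average' --output text` to the `get-metric-statistics` call above yields a single number that can be passed to either helper.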
3.3 Monthly Checks
- Review SLA report (uptime, error rate, p99 latency)
- Test BetterStack alerts (pause monitor → verify escalation fires → resume)
- Verify RDS snapshot restore works (restore to temp instance, verify data, delete)
- Review secret rotation schedule — anything due?
- Review STR reports table — any pending filings?
4. Deployment Procedure
4.1 Standard Deployment (App Runner)
# 1. Ensure all CI checks pass on main branch
# 2. Build and push new Docker image to ECR
docker build -t drop-app .
docker tag drop-app:latest 324480209768.dkr.ecr.eu-west-1.amazonaws.com/drop-web:$(git rev-parse --short HEAD)
aws ecr get-login-password --region eu-west-1 | \
docker login --username AWS --password-stdin 324480209768.dkr.ecr.eu-west-1.amazonaws.com
docker push 324480209768.dkr.ecr.eu-west-1.amazonaws.com/drop-web:$(git rev-parse --short HEAD)
# 3. Create pre-deployment RDS snapshot
aws rds create-db-snapshot \
--db-instance-identifier drop-db \
--db-snapshot-identifier drop-db-pre-deploy-$(date +%Y%m%d-%H%M) \
--region eu-west-1
# 4. Create BetterStack maintenance window (prevents false alerts)
# Go to BetterStack → Maintenance Windows → Create Window (30 min)
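# 5. Trigger App Runner deployment
# Note: start-deployment deploys the image identifier configured on the service.
# If that identifier is not the tag pushed in step 2, update it first (console or CLI).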
aws apprunner start-deployment \
--service-arn arn:aws:apprunner:eu-west-1:324480209768:service/drop-web/8e45b0d335304487a1880f4e32d6aeec \
--region eu-west-1
# 6. Monitor deployment status
aws apprunner describe-service \
--service-arn arn:aws:apprunner:eu-west-1:324480209768:service/drop-web/8e45b0d335304487a1880f4e32d6aeec \
--query 'Service.Status' --output text --region eu-west-1
# Wait for RUNNING
# 7. Verify health
curl -s https://getdrop.no/api/health | jq
# 8. Close BetterStack maintenance window
Typical deployment time: 3–5 minutes
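Step 6 ("Wait for RUNNING") can be automated with a polling loop. The sketch below is one possible approach: `wait_for_running` and `apprunner_status` are hypothetical helper names, and the 10-minute timeout and 15-second interval are arbitrary defaults. `RUNNING` and `OPERATION_IN_PROGRESS` are App Runner service status values.

```shell
# Sketch only: hypothetical polling helper for deployment step 6.

# usage: wait_for_running STATUS_CMD [TIMEOUT_S] [INTERVAL_S]
# STATUS_CMD is the name of a command or function that prints the status.
# Returns 0 on RUNNING, 1 on an unexpected status, 2 on timeout.
wait_for_running() {
  local status_cmd="$1" timeout="${2:-600}" interval="${3:-15}"
  local elapsed=0 status
  while [ "$elapsed" -le "$timeout" ]; do
    status=$("$status_cmd")
    case "$status" in
      RUNNING)
        echo "RUNNING after ${elapsed}s"
        return 0 ;;
      OPERATION_IN_PROGRESS)
        ;;  # deployment still rolling out, keep polling
      *)
        echo "unexpected status: ${status:-none}" >&2
        return 1 ;;
    esac
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "timed out after ${timeout}s" >&2
  return 2
}

# Status source for production, using the same call as step 6.
apprunner_status() {
  aws apprunner describe-service \
    --service-arn arn:aws:apprunner:eu-west-1:324480209768:service/drop-web/8e45b0d335304487a1880f4e32d6aeec \
    --query 'Service.Status' --output text --region eu-west-1
}
```

A typical call is `wait_for_running apprunner_status`. If it is polled immediately after `start-deployment`, the service may still report `RUNNING` from before the rollout began, so a short pause before the first poll is a reasonable precaution.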
4.2 Staging Deployment (Fly.io)
# Deploy to Fly.io staging
cd src/drop-app
fly deploy --app drop-staging
# Verify staging health
curl -s https://drop-staging.fly.dev/api/health | jq
4.3 Emergency Rollback
# Identify the previous ECR image (first tag and digest of the second-most-recent push)
aws ecr describe-images --repository-name drop-web --region eu-west-1 \
--query 'sort_by(imageDetails,&imagePushedAt)[-2].[imageTags[0],imageDigest]' --output text
# Update App Runner to use previous image tag via console,
# then trigger deployment:
aws apprunner start-deployment \
--service-arn arn:aws:apprunner:eu-west-1:324480209768:service/drop-web/8e45b0d335304487a1880f4e32d6aeec \
--region eu-west-1
5. Secret Rotation
5.1 Rotate JWT_SECRET
Impact: all active user sessions are invalidated immediately, so every logged-in user is logged out.
# 1. Generate new secret
NEW_SECRET=$(openssl rand -base64 48)
# 2. Update in AWS Secrets Manager
aws secretsmanager update-secret \
--secret-id drop/production/jwt-secret \
--secret-string "$NEW_SECRET" \
--region eu-west-1
# 3. Update App Runner environment variable (via console or CLI)
# Then trigger new deployment
# 4. Log rotation in audit_log
psql -h drop-db.czu2qe4quy4v.eu-west-1.rds.amazonaws.com -U dropuser -d dropapp \
-c "INSERT INTO audit_log (id, action, resource_type, resource_id, details) VALUES (gen_random_uuid(), 'secret_rotated', 'secret', 'JWT_SECRET', '{\"rotated_at\": \"$(date -u --iso-8601=seconds)\"}');"
5.2 Rotate Database Password
# 1. Generate new password
# (hex output: RDS rejects '/', '@', '"' and spaces in master passwords, and base64 can emit '/')
NEW_PASS=$(openssl rand -hex 32)
# 2. Update RDS master password
aws rds modify-db-instance \
--db-instance-identifier drop-db \
--master-user-password "$NEW_PASS" \
--apply-immediately \
--region eu-west-1
# 3. Update DATABASE_URL in Secrets Manager with new password
# 4. Trigger App Runner redeployment to pick up new DATABASE_URL
# 5. Verify health: curl https://getdrop.no/api/health
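Step 3 leaves the construction of the new `DATABASE_URL` implicit. The sketch below rests on several assumptions: that `DATABASE_URL` is a standard `postgresql://` connection URI, that `dropuser` is the account whose password was rotated, and that the secret lives under `drop/production/database-url` (a hypothetical ID modeled on the `jwt-secret` ID in section 5.1). The helper names are also hypothetical.

```shell
# Sketch only: hypothetical helpers for rotating DATABASE_URL.

# usage: is_url_safe_password PASSWORD
# Succeeds when the password needs no percent-encoding inside a URI.
is_url_safe_password() {
  case "$1" in
    ''|*[!A-Za-z0-9._~-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# usage: build_database_url USER PASSWORD HOST PORT DBNAME
build_database_url() {
  printf 'postgresql://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# usage: update_database_url_secret URL
# Assumption: the secret ID below is illustrative, not confirmed.
update_database_url_secret() {
  aws secretsmanager update-secret \
    --secret-id drop/production/database-url \
    --secret-string "$1" \
    --region eu-west-1
}
```

A password containing characters such as `/` or `@` would have to be percent-encoded before being placed in the URI, which is why `is_url_safe_password` is worth running on `$NEW_PASS` first. A call might then look like `update_database_url_secret "$(build_database_url dropuser "$NEW_PASS" drop-db.czu2qe4quy4v.eu-west-1.rds.amazonaws.com 5432 dropapp)"`.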
6. Database Operations
6.1 Connect to Production Database
Note: RDS must be reachable from your machine via VPN, a bastion host, or AWS Systems Manager Session Manager.
psql -h drop-db.czu2qe4quy4v.eu-west-1.rds.amazonaws.com \
-U dropuser \
-d dropapp \
-c "SELECT 1;"
6.2 User Management Queries
-- Check user KYC status
SELECT id, email, kyc_status, auth_provider, created_at
FROM users WHERE email = '[email protected]';
-- List pending KYC users (> 24h)
SELECT id, email, kyc_status, created_at FROM users
WHERE kyc_status = 'pending'
AND created_at < NOW() - INTERVAL '24 hours'
ORDER BY created_at ASC;
-- Revoke all sessions for a user (emergency)
UPDATE sessions SET revoked = 1
WHERE user_id = 'usr_...' AND revoked = 0;
-- Soft-delete user (GDPR erasure)
UPDATE users SET deleted_at = NOW() WHERE id = 'usr_...';
UPDATE sessions SET revoked = 1 WHERE user_id = 'usr_...';
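The soft-delete consists of two statements that should succeed or fail together. One way to run them is inside a single transaction, with the user ID passed as a psql variable instead of being edited into the SQL by hand. This is a sketch; `erasure_sql` and `erase_user` are hypothetical helper names. psql variable interpolation (`:'user_id'`) applies to scripts read from stdin or a file, not to `-c` strings.

```shell
# Sketch only: hypothetical helpers for the GDPR soft-delete.

# Prints the erasure statements wrapped in one transaction.
# :'user_id' is replaced by psql with a safely quoted literal.
erasure_sql() {
  cat <<'SQL'
BEGIN;
UPDATE users SET deleted_at = NOW() WHERE id = :'user_id';
UPDATE sessions SET revoked = 1 WHERE user_id = :'user_id';
COMMIT;
SQL
}

# usage: erase_user USER_ID
# ON_ERROR_STOP aborts at the first failing statement, so COMMIT is never
# sent and the open transaction is rolled back when the session ends.
erase_user() {
  erasure_sql | psql \
    -h drop-db.czu2qe4quy4v.eu-west-1.rds.amazonaws.com \
    -U dropuser -d dropapp \
    -v ON_ERROR_STOP=1 -v user_id="$1"
}
```

Calling `erase_user` with the affected user's ID applies both updates or neither.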
6.3 Transaction Queries
-- Recent transactions (last 24h)
SELECT id, type, status, send_amount, send_currency, created_at
FROM transactions
WHERE created_at > NOW() - INTERVAL '24 hours'
ORDER BY created_at DESC LIMIT 50;
-- Failed transactions (may need investigation)
SELECT t.*, u.email FROM transactions t
JOIN users u ON t.user_id = u.id
WHERE t.status = 'failed'
AND t.created_at > NOW() - INTERVAL '7 days'
ORDER BY t.created_at DESC;
-- AML: large transactions (> NOK 50,000)
-- Note: compares send_amount as stored; add a send_currency filter if amounts are not all in NOK
SELECT * FROM transactions
WHERE send_amount > 50000
AND created_at > NOW() - INTERVAL '30 days'
ORDER BY send_amount DESC;
6.4 Manual RDS Snapshot
# Create manual snapshot before risky operations
# (store the identifier once so the wait below targets the same snapshot)
SNAPSHOT_ID="drop-db-manual-$(date +%Y%m%d-%H%M)"
aws rds create-db-snapshot \
--db-instance-identifier drop-db \
--db-snapshot-identifier "$SNAPSHOT_ID" \
--region eu-west-1
# Wait for snapshot to complete
aws rds wait db-snapshot-completed \
--db-snapshot-identifier "$SNAPSHOT_ID" \
--region eu-west-1
7. AML & Compliance Operations
7.1 AML Alert Review
-- View open AML alerts
SELECT a.*, u.email, t.send_amount, t.send_currency
FROM aml_alerts a
JOIN users u ON a.user_id = u.id
LEFT JOIN transactions t ON a.transaction_id = t.id
WHERE a.status = 'open'
ORDER BY a.created_at DESC;
-- Close an AML alert (after review)
UPDATE aml_alerts SET status = 'closed', reviewed_at = NOW(),
reviewer_notes = 'Reviewed — legitimate transaction'
WHERE id = 'alert_...';
7.2 STR Filing
If financial crime is suspected:
-- File STR
INSERT INTO str_reports (
id, user_id, transaction_id, report_type, details, filed_at, status
) VALUES (
gen_random_uuid(), 'usr_...', 'tx_...', 'suspicious_transaction',
'{"reason": "Unusual pattern", "amount": 50000}',
NOW(), 'filed'
);
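Then file the report with Økokrim's Financial Intelligence Unit (EFE), which receives suspicious transaction reports under hvitvaskingsloven, via the official STR filing portal.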
7.3 GDPR Requests
Data export request:
-- User data is exported via /api/user/data-export endpoint
-- Check data_access_requests table
SELECT * FROM data_access_requests WHERE user_id = 'usr_...' ORDER BY created_at DESC;
Erasure request:
-- Account deletion (soft delete)
UPDATE users SET deleted_at = NOW() WHERE id = 'usr_...';
UPDATE sessions SET revoked = 1 WHERE user_id = 'usr_...';
-- Note: data retained for 5 years per hvitvaskingsloven
8. Incident Response
8.1 Alert Triage
When a Slack alert fires in #drop-ops:
| Alert | First Response | Escalation |
|---|---|---|
| Health check DOWN | Run quick health check, check App Runner logs | After 5 min: restart App Runner |
| Error spike | Check CloudWatch logs for error pattern | After 10 min: escalate |
| App startup/shutdown | Informational — no action unless unexpected | N/A |
8.2 Common Issues
Issue: Health check returns 503 (DB unreachable)
# 1. Check RDS status
aws rds describe-db-instances --db-instance-identifier drop-db \
--query 'DBInstances[0].DBInstanceStatus' --output text --region eu-west-1
# 2. If not 'available', wait for AWS to auto-recover or follow DR Scenario 2
# 3. Check connection string in App Runner environment
# 4. Restart App Runner service
Issue: BankID login failing
# Check App Runner logs for BankID errors
aws logs filter-log-events \
--log-group-name /aws/apprunner/drop-web/8e45b0d335304487a1880f4e32d6aeec/application \
--filter-pattern "BankID" --region eu-west-1
# Verify BankID environment variables are set
# Check BankID status: https://driftsstatus.vippsmobilepay.com/
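Without a time bound, `filter-log-events` searches the whole log group, which can be slow during an incident. Its `--start-time` option takes milliseconds since the Unix epoch. The sketch below limits the search to a recent window; `epoch_ms_minutes_ago` and `recent_bankid_errors` are hypothetical helper names, and the 30-minute default is arbitrary.

```shell
# Sketch only: hypothetical helpers for time-bounded log searches.

# usage: epoch_ms_minutes_ago MINUTES [NOW_EPOCH_S]
# Prints the timestamp MINUTES ago in milliseconds since the epoch.
epoch_ms_minutes_ago() {
  local minutes="$1" now="${2:-$(date -u +%s)}"
  echo $(( (now - minutes * 60) * 1000 ))
}

# usage: recent_bankid_errors [MINUTES]
# Prints BankID-related log messages from the last MINUTES (default 30).
recent_bankid_errors() {
  aws logs filter-log-events \
    --log-group-name /aws/apprunner/drop-web/8e45b0d335304487a1880f4e32d6aeec/application \
    --filter-pattern "BankID" \
    --start-time "$(epoch_ms_minutes_ago "${1:-30}")" \
    --query 'events[].message' --output text \
    --region eu-west-1
}
```

For example, `recent_bankid_errors 60` covers the last hour.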
Issue: KYC verification stuck in pending
# Check Sumsub dashboard for stuck applicants
# Or query:
psql -h drop-db.czu2qe4quy4v.eu-west-1.rds.amazonaws.com -U dropuser -d dropapp \
-c "SELECT id, email, kyc_status FROM users WHERE kyc_status='pending' AND created_at < NOW()-INTERVAL '2 hours';"
# Force-process via Sumsub dashboard or API call
9. Monitoring Verification Commands
# 1. Full health check
curl -s https://getdrop.no/api/health | python3 -m json.tool
# 2. Database latency check
curl -s https://getdrop.no/api/health | jq '.data.checks.db.latencyMs'
# Alert if > 100ms
# 3. Check app version
curl -s https://getdrop.no/api/health | jq '.data.version'
# 4. Check uptime
curl -s https://getdrop.no/api/health | jq '.data.uptime'
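The "alert if > 100ms" rule for database latency can be turned into an exit code for use in scripts or cron jobs. A minimal sketch, assuming the health payload exposes `.data.checks.db.latencyMs` as shown above; `latency_ok` and `check_db_latency` are hypothetical helper names.

```shell
# Sketch only: hypothetical helpers for the latency threshold.

# usage: latency_ok LATENCY_MS [MAX_MS]
# Succeeds when LATENCY_MS is a number no greater than MAX_MS (default 100).
# Non-numeric input such as "null" or an empty string counts as a failure.
latency_ok() {
  awk -v ms="$1" -v max="${2:-100}" \
    'BEGIN { exit !(ms ~ /^[0-9]+(\.[0-9]+)?$/ && ms + 0 <= max + 0) }'
}

# Fetches the current latency from the health endpoint and evaluates it.
check_db_latency() {
  local ms
  ms=$(curl -s https://getdrop.no/api/health | jq -r '.data.checks.db.latencyMs')
  if latency_ok "$ms"; then
    echo "db latency OK: ${ms}ms"
  else
    echo "db latency ALERT: ${ms:-no value}"
    return 1
  fi
}
```

Treating a missing or non-numeric value as a failure is a deliberate choice here: an unreachable health endpoint should not pass silently as "low latency".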
Related Documents
Approval
| Role | Name | Date | Signature |
|---|---|---|---|
| Author | Platform Architect (AI) | 2026-02-23 | |
| Reviewer | | | |
| Approver | Alem Bašić | | |