Runbook: Swan BaaS API Outage
Service: Swan Banking-as-a-Service
Severity: CRITICAL (blocks accounts, cards, and payments if Swan is the primary provider)
MTTR Target: <15 minutes
Owner: John (AI Director)
Overview
Swan provides core banking infrastructure for Drop. Depending on Drop's architecture phase, Swan may handle:
- Account creation (virtual IBAN accounts for users)
- Card issuance (virtual/physical debit cards)
- Payment processing (domestic/international transfers)
- Balance management (wallet balances, not Open Banking)
Impact: If Swan is the primary BaaS provider, an outage affects ALL core banking operations.
Symptoms
Users report critical failures:
- Cannot create new account
- Cannot view wallet balance (if using Swan wallets)
- Card payments fail or decline
- Error: "Banking service unavailable"
- Dashboard shows "System error" for account-related features
User impact: Complete inability to use banking features (depending on Drop's reliance on Swan).
Diagnosis
1. Check Swan Status Page
External status:
# Swan official status page
open https://status.swan.io
# Check for:
# - Incident reported
# - Degraded performance
# - Scheduled maintenance
2. Test Swan API
Health check:
# GraphQL health query
curl https://api.swan.io/graphql \
-H "Authorization: Bearer <api-key>" \
-H "Content-Type: application/json" \
-d '{"query": "{viewer{id}}"}' \
-v
# Expected: HTTP 200, {"data": {"viewer": {"id": "..."}}}
# If 500/503: Swan API down
# If 401: Credential issue
# If timeout: Network or Swan connectivity issue
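The status-code reading above can be encoded so that a script or health probe reaches the same conclusion as a person reading the curl output. This is a minimal sketch: `classifySwanHealth` and the `SwanHealth` labels are hypothetical names, not part of Drop's codebase or Swan's API.

```typescript
// Hypothetical helper: maps the outcome of the health query to a diagnosis.
type SwanHealth =
  | 'healthy'
  | 'swan_down'
  | 'credential_issue'
  | 'connectivity_issue'
  | 'unknown';

function classifySwanHealth(outcome: number | 'timeout'): SwanHealth {
  if (outcome === 'timeout') return 'connectivity_issue'; // network or Swan connectivity
  if (outcome === 200) return 'healthy';
  if (outcome === 401) return 'credential_issue'; // continue with the credentials check
  if (outcome === 500 || outcome === 503) return 'swan_down';
  return 'unknown'; // anything else needs a manual look at the response body
}
```

Anything classified as `unknown` should still be inspected by hand, since a GraphQL API can return errors inside an HTTP 200 response.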
Test account creation:
# Attempt to create test account
curl https://api.swan.io/graphql \
-H "Authorization: Bearer <api-key>" \
-H "Content-Type: application/json" \
-d '{
"query": "mutation { createAccount(input: {name: \"Test Account\"}) { id } }"
}' \
-v
# Expected: HTTP 200 with account ID
# If error: Check response for Swan error codes
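When checking the response for error codes, it helps to remember that GraphQL servers commonly return failures in an `errors` array, often alongside HTTP 200. The sketch below flattens that array into readable lines. The `errors[].message` shape comes from the GraphQL specification; reading a code from `extensions.code` is a widespread convention and is assumed here, not confirmed for Swan. `summarizeGraphQLErrors` is a hypothetical helper name.

```typescript
// `errors` with a `message` per entry is defined by the GraphQL spec.
// `extensions.code` is an assumed convention, not a confirmed Swan field.
interface GraphQLErrorEntry {
  message: string;
  extensions?: { code?: string };
}

interface GraphQLResponseBody {
  data?: unknown;
  errors?: GraphQLErrorEntry[];
}

// Returns one "CODE: message" line per error; an empty list means no errors.
function summarizeGraphQLErrors(body: GraphQLResponseBody): string[] {
  return (body.errors ?? []).map((entry) => {
    const code = entry.extensions?.code ?? 'NO_CODE';
    return `${code}: ${entry.message}`;
  });
}
```

An empty result together with a populated `data` field is the signal that the mutation actually succeeded.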
3. Check Drop Logs
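# CloudWatch Logs (production)
# Filter patterns are case-sensitive, so match the common spellings explicitly.
# `date -d` is GNU date; on macOS use $(date -u -v-15M +%s)000 instead.
aws logs filter-log-events \
  --log-group-name /aws/apprunner/drop-production \
  --filter-pattern "?Swan ?swan ?SWAN" \
  --start-time $(date -u -d '15 minutes ago' +%s)000 \
  --region eu-west-1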
# Look for:
# - "Swan API timeout"
# - "Swan GraphQL error: INTERNAL_SERVER_ERROR"
# - "Swan 503 Service Unavailable"
# - "Swan rate limit exceeded"
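To see which failure mode dominates, the matching log messages can be tallied by category. A minimal sketch, assuming the messages have already been extracted as plain strings (for example with `jq -r '.events[].message'`); `tallySwanLogs` and the category names are hypothetical, and the substrings are the ones listed above.

```typescript
type SwanLogCategory = 'timeout' | 'server_error' | 'unavailable' | 'rate_limit';

// Substrings taken from the log messages this runbook says to look for.
const SWAN_LOG_PATTERNS: Array<[SwanLogCategory, string]> = [
  ['timeout', 'Swan API timeout'],
  ['server_error', 'INTERNAL_SERVER_ERROR'],
  ['unavailable', '503 Service Unavailable'],
  ['rate_limit', 'rate limit exceeded'],
];

// Counts how many messages contain each known failure substring.
function tallySwanLogs(messages: string[]): Record<SwanLogCategory, number> {
  const counts: Record<SwanLogCategory, number> = {
    timeout: 0,
    server_error: 0,
    unavailable: 0,
    rate_limit: 0,
  };
  for (const message of messages) {
    for (const [category, needle] of SWAN_LOG_PATTERNS) {
      if (message.indexOf(needle) !== -1) counts[category] += 1;
    }
  }
  return counts;
}
```

A tally dominated by `rate_limit` points to Cause 3, while `unavailable` or `server_error` across all operations points to an external outage.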
4. Check Swan API Credentials
# Verify Swan API key is valid
bw get item "Swan API" --session $BW_SESSION
# Check App Runner environment variables
aws apprunner describe-service \
--service-arn <ARN> \
--region eu-west-1 \
| jq '.Service.SourceConfiguration.ImageRepository.ImageConfiguration.RuntimeEnvironmentVariables' \
| grep SWAN
# Expected:
# SWAN_API_KEY: <exists>
# SWAN_ENVIRONMENT: production (or sandbox)
# SWAN_PARTNER_ID: <partner-id>
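The same expectations can be checked at application startup so that a missing variable fails loudly instead of surfacing as a 401 later. A minimal sketch: `findSwanEnvProblems` is a hypothetical helper, and the accepted `SWAN_ENVIRONMENT` values are the two listed above.

```typescript
const REQUIRED_SWAN_VARS = ['SWAN_API_KEY', 'SWAN_ENVIRONMENT', 'SWAN_PARTNER_ID'];

// Returns a human-readable problem per missing or unexpected variable.
function findSwanEnvProblems(env: Record<string, string | undefined>): string[] {
  const problems: string[] = [];
  for (const name of REQUIRED_SWAN_VARS) {
    const value = env[name];
    if (value === undefined || value.trim() === '') {
      problems.push(`${name} is missing or empty`);
    }
  }
  const mode = env['SWAN_ENVIRONMENT'];
  if (mode !== undefined && mode.trim() !== '' && mode !== 'production' && mode !== 'sandbox') {
    problems.push(`SWAN_ENVIRONMENT has unexpected value "${mode}"`);
  }
  return problems;
}
```

In a Node.js service this would typically be called with `process.env`; the function never logs the values themselves, only the variable names.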
5. Check Recent Swan API Changes
Review Swan changelog:
# Swan may deprecate API endpoints or change schemas
# Check Swan developer portal for breaking changes
open https://docs.swan.io/changelog
# Review recent GraphQL schema changes
# Verify Drop uses supported API versions
Common Causes & Solutions
Cause 1: Swan Service Outage (External)
Probability: 5% (Swan is highly reliable, but incidents happen)
Symptoms:
- Swan status page reports incident
- All Swan API calls fail with 500/503
- No error in Drop code/config
- Social media mentions Swan issues
Solution:
-
Verify outage scope:
- Check Swan status page
- Test API from different networks (rule out local network issue)
- Contact Swan support for ETA
-
Communicate to users (Norwegian):
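Emne: Bankfunksjoner midlertidig utilgjengelig

Hei,

Vår bankinfrastruktur-leverandør (Swan) opplever tekniske problemer. Dette påvirker:
- Kontoopprettelse
- Korttransaksjoner
- Overføringer

Vi overvåker situasjonen og forventer at tjenesten er tilbake innen [X minutter/timer].

Mvh,
Drop

English translation, for responders: "Subject: Banking features temporarily unavailable. Hi, our banking infrastructure provider (Swan) is experiencing technical problems. This affects account creation, card transactions, and transfers. We are monitoring the situation and expect the service to be back within [X minutes/hours]. Regards, Drop"
-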
Enable degraded mode:
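# Disable features that depend on Swan.
# App Runner runtime environment variables are set under --source-configuration,
# not --instance-configuration. Re-send the service's other variables as well,
# because the supplied map is expected to replace the current one.
# "ECR" is assumed here; use ECR_PUBLIC for a public image.
aws apprunner update-service --service-arn <ARN> --region eu-west-1 \
  --source-configuration '{"ImageRepository": {
    "ImageIdentifier": "<image-uri>", "ImageRepositoryType": "ECR",
    "ImageConfiguration": {"RuntimeEnvironmentVariables": {
      "FEATURE_ACCOUNTS": "disabled",
      "FEATURE_CARDS": "disabled",
      "SWAN_MODE": "degraded"}}}}'
# Show maintenance banner in app
-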
Monitor Swan status:
- Subscribe to Swan status updates (RSS/email)
- Check every 10 minutes for resolution
- Test API as soon as Swan reports "Resolved"
-
Re-enable features when Swan is back:
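# Runtime environment variables are set under --source-configuration;
# re-send the service's other variables as well ("ECR" is assumed).
aws apprunner update-service --service-arn <ARN> --region eu-west-1 \
  --source-configuration '{"ImageRepository": {
    "ImageIdentifier": "<image-uri>", "ImageRepositoryType": "ECR",
    "ImageConfiguration": {"RuntimeEnvironmentVariables": {
      "FEATURE_ACCOUNTS": "enabled",
      "FEATURE_CARDS": "enabled",
      "SWAN_MODE": "live"}}}}'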
ETA: Depends on Swan (typically <2 hours for major incidents)
Cause 2: Invalid or Expired API Credentials
Probability: 15% (after credential rotation or Swan account changes)
Symptoms:
- Logs show: "401 Unauthorized" or "Forbidden"
- All Swan API requests fail immediately
- Swan API test returns authentication error
Solution:
-
Verify Swan API credentials:
bw get item "Swan API" --session $BW_SESSION
# Check:
# - API key is not expired
# - API key has correct permissions (accounts, cards, payments)
# - Partner ID is correct
-
Regenerate API key (if needed):
- Login to Swan Dashboard: https://dashboard.swan.io
- Navigate to Settings → API Keys
- Revoke old key, generate new key
- Copy new key to Vaultwarden
-
Update App Runner environment variables:
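# Runtime environment variables are set under --source-configuration;
# re-send the service's other variables as well ("ECR" is assumed).
aws apprunner update-service --service-arn <ARN> --region eu-west-1 \
  --source-configuration '{"ImageRepository": {
    "ImageIdentifier": "<image-uri>", "ImageRepositoryType": "ECR",
    "ImageConfiguration": {"RuntimeEnvironmentVariables": {
      "SWAN_API_KEY": "<new-key>",
      "SWAN_PARTNER_ID": "<partner-id>"}}}}'
-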
Trigger deployment:
aws apprunner start-deployment --service-arn <ARN> --region eu-west-1
-
Test after deployment (3-5 min):
curl https://getdrop.no/api/accounts/create \
  -H "Authorization: Bearer <test-user-token>" \
  -H "Content-Type: application/json" \
  -d '{"accountType": "personal"}' \
  -v
# Expected: HTTP 200, account created
ETA: 10 minutes
Cause 3: Swan API Rate Limiting
Probability: 10% (during high-traffic events or viral growth)
Symptoms:
- Logs show: HTTP 429 "Too Many Requests"
- Intermittent failures (some requests succeed, others fail)
- Rate limit headers in response
Solution:
-
Check rate limit headers:
# Terms containing a hyphen must be double-quoted inside the filter pattern.
aws logs filter-log-events \
  --log-group-name /aws/apprunner/drop-production \
  --filter-pattern '"X-RateLimit"' \
  --start-time $(date -u -d '10 minutes ago' +%s)000 \
  --region eu-west-1 \
  | jq -r '.events[].message' \
  | grep Swan
-
Implement request queuing:
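// src/lib/swan-client.ts
import PQueue from 'p-queue';

const queue = new PQueue({
  concurrency: 5,   // Max 5 concurrent Swan requests
  interval: 1000,   // Length of the rate-limit window in ms
  intervalCap: 20,  // Max 20 requests started per window (per second)
});

export async function swanGraphQL(query: string, variables?: Record<string, unknown>) {
  return queue.add(async () => {
    const response = await fetch('https://api.swan.io/graphql', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.SWAN_API_KEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ query, variables }),
    });
    // fetch resolves on HTTP errors, so surface the status (e.g. 429) to callers.
    if (!response.ok) {
      throw Object.assign(new Error(`Swan HTTP ${response.status}`), { status: response.status });
    }
    return response;
  });
}
-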
Exponential backoff on retry:
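const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retrySwan<T>(operation: () => Promise<T>, attempt = 1): Promise<T> {
  try {
    return await operation();
  } catch (error: any) {
    // Retry only on rate limiting, at most 3 times.
    if (error?.status === 429 && attempt <= 3) {
      const delay = 1000 * Math.pow(2, attempt); // 2s, 4s, 8s
      await sleep(delay);
      return retrySwan(operation, attempt + 1);
    }
    throw error;
  }
}
-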
Contact Swan to increase rate limit:
- Email Swan support with traffic stats
- Provide justification: user growth, peak times
- Request higher API quota
ETA: 5 minutes (automatic retry), 1-2 days (if quota increase needed)
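If rate-limited responses carry a `Retry-After` header, the retry delay can follow the server's hint instead of a fixed schedule. Whether Swan sends this header is an assumption; the sketch falls back to the 2s, 4s, 8s schedule used in the retry step when the header is absent or is not a plain number of seconds. `retryDelayMs` and the 30-second cap are hypothetical.

```typescript
// Delay before retry number `attempt` (1-based), in milliseconds.
// Prefers a numeric Retry-After value (seconds); otherwise uses 1000 * 2^attempt.
// The result is capped at `maxMs` so a large hint cannot stall a request for long.
function retryDelayMs(
  attempt: number,
  retryAfterHeader?: string | null,
  maxMs: number = 30000,
): number {
  const seconds = retryAfterHeader ? Number(retryAfterHeader) : NaN;
  if (isFinite(seconds) && seconds >= 0) {
    return Math.min(seconds * 1000, maxMs);
  }
  return Math.min(1000 * Math.pow(2, attempt), maxMs);
}
```

`Retry-After` may also be an HTTP date; this sketch deliberately treats that form as "no hint" and uses the exponential schedule.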
Cause 4: Swan GraphQL Schema Change (Breaking)
Probability: 5% (Swan updates API, breaks Drop integration)
Symptoms:
- Logs show: "GraphQL validation error"
- Specific queries fail: "Field 'X' doesn't exist on type 'Y'"
- Swan API works for some operations, fails for others
Solution:
-
Check Swan changelog:
# Review recent API changes
open https://docs.swan.io/changelog
# Look for:
# - Deprecated fields
# - Required fields added
# - Type changes
-
Identify breaking changes:
# Compare current Drop queries to Swan schema
# Example: account creation query
grep -r "createAccount" src/lib/swan-client.ts
# Cross-reference with Swan GraphQL schema
# https://api.swan.io/graphql (GraphQL Playground)
-
Update Drop GraphQL queries:
# Before (deprecated)
mutation {
  createAccount(input: { name: "User Account" }) {
    id
    balance  # ❌ Deprecated field
  }
}

# After (updated)
mutation {
  createAccount(input: { name: "User Account" }) {
    id
    balances {  # ✅ New field structure
      available
      currency
    }
  }
}
-
Test updated queries:
# Test in Swan GraphQL Playground first
# Then deploy to staging
# Verify all Swan-dependent features work
-
Deploy fix:
git add src/lib/swan-client.ts
git commit -m "Fix: Update Swan GraphQL queries to match latest schema"
git push origin main
# CI/CD triggers deployment
ETA: 30 minutes (if simple field change), 2 hours (if major refactor needed)
Cause 5: Network or Firewall Issues
Probability: 5% (AWS security group misconfiguration)
Symptoms:
- Logs show: "Connection timeout" or "ECONNREFUSED"
- Swan API requests never reach destination
- Works locally but fails in production
Solution:
-
Check outbound connectivity:
# App Runner egress is unrestricted by default
# If using VPC connector, check security group
aws ec2 describe-security-groups \
  --group-ids <vpc-connector-sg> \
  --region eu-west-1 \
  | jq '.SecurityGroups[].IpPermissionsEgress'
-
Test DNS resolution:
nslookup api.swan.io
# Should resolve to Swan IPs
# If NXDOMAIN: DNS issue
-
Check AWS service health:
# Check App Runner service events
aws apprunner list-operations \
  --service-arn <ARN> \
  --region eu-west-1 \
  | jq '.OperationSummaryList[0]'
-
Whitelist Swan IPs (if strict firewall):
- Contact Swan for IP ranges
- Add to security group outbound rules (port 443)
ETA: 15 minutes (if quick fix), 1 hour (if requires networking changes)
Cause 6: Swan Account Suspended or Payment Overdue
Probability: 2% (billing issue or compliance violation)
Symptoms:
- All Swan API calls fail with "Account suspended"
- Swan Dashboard shows billing alert
- Email from Swan about overdue payment or compliance issue
Solution:
-
Check Swan Dashboard:
- Login: https://dashboard.swan.io
- Look for alerts: billing, compliance, KYC
-
Resolve billing issue:
- If overdue payment: pay immediately via Swan Dashboard
- If billing method expired: update payment method
- Contact Swan billing by email (check the Swan Dashboard for the address)
-
Resolve compliance issue:
- Swan requires KYC for partner accounts
- Upload missing documents (company registration, director ID, etc.)
- Respond to Swan compliance team ASAP
-
Request urgent reactivation:
- Email Swan support (check the Swan Dashboard for the address)
- Subject: "URGENT: Account reactivation needed - [Partner ID]"
- Explain impact (users affected)
- Provide evidence of issue resolution
ETA: 15 minutes (if billing), 24 hours (if compliance review needed)
Emergency Workarounds
Option 1: Degraded Mode (Disable Swan Features)
Use case: Swan down >30 minutes, no ETA, users need core app functionality
Steps:
-
Disable Swan-dependent features:
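# Runtime environment variables are set under --source-configuration;
# re-send the service's other variables as well ("ECR" is assumed).
aws apprunner update-service --service-arn <ARN> --region eu-west-1 \
  --source-configuration '{"ImageRepository": {
    "ImageIdentifier": "<image-uri>", "ImageRepositoryType": "ECR",
    "ImageConfiguration": {"RuntimeEnvironmentVariables": {
      "FEATURE_ACCOUNTS": "disabled",
      "FEATURE_CARDS": "disabled",
      "FEATURE_SWAN_WALLETS": "disabled"}}}}'
-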
Show banner in app:
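⚠️ Noen funksjoner er midlertidig utilgjengelige
Kontoopprettelse og korttransaksjoner er ikke tilgjengelig for øyeblikket. Andre funksjoner virker som normalt.
(English: "Some features are temporarily unavailable. Account creation and card transactions are not available at the moment. Other features work as normal.")
-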
Allow core features to work:
- BankID login: ✅ (not Swan-dependent)
- Open Banking balance: ✅ (uses Neonomics, not Swan)
- PISP payments: ✅ (uses Neonomics, not Swan)
- Swan accounts: ❌ (disabled)
-
Re-enable when Swan is back:
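# Runtime environment variables are set under --source-configuration;
# re-send the service's other variables as well ("ECR" is assumed).
aws apprunner update-service --service-arn <ARN> --region eu-west-1 \
  --source-configuration '{"ImageRepository": {
    "ImageIdentifier": "<image-uri>", "ImageRepositoryType": "ECR",
    "ImageConfiguration": {"RuntimeEnvironmentVariables": {
      "FEATURE_ACCOUNTS": "enabled",
      "FEATURE_CARDS": "enabled",
      "FEATURE_SWAN_WALLETS": "enabled"}}}}'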
Risk: Users cannot create accounts or use cards during outage.
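On the application side, the flags only help if every Swan-dependent route checks them before calling Swan. A minimal sketch of such a gate, assuming the flags arrive as environment variables with the values `enabled` and `disabled` used above; `isSwanFeatureEnabled`, `guardSwanFeature`, and the 503 response shape are hypothetical.

```typescript
type SwanFeature = 'FEATURE_ACCOUNTS' | 'FEATURE_CARDS' | 'FEATURE_SWAN_WALLETS';

// A feature is on only when its flag is exactly "enabled". A missing or
// misspelled value fails closed, so a bad deploy cannot re-enable Swan calls.
function isSwanFeatureEnabled(
  env: Record<string, string | undefined>,
  feature: SwanFeature,
): boolean {
  return env[feature] === 'enabled';
}

type GuardResult =
  | { allowed: true }
  | { allowed: false; status: 503; message: string };

// Call at the top of a route handler; return the 503 payload when not allowed.
function guardSwanFeature(
  env: Record<string, string | undefined>,
  feature: SwanFeature,
): GuardResult {
  if (isSwanFeatureEnabled(env, feature)) return { allowed: true };
  return {
    allowed: false,
    status: 503,
    message: `${feature} is temporarily disabled (degraded mode)`,
  };
}
```

Failing closed is a design choice in this sketch: it requires the flags to be set explicitly to `enabled` in normal operation, which matches the re-enable step above.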
Option 2: Queue Swan Operations for Later
Use case: Swan down, users need to create accounts but can wait
Steps:
-
Queue account creation requests:
// src/app/api/accounts/create/route.ts
export async function POST(request: Request) {
  const { accountType } = await request.json();
  // Hypothetical helper: resolve the authenticated user's ID from the request.
  const userId = await getUserId(request);
  try {
    const account = await swanClient.createAccount(accountType);
    return Response.json(account);
  } catch (error: any) {
    if (error?.code === 'SWAN_UNAVAILABLE') {
      // Queue for later processing
      await db.insert('pending_accounts', {
        user_id: userId,
        account_type: accountType,
        status: 'queued',
        created_at: new Date(),
      });
      // Route handlers must return a Response, not a plain object.
      return Response.json({
        success: true,
        message: 'Account creation queued, will complete within 1 hour',
      });
    }
    throw error;
  }
}
-
Process queue when Swan is back:
# Run cron job to process pending accounts
node ~/ALAI/products/Drop/scripts/process-pending-accounts.js
-
Notify users when account is ready:
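Din konto er klar! Takk for tålmodigheten. Du kan nå bruke alle funksjoner i Drop.
(English: "Your account is ready! Thank you for your patience. You can now use all features in Drop.")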
Risk: Delayed user experience. Users may expect instant account creation.
Monitoring & Alerts
Metrics to Track
- Swan API success rate: Should be >99%
- Swan API latency: p50 <500ms, p95 <2s, p99 <5s
- Swan error rate by operation: Track createAccount, issueCard, makePayment separately
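The latency targets above can be checked against a batch of measured durations. A minimal sketch using the nearest-rank percentile method, which is one of several common definitions and is chosen here for simplicity; `percentile`, `latencyBreaches`, and the sample values in the usage note are hypothetical.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Targets from this runbook: p50 <500ms, p95 <2s, p99 <5s.
const LATENCY_TARGETS_MS: Array<[string, number, number]> = [
  ['p50', 50, 500],
  ['p95', 95, 2000],
  ['p99', 99, 5000],
];

// Returns the names of the percentiles that miss their target.
function latencyBreaches(samplesMs: number[]): string[] {
  const breaches: string[] = [];
  for (const [name, p, limitMs] of LATENCY_TARGETS_MS) {
    if (percentile(samplesMs, p) >= limitMs) breaches.push(name);
  }
  return breaches;
}
```

For example, `latencyBreaches([100, 200, 300, 400, 3000])` reports only `p95`, because the single slow call lands in the tail while the median stays at 300 ms.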
Alert Rules
// src/lib/swan-monitor.ts
export async function trackSwanFailure(operation: string, error: any) {
  const failureRate = await calculateSwanFailureRate('last_5_minutes');
  if (failureRate > 0.05) { // 5% failure rate
    await sendAlert({
      severity: 'critical',
      title: 'Swan API failure rate high',
      message: `${(failureRate * 100).toFixed(1)}% of Swan calls failing`,
      operation,
    });
  }
}
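The alert rule depends on a failure-rate calculation that is not shown. One way to back it is a sliding window over recent call outcomes. This is a minimal in-memory sketch, not the actual `calculateSwanFailureRate` implementation: `SwanFailureWindow` is a hypothetical class, it tracks one process only, and timestamps are passed in so the logic can be tested without a real clock.

```typescript
interface SwanCallOutcome {
  atMs: number; // when the call finished, in epoch milliseconds
  ok: boolean;  // true if the Swan call succeeded
}

class SwanFailureWindow {
  private calls: SwanCallOutcome[] = [];

  constructor(private readonly windowMs: number = 5 * 60 * 1000) {}

  // Record the outcome of one Swan call.
  record(ok: boolean, nowMs: number): void {
    this.calls.push({ atMs: nowMs, ok });
    this.prune(nowMs);
  }

  // Fraction of failed calls inside the window; 0 when there was no traffic.
  failureRate(nowMs: number): number {
    this.prune(nowMs);
    if (this.calls.length === 0) return 0;
    const failures = this.calls.filter((call) => !call.ok).length;
    return failures / this.calls.length;
  }

  // Drop outcomes older than the window.
  private prune(nowMs: number): void {
    const cutoff = nowMs - this.windowMs;
    this.calls = this.calls.filter((call) => call.atMs > cutoff);
  }
}
```

With several App Runner instances, each instance would see only its own traffic, so a shared store or a CloudWatch metric would be needed for a service-wide rate.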
Post-Incident Actions
-
Process queued operations:
SELECT * FROM pending_accounts WHERE status = 'queued';
-- Retry all pending account creations
-
Document incident:
touch ~/ALAI/products/Drop/comms/incidents/$(date +%Y-%m-%d)-swan-outage.md
-
Review SLA with Swan:
- Check if outage violated SLA
- Request compensation/credits
- Discuss failover options
-
Improve resilience:
- Add Swan health check (every 5 min)
- Implement circuit breaker for Swan API
- Consider multi-provider strategy (backup BaaS)
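A circuit breaker stops the app from sending requests to Swan while it is clearly failing, then lets traffic through again after a cooldown. A minimal sketch under stated assumptions: `SwanCircuitBreaker`, the threshold of 5 consecutive failures, and the 30-second cooldown are hypothetical, and the half-open state here admits all requests instead of a single probe.

```typescript
type BreakerState = 'closed' | 'open' | 'half_open';

class SwanCircuitBreaker {
  private consecutiveFailures = 0;
  private openedAtMs: number | null = null;

  constructor(
    private readonly failureThreshold: number = 5,
    private readonly cooldownMs: number = 30000,
  ) {}

  // closed: normal traffic; open: fail fast; half_open: cooldown elapsed, try again.
  state(nowMs: number): BreakerState {
    if (this.openedAtMs === null) return 'closed';
    return nowMs - this.openedAtMs >= this.cooldownMs ? 'half_open' : 'open';
  }

  // Call before each Swan request; false means skip the call and fail fast.
  allowRequest(nowMs: number): boolean {
    return this.state(nowMs) !== 'open';
  }

  // A success closes the breaker and resets the failure count.
  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.openedAtMs = null;
  }

  // Reaching the threshold opens the breaker; a failure while half-open
  // restarts the cooldown because the count is still at or above the threshold.
  recordFailure(nowMs: number): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.failureThreshold) {
      this.openedAtMs = nowMs;
    }
  }
}
```

While the breaker is open, routes can return the degraded-mode response immediately, which keeps request latency low during a Swan outage.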
Escalation
| Time | Action |
|---|---|
| 0 min | John starts diagnosis |
| 5 min | If Swan status page shows incident, notify Alem |
| 15 min | If not resolved, enable degraded mode |
| 30 min | Contact Swan support via phone if no ETA |
| 1 hour | Public communication to users |
Contacts
- Swan Support: email (check the Swan Dashboard for the address)
- Swan Phone: +33 X XXXX XXXX (check Swan Dashboard for number)
- Swan Status: https://status.swan.io
- Internal: Alem (CEO, final decision on feature disabling)
Related Documentation
- docs/architecture/banking.md: Swan BaaS integration
- src/lib/swan-client.ts: Swan GraphQL client
- docs/compliance/swan-requirements.md: Swan partner KYC/compliance
- Vaultwarden item "Swan API": Credentials
Last Updated: 2026-02-22
Next Review: Before Phase 2 (Banking Integration)