BetterStack Setup

BetterStack Uptime Monitoring Setup Guide

Last updated: 2026-02-20
Related: MONITORING.md, health-check.sh
Purpose: External uptime monitoring for the Drop production environment

Why BetterStack?

BetterStack provides external uptime monitoring independent of Drop's infrastructure:

Detects infrastructure failures (AWS App Runner crashes, network issues)
Alerts when the entire application is unreachable
Provides uptime SLA tracking and historical reports
Multiple notification channels (Slack, Email, SMS)
Status page for client transparency

Key difference from internal health checks: Internal checks (Docker, AWS App Runner) only run while the container itself is running, so they cannot report a total outage. BetterStack checks from outside Drop's infrastructure and catches those outages.

Free Tier Limits

Plan: Free tier (no credit card required)
Limits:

10 monitors (enough for Drop production)
3-minute check interval (paid plan: 30s minimum)
1 status page
Unlimited team members
Unlimited integrations (Slack, email, webhooks)

Upgrade required for:

Faster check intervals (<3 minutes)
More than 10 monitors (e.g., multi-region checks)
Advanced features (maintenance windows, custom headers)

Account Setup

Step 1: Create Account

Go to https://betterstack.com/uptime
Click "Start free trial" (becomes free tier after trial)
Sign up with Alem's email: alem@alai.no
Verify email address
Name the workspace "ALAI Products" (shared across Drop and BasicFakta)

Step 2: Configure Team

Navigate to Settings > Team
Add team members:
- alem@alai.no (Owner)
- john@basicconsulting.no (Admin)
Set Default timezone: Europe/Oslo (UTC+1, UTC+2 during summer time)

Monitor Configuration

Monitor 1: Health Endpoint (Primary)

Purpose: Verify API health and database connectivity

Go to Monitors > Create Monitor
Configure:
- Monitor name: Drop Health Check
- Monitor type: HTTP
- URL: https://drop.alai.no/api/health
- Check interval: 3 minutes (free tier)
- Request timeout: 5 seconds
- Method: GET
- Confirmation period: 30 seconds (1 retry before alerting)
Expected Response:
- Status code: 200
- Keyword check: Enable
  - Response body contains: "status":"ok"
  - Why: Confirms the health endpoint reports an ok status in its response body, not just HTTP 200
Advanced settings:
- Follow redirects: Enabled (default)
- Verify SSL certificate: Enabled
- SSL expiry warning: 14 days before expiration
Click Create Monitor
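
The two pass conditions above (status code 200 and keyword present) can be reproduced locally before creating the monitor. This is a minimal sketch, not BetterStack's implementation. It assumes the keyword check is a literal substring match, so whitespace in the JSON matters. The function names is_healthy and check_health are illustrative.

```python
import json
import urllib.request

# Exact substring configured in the monitor's keyword check.
KEYWORD = '"status":"ok"'


def is_healthy(status_code: int, body: str, keyword: str = KEYWORD) -> bool:
    """Apply the monitor's two conditions: HTTP 200 and keyword present."""
    return status_code == 200 and keyword in body


def check_health(url: str = "https://drop.alai.no/api/health", timeout: float = 5.0) -> bool:
    """Fetch the endpoint once and apply the same conditions (makes a network call)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_healthy(resp.status, resp.read().decode("utf-8", "replace"))
    except OSError:
        # URLError, HTTPError and socket timeouts all derive from OSError.
        return False


if __name__ == "__main__":
    compact = json.dumps({"status": "ok"}, separators=(",", ":"))
    spaced = json.dumps({"status": "ok"})
    print(is_healthy(200, compact))  # True
    print(is_healthy(200, spaced))   # False: the space after the colon breaks the match
```

If the health route ever serializes its JSON with spaces after colons, a literal keyword match like this one would start failing even though the API is healthy, so it is worth confirming the exact response body with a manual request first.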

Monitor 2: Landing Page

Purpose: Verify public website availability

Go to Monitors > Create Monitor
Configure:
- Monitor name: Drop Landing Page
- Monitor type: HTTP
- URL: https://drop.alai.no
- Check interval: 3 minutes
- Request timeout: 10 seconds (landing page has more assets)
- Method: GET
- Confirmation period: 30 seconds
Expected Response:
- Status code: 200
- Keyword check: Enable
  - Response body contains: Send penger (tagline verification)
Click Create Monitor

Monitor 3: Multi-Region Health Check

Purpose: Detect regional networking issues

Go to Monitors > Create Monitor
Configure:
- Monitor name: Drop Health (US East)
- Monitor type: HTTP
- URL: https://drop.alai.no/api/health
- Check interval: 3 minutes
- Request timeout: 5 seconds
- Method: GET
- Confirmation period: 30 seconds
Expected Response:
- Status code: 200
- Keyword check: Response body contains "status":"ok"
Advanced settings:
- Region: US East (different from default EU region)
- Why: Detects if Drop is unreachable from specific geographies
Click Create Monitor
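
The three monitors can also be written down as data and checked against the free tier limits listed earlier (10 monitors, 3-minute interval). The sketch below is only a local sanity check. The Monitor class and free_tier_violations function are hypothetical names, not part of any BetterStack API.

```python
from dataclasses import dataclass

FREE_TIER_MAX_MONITORS = 10
FREE_TIER_MIN_INTERVAL_S = 180  # 3-minute check interval


@dataclass(frozen=True)
class Monitor:
    name: str
    url: str
    interval_s: int
    timeout_s: int
    keyword: str


# Values copied from the three monitor configurations in this guide.
MONITORS = [
    Monitor("Drop Health Check", "https://drop.alai.no/api/health", 180, 5, '"status":"ok"'),
    Monitor("Drop Landing Page", "https://drop.alai.no", 180, 10, "Send penger"),
    Monitor("Drop Health (US East)", "https://drop.alai.no/api/health", 180, 5, '"status":"ok"'),
]


def free_tier_violations(monitors: list[Monitor]) -> list[str]:
    """Return a human-readable list of settings that would need a paid plan."""
    problems = []
    if len(monitors) > FREE_TIER_MAX_MONITORS:
        problems.append(f"{len(monitors)} monitors exceeds the limit of {FREE_TIER_MAX_MONITORS}")
    for m in monitors:
        if m.interval_s < FREE_TIER_MIN_INTERVAL_S:
            problems.append(f"{m.name}: interval {m.interval_s}s is below {FREE_TIER_MIN_INTERVAL_S}s")
    return problems


if __name__ == "__main__":
    print(free_tier_violations(MONITORS))  # []
```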

Slack Integration

Step 1: Create Slack Incoming Webhook

Go to your Slack workspace: alai-talk.slack.com
Navigate to Slack App Directory > Incoming Webhooks
Click Add to Slack
Select channel: #drop-ops (create it if it doesn't exist)
Click Add Incoming Webhooks Integration
Copy webhook URL (format: https://hooks.slack.com/services/T.../B.../XXX)
Save this URL securely (needed for BetterStack)
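
Before pasting the webhook into BetterStack, it can be verified directly. Slack incoming webhooks accept a JSON body with a text field sent via HTTP POST. The sketch below builds that request with the standard library. The function names are illustrative, and the webhook URL should be supplied at runtime (for example from an environment variable) and never committed to the repository.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build the POST request for a Slack incoming webhook without sending it."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_slack_message(webhook_url: str, text: str, timeout: float = 5.0) -> bool:
    """Send the message (makes a network call); True when Slack answers 200."""
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status == 200
```

A call such as send_slack_message(url, "Webhook test from Drop ops") should produce a message in #drop-ops. An HTTP error raised by urlopen usually means the URL was copied incorrectly.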

Step 2: Add Slack Integration in BetterStack

In BetterStack, go to Integrations
Click Add Integration > Slack
Paste webhook URL from Step 1
Configure:
- Integration name: Drop Ops Slack
- Notification channel: #drop-ops
Test integration: Click Send test message
- Verify message appears in #drop-ops channel
Click Save Integration

On-Call Team Setup

Step 1: Create On-Call Schedule

Go to On-Call > Create Schedule
Configure:
- Schedule name: Drop Primary On-Call
- Timezone: Europe/Oslo
Add rotation:
- Team member: alem@alai.no
- Schedule type: 24/7 (always on-call for now)
Click Create Schedule

Step 2: Configure Escalation Policy

Go to Escalation Policies > Create Policy
Configure:
- Policy name: Drop Production Incidents
Add escalation steps:

Step 1 (Immediate):
- Who: Drop Ops Slack integration
- Delay: 0 minutes
Step 2 (If still down after 5 minutes):
- Who: alem@alai.no (Email)
- Delay: 5 minutes
Step 3 (If still down after 15 minutes):
- Who: alem@alai.no (SMS) — Requires phone number
- Delay: 15 minutes
- Note: SMS requires paid plan or verified phone number
Click Create Policy
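
To reason about who has been notified at a given point in an incident, the policy above can be modeled as a list of (delay, channel) pairs. This sketch assumes each delay is measured from the start of the incident, which matches the step descriptions in this guide but is not confirmed against BetterStack's behavior. The names are illustrative.

```python
# (delay in minutes from incident start, channel) as configured above.
ESCALATION_STEPS = [
    (0, "Slack: Drop Ops Slack"),
    (5, "Email: alem@alai.no"),
    (15, "SMS: alem@alai.no"),
]


def notified_channels(minutes_down: float) -> list[str]:
    """Channels that have fired once the incident has lasted minutes_down."""
    return [channel for delay, channel in ESCALATION_STEPS if minutes_down >= delay]


if __name__ == "__main__":
    print(notified_channels(3))   # ['Slack: Drop Ops Slack']
    print(notified_channels(20))  # all three channels
```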

Step 3: Assign Policy to Monitors

Go to Monitors
For each monitor (Drop Health Check, Drop Landing Page, Drop Health (US East)):
- Click monitor name
- Go to Settings > Escalation Policy
- Select: Drop Production Incidents
- Click Save

Status Page Setup

Purpose

Public status page allows clients and stakeholders to check Drop availability without contacting support.

Step 1: Create Status Page

Go to Status Pages > Create Status Page
Configure:
- Page name: Drop Status
- Subdomain: drop-status (URL: https://drop-status.betteruptime.com)
- Custom domain (optional): status.drop.alai.no (requires DNS setup)
Design settings:
- Logo: Upload Drop logo (green rounded rectangle)
- Brand color: #0B6E35 (Drop primary green)
- Header text: Drop Status
- Tagline: Real-time service status and incident updates
Visibility:
- Public: Yes (anyone can view)
- Search engine indexing: No (prevent Google indexing)
Click Create Status Page

Step 2: Add Components

In the status page settings, go to Components
Click Add Component
Add three components:

Component 1:
- Name: API & Health Endpoint
- Linked monitor: Drop Health Check
- Description: Core API functionality and database connectivity
Component 2:
- Name: Landing Page
- Linked monitor: Drop Landing Page
- Description: Public website and marketing content
Component 3:
- Name: Global Network
- Linked monitor: Drop Health (US East)
- Description: International access and routing
Click Save Components

Step 3: Configure Incident Communication

Go to Status Pages > Settings > Incident Updates
Enable:
- Auto-create incidents: Yes (when monitor goes down)
- Auto-resolve incidents: Yes (when monitor recovers)
Notification subscribers:
- Email subscriptions: Enabled (users can subscribe to updates)
- Webhook notifications: Disabled (optional for future)

Step 4: Share the Status Page

Internal: Add the URL to the #drop-ops Slack channel description
External: Link from the Drop landing page footer (optional)
Clients: Include the URL in onboarding emails

Status Page URL: https://drop-status.betteruptime.com

Verification Checklist

After completing setup, verify:

Monitors running: All 3 monitors show green status
Slack alerts working: Test by pausing a monitor (triggers down alert)
Email notifications working: Verify Alem receives email on test alert
Status page public: Open status page URL in incognito mode
Escalation policy assigned: All monitors use Drop Production Incidents policy
SSL expiry alerts: Monitors configured to warn 14 days before cert expiration

Testing the Setup

Test 1: Manual Down Alert

Go to Monitors > Drop Health Check
Click Pause Monitor (simulates downtime)
Expected behavior:
- Slack alert in #drop-ops within 30 seconds
- Email to alem@alai.no after 5 minutes (if still paused)
Click Resume Monitor to clear alert

Test 2: Actual Downtime

SSH into production server (or use AWS App Runner console)
Stop the Drop application container temporarily
Wait for BetterStack to detect downtime (max 3 minutes + 30s confirmation)
Expected behavior:
- Monitor shows red status
- Slack alert in #drop-ops
- Status page component shows "Down"
Restart application and verify recovery alert
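
The wait in this test follows from simple arithmetic: in the worst case the outage starts just after a check, so detection takes one full check interval plus the confirmation period. A small sketch of that calculation, using the values from this guide as defaults and ignoring request timeouts:

```python
def worst_case_detection_s(check_interval_s: int = 180, confirmation_s: int = 30) -> int:
    """Upper bound on seconds between an outage starting and the first alert."""
    return check_interval_s + confirmation_s


if __name__ == "__main__":
    print(worst_case_detection_s())        # 210 (free tier: 3 min + 30 s)
    print(worst_case_detection_s(30, 30))  # 60 (paid plan with 30 s checks)
```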

Test 3: SSL Expiry Warning

Go to Monitors > Drop Health Check
Verify SSL expiry warning is enabled (14 days)
Expected behavior:
- Alert sent 14 days before SSL certificate expiration
- Action required: Renew certificate before expiry
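
The expiry date can be checked independently of BetterStack. The sketch below reads the certificate's notAfter field with Python's ssl module and applies the same 14-day threshold. The function names are illustrative, and fetch_not_after makes a real TLS connection.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the certificate's notAfter string (makes a network call)."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days left; not_after looks like 'Mar  6 23:59:00 2026 GMT'."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).days


def needs_warning(not_after: str, now: datetime, threshold_days: int = 14) -> bool:
    """True once the certificate is within the warning threshold."""
    return days_until_expiry(not_after, now) <= threshold_days
```

For a live check, call needs_warning(fetch_not_after("drop.alai.no"), datetime.now(timezone.utc)).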

Alert Examples

Downtime Alert (Slack)

🚨 Drop Health Check is DOWN

Monitor: Drop Health Check
Status: DOWN
Response: Connection timeout
Region: EU West
Time: 2026-02-20 10:30 UTC

View incident: https://betterstack.com/incidents/...

Recovery Alert (Slack)

✅ Drop Health Check is UP

Monitor: Drop Health Check
Status: UP
Response: 200 OK (2ms)
Downtime duration: 3 minutes
Time: 2026-02-20 10:33 UTC

Incident closed: https://betterstack.com/incidents/...

SSL Expiry Warning (Email)

Subject: [BetterStack] SSL certificate expiring in 14 days

Monitor: Drop Health Check
Domain: drop.alai.no
Certificate expiry: 2026-03-06 23:59 UTC

Action required: Renew SSL certificate before expiration.

Maintenance Mode

Maintenance windows require a paid plan (see Free Tier Limits). When performing planned maintenance (deployments, infrastructure upgrades):

Go to Maintenance Windows > Create Window
Configure:
- Name: Drop Deployment
- Start time: 2026-02-20 22:00 UTC
- Duration: 1 hour
- Affected monitors: Select all Drop monitors
Notification:
- Status page update: Yes (shows maintenance banner)
- Alert suppression: Yes (no downtime alerts during window)
Click Create Maintenance Window

Effect: During maintenance, downtime alerts are suppressed and status page shows "Scheduled Maintenance" instead of "Down".
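
The suppression rule can be sketched as a time-window check. This only illustrates the logic described above, using the example window from this section. The names and the choice of an exclusive end time are assumptions, not BetterStack's documented behavior.

```python
from datetime import datetime, timedelta, timezone

# Example window from this section: 2026-02-20 22:00 UTC for 1 hour.
WINDOW_START = datetime(2026, 2, 20, 22, 0, tzinfo=timezone.utc)
WINDOW_DURATION = timedelta(hours=1)


def in_maintenance(ts: datetime,
                   start: datetime = WINDOW_START,
                   duration: timedelta = WINDOW_DURATION) -> bool:
    """True while ts falls inside the window (end time exclusive)."""
    return start <= ts < start + duration


def should_alert(is_down: bool, ts: datetime) -> bool:
    """Downtime alerts fire only outside the maintenance window."""
    return is_down and not in_maintenance(ts)
```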

Best Practices

Do's

✅ Test alerts monthly — Pause a monitor to verify escalation works
✅ Update on-call schedule — Rotate on-call duty if team grows
✅ Monitor SSL expiry — Enable 14-day warnings to prevent outages
✅ Use maintenance windows — Prevent false alerts during deployments
✅ Review incident history — Monthly review of downtime patterns

Don'ts

❌ Don't ignore degraded status — Investigate even if not fully down
❌ Don't disable monitors — Use pause for temporary suppression only
❌ Don't skip keyword checks — HTTP 200 alone doesn't guarantee working API
❌ Don't forget to update URLs — When domain changes, update all monitors
❌ Don't rely solely on external monitoring — Combine with internal health checks

Troubleshooting

Monitor shows false positives (frequent up/down)

Cause: Network instability or slow response times
Fix:

Increase Request timeout from 5s to 10s
Increase Confirmation period from 30s to 60s
Check Drop API latency in logs
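
When reviewing latency from the logs, a quick way to decide whether the timeout is the problem is to count how many samples exceed it. A minimal sketch, assuming the latencies have already been extracted from the logs as seconds. The sample values below are made up for illustration.

```python
def timeout_rate(latencies_s: list[float], timeout_s: float) -> float:
    """Fraction of samples that would have exceeded the request timeout."""
    if not latencies_s:
        return 0.0
    over = sum(1 for t in latencies_s if t > timeout_s)
    return over / len(latencies_s)


if __name__ == "__main__":
    samples = [0.8, 1.2, 6.1, 4.9, 7.5, 0.9, 5.5, 1.1]  # illustrative values
    print(timeout_rate(samples, 5.0))   # 0.375
    print(timeout_rate(samples, 10.0))  # 0.0
```

A rate that drops to zero at 10 seconds suggests the longer timeout will stop the flapping, but the slow responses themselves still deserve investigation.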

Slack alerts not received

Cause: Webhook URL incorrect or channel archived
Fix:

Go to Integrations > Drop Ops Slack
Click Send test message
If the test fails, regenerate the webhook in Slack and update it in BetterStack

Email alerts delayed

Cause: Email provider spam filtering
Fix:

Whitelist notifications@betterstack.com in email settings
Check spam/junk folder
Verify email address in BetterStack team settings

Status page not updating

Cause: Monitor not linked to status page component
Fix:

Go to Status Pages > Drop Status > Components
Ensure each component has a Linked monitor assigned
Save changes and trigger test alert

Related Documentation

MONITORING.md: Full monitoring stack overview
health-check.sh: Internal health check script
alerts.ts: Slack alerting implementation
/api/health route: Health endpoint source code

Support

BetterStack Support:

Documentation: https://betterstack.com/docs
Email: support@betterstack.com
Status: https://status.betterstack.com

Internal Contact:

Slack: #drop-ops
Email: alem@alai.no

Revision #5
Created 2026-02-23 11:28:55 UTC by John
Updated 2026-05-23 10:58:24 UTC by John