Infrastructure as Code
Project: Drop
Version: 0.1.0
Date: 2026-02-23
Author: Platform Architect (AI)
Status: In Review
Reviewers: Alem Bašić (CEO)
Document History
| Version | Date | Author | Changes |
|---|---|---|---|
| 0.1 | 2026-02-23 | Platform Architect (AI) | Initial draft from infrastructure audit |
1. Overview
Drop's current production infrastructure (AWS App Runner + RDS) was provisioned manually via AWS Console. IaC tooling exists in the repository (infrastructure/ directory) as a cloud audit and WAF rules reference, but Terraform is not yet wired into the CI/CD pipeline. This document describes the target IaC state and the existing configuration.
IaC Tool: Terraform (target) — currently partially implemented
Tool Version: TBD — requires team decision on version pinning
Provider: AWS (hashicorp/aws)
Provider Version: ~> 5.0
Rationale for tool choice: Terraform chosen for its mature AWS provider, declarative HCL syntax, and cloud-agnostic design (future multi-cloud flexibility). Team has AWS CLI familiarity.
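A minimal sketch of how the provider constraint listed above could be declared. The `required_version` setting is deliberately left out because the Terraform CLI pin is still TBD; everything else comes from the values in this section.

```hcl
terraform {
  # required_version is intentionally omitted: the CLI version pin is still TBD.
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # any 5.x release, never 6.0
    }
  }
}
```

Committing the generated `.terraform.lock.hcl` alongside this block would additionally fix the exact provider build used by every contributor and by CI.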
Core Principles:
- All infrastructure changes should go through code (no manual console changes in staging/prod once IaC is wired)
- IaC reviewed like application code (PR, review, merge)
- State is the single source of truth
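- Secrets never stored in Terraform state (reference AWS Secrets Manager secrets by ARN; avoid the aws_secretsmanager_secret_version data source, which writes the secret value into state)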
2. Repository Structure
infrastructure/
├── cloud-audit.md # Existing AWS resource inventory
├── waf-rules.md # WAF configuration reference
├── terraform/ # Target IaC (to be implemented)
│ ├── modules/
│ │ ├── app-runner/ # App Runner service + ECR
│ │ ├── rds/ # RDS PostgreSQL instance
│ │ └── secrets/ # AWS Secrets Manager resources
│ ├── environments/
│ │ ├── production/
│ │ │ ├── main.tf
│ │ │ ├── variables.tf
│ │ │ └── terraform.tfvars
│ │ └── staging/ # Fly.io managed separately (not Terraform)
│ └── shared/
│ └── ecr.tf # ECR repository (shared across envs)
└── scripts/
└── bootstrap.sh # Initialize S3 state backend
2.1 Module Organization
| Module | Purpose | Key Inputs | Key Outputs |
|---|---|---|---|
| `modules/app-runner` | App Runner service, IAM roles, ECR image config | `service_name`, `ecr_image_uri`, `env_vars` | `service_arn`, `service_url` |
| `modules/rds` | RDS PostgreSQL instance, parameter group, subnet group | `instance_class`, `db_name`, `vpc_id` | `db_endpoint`, `db_secret_arn` |
| `modules/secrets` | AWS Secrets Manager secrets | `secret_name`, `secret_value` | `secret_arn` |
| `modules/ecr` | ECR repository with lifecycle policy | `repository_name` | `repository_url` |
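As an illustration of how the module inputs and outputs in the table fit together, the production root module might wire them up roughly as follows. This is a sketch only: the relative `source` paths assume the file lives at `environments/production/main.tf`, and the database name, the `NODE_ENV` variable, and the environment-level `vpc_id` variable are hypothetical placeholders.

```hcl
# environments/production/main.tf (illustrative wiring only)

variable "ecr_image_uri" {
  description = "Full ECR image URI including tag"
  type        = string
}

variable "vpc_id" {
  description = "VPC that hosts the RDS subnet group (hypothetical variable)"
  type        = string
}

module "rds" {
  source         = "../../modules/rds"
  instance_class = "db.t4g.micro"
  db_name        = "drop" # hypothetical database name
  vpc_id         = var.vpc_id
}

module "app_runner" {
  source        = "../../modules/app-runner"
  service_name  = "drop-production-web"
  ecr_image_uri = var.ecr_image_uri

  env_vars = {
    NODE_ENV = "production" # hypothetical runtime variable
  }
}

# Re-export the module output so it is visible after apply
output "service_url" {
  value = module.app_runner.service_url
}
```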
2.2 Environment Separation
- Production: AWS (`eu-west-1`), managed by Terraform
- Staging: Fly.io, managed by Fly CLI (`fly deploy`), NOT Terraform
- Each environment independently deployable
- No cross-environment Terraform dependencies
2.3 Shared Modules
| Module | Source | Used By |
|---|---|---|
| `modules/app-runner` | `./modules/app-runner` | Production |
| `modules/rds` | `./modules/rds` | Production |
| `modules/ecr` | `./modules/ecr` | Production (shared ECR) |
3. State Management
3.1 Remote State Backend
Backend: S3 + DynamoDB (planned — not yet configured)
| Environment | State Location | Access |
|---|---|---|
| Production | `s3://drop-terraform-state-324480209768/production/terraform.tfstate` | Alem Bašić + CI deploy role |
Bootstrap (first-time setup):
# Create S3 bucket for state storage
aws s3api create-bucket \
--bucket drop-terraform-state-324480209768 \
--region eu-west-1 \
--create-bucket-configuration LocationConstraint=eu-west-1
# Enable versioning
aws s3api put-bucket-versioning \
--bucket drop-terraform-state-324480209768 \
--versioning-configuration Status=Enabled
# Create DynamoDB lock table
aws dynamodb create-table \
--table-name drop-terraform-locks \
--attribute-definitions AttributeName=LockID,AttributeType=S \
--key-schema AttributeName=LockID,KeyType=HASH \
--billing-mode PAY_PER_REQUEST \
--region eu-west-1
S3 backend configuration:
terraform {
backend "s3" {
bucket = "drop-terraform-state-324480209768"
key = "production/terraform.tfstate"
region = "eu-west-1"
dynamodb_table = "drop-terraform-locks"
encrypt = true
}
}
3.2 State Locking
Locking Mechanism: DynamoDB table drop-terraform-locks
Lock timeout: 15 minutes, set explicitly with -lock-timeout=15m (Terraform's default is 0s, which fails immediately if the lock is held)
Force unlock: Only Alem Bašić after verifying no active apply
3.3 State File Organization
Splitting strategy: Single state file per environment (simple — low resource count)
| State File | Contains |
|---|---|
| `production/terraform.tfstate` | App Runner service, ECR, RDS, IAM roles, Secrets Manager |
4. Module Design
4.1 Naming Conventions
Resource naming pattern: drop-{environment}-{component}
| Resource | Example |
|---|---|
| App Runner Service | drop-production-web |
| ECR Repository | drop-web |
| RDS Instance | drop-db (existing) |
| Secrets Manager Secret | drop/production/jwt-secret |
| IAM Role (App Runner) | drop-production-app-runner-role |
| IAM Role (Deploy) | drop-production-github-deploy-role |
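One way to keep names consistent with the `drop-{environment}-{component}` pattern is to derive them from a single prefix in `locals`. The sketch below is an assumption about how that could look, not existing code; the local names are hypothetical.

```hcl
variable "environment" {
  description = "Deployment environment (staging/production)"
  type        = string
}

locals {
  # Shared prefix, e.g. "drop-production"
  name_prefix = "drop-${var.environment}"

  app_runner_service_name = "${local.name_prefix}-web"
  app_runner_role_name    = "${local.name_prefix}-app-runner-role"
  deploy_role_name        = "${local.name_prefix}-github-deploy-role"

  # Secrets Manager uses a path-style name rather than the hyphenated prefix
  jwt_secret_name = "drop/${var.environment}/jwt-secret"
}
```

With `environment = "production"` these evaluate to the example names in the table. The ECR repository (`drop-web`) and the existing RDS instance (`drop-db`) sit outside the pattern and would keep literal names.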
Current production resources (manually provisioned):
- App Runner Service: `drop-web` (ARN: `arn:aws:apprunner:eu-west-1:324480209768:service/drop-web/8e45b0d335304487a1880f4e32d6aeec`)
- RDS Instance: `drop-db` (endpoint: `drop-db.czu2qe4quy4v.eu-west-1.rds.amazonaws.com`)
- ECR Repository: `drop-web` (`324480209768.dkr.ecr.eu-west-1.amazonaws.com/drop-web`)
4.2 Input / Output Variables
App Runner module example:
variable "environment" {
description = "Deployment environment (staging/production)"
type = string
validation {
condition = contains(["staging", "production"], var.environment)
error_message = "Environment must be staging or production."
}
}
variable "ecr_image_uri" {
description = "Full ECR image URI including tag"
type = string
}
output "service_url" {
description = "App Runner service URL"
value = aws_apprunner_service.main.service_url
}
output "service_arn" {
description = "App Runner service ARN"
value = aws_apprunner_service.main.arn
}
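Secrets (never in state):
# Look up secret metadata only. Unlike aws_secretsmanager_secret_version,
# the aws_secretsmanager_secret data source does not read the secret value,
# so the value is never written to Terraform state.
data "aws_secretsmanager_secret" "jwt" {
  name = "drop/production/jwt-secret"
}
# Assumption: the secret name below is illustrative; use the real database URL secret.
data "aws_secretsmanager_secret" "db_url" {
  name = "drop/production/database-url"
}
# Pass ARNs (not values) to App Runner; other required arguments omitted for brevity
resource "aws_apprunner_service" "main" {
  source_configuration {
    image_repository {
      image_configuration {
        runtime_environment_secrets = {
          JWT_SECRET   = data.aws_secretsmanager_secret.jwt.arn
          DATABASE_URL = data.aws_secretsmanager_secret.db_url.arn
        }
      }
    }
  }
}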
4.3 Versioning Strategy
Module versioning: Git tags on IaC repository (format: infra/v1.0.0)
Pin strategy: Reference by git tag in module source
Upgrade policy: Review terraform plan output before applying any module version change
Changelog: Every infra change requires an entry in infrastructure/CHANGELOG.md
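Pinning by git tag could look like the sketch below. The repository URL and the `infrastructure/terraform/` path inside it are assumptions based on the repository name used elsewhere in this document; only the `infra/v1.0.0` tag format comes from the strategy above.

```hcl
variable "ecr_image_uri" {
  description = "Full ECR image URI including tag"
  type        = string
}

module "app_runner" {
  # Assumption: the modules live in the ALAI-org/drop repository under
  # infrastructure/terraform/; adjust the URL and path to the real location.
  # "//" separates the repository from the subdirectory, "?ref=" selects the tag.
  source = "git::https://github.com/ALAI-org/drop.git//infrastructure/terraform/modules/app-runner?ref=infra/v1.0.0"

  service_name  = "drop-production-web"
  ecr_image_uri = var.ecr_image_uri
  env_vars      = {}
}
```

Upgrading a module then becomes a one-line change to `ref`, and the resulting `terraform plan` output shows exactly what the new version would alter.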
5. Workflow
5.1 Standard Change Process
flowchart LR
BRANCH[Create branch\ninfra/description] --> CODE[Write/modify Terraform]
CODE --> VALIDATE[terraform validate\n+ terraform fmt]
VALIDATE --> PLAN[terraform plan\nattach output to PR]
PLAN --> PR[Open PR]
PR --> REVIEW[Review by Alem]
REVIEW --> APPROVE[Approval]
APPROVE --> APPLY[terraform apply\nmanual trigger]
APPLY --> VERIFY[Verify via AWS console\n+ health check]
Steps:
- Create feature branch: `infra/description`
- Make changes, run `terraform validate && terraform fmt`
- Run `terraform plan` and paste the output into the PR description
- Open PR; Alem reviews
- Merge, then run `terraform apply` manually (automated CD for IaC pending)
- Verify via App Runner console + `curl https://.../api/health`
5.2 PR-Based Infrastructure Changes
PR Requirements:
- Title: `[IaC] description of change`
- Must include `terraform plan` output
- Must reference the related ticket or justification
- Must pass `terraform validate` and `terraform fmt -check`
5.3 Automated Drift Detection
Schedule: Manual before each production deployment (automated drift detection pending)
Action on drift:
- Investigate cause (manual console change or provider drift)
- Run `terraform import` to bring the resource under management, or apply IaC to reconcile
- Document decision in `infrastructure/cloud-audit.md`
6. Security
6.1 Least Privilege for IaC Service Account
| Environment | Service Account | Permissions |
|---|---|---|
| Production | GitHub Actions OIDC role | apprunner:*, ecr:*, secretsmanager:GetSecretValue scoped to Drop resources |
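# OIDC trust policy for GitHub Actions
data "aws_iam_policy_document" "github_oidc_trust" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    principals {
      type        = "Federated"
      identifiers = ["arn:aws:iam::324480209768:oidc-provider/token.actions.githubusercontent.com"]
    }
    # Accept only tokens issued for STS (the default audience of
    # aws-actions/configure-aws-credentials, assumed to be the action in use)
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }
    # Any branch, tag, or environment of the Drop repository may assume the role
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:ALAI-org/drop:*"]
    }
  }
}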
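The permissions column in the table above could translate into a policy document along these lines. It is a sketch under stated assumptions: the resource ARN patterns (`drop-*` services, the `drop-web` repository, secrets under `drop/production/`) are inferred from the naming conventions in this document, and the data source name is hypothetical. Some App Runner actions, such as listing services, do not support resource-level scoping and would need a separate statement.

```hcl
data "aws_iam_policy_document" "github_deploy" {
  statement {
    sid       = "AppRunnerDropServices"
    actions   = ["apprunner:*"]
    resources = ["arn:aws:apprunner:eu-west-1:324480209768:service/drop-*/*"]
  }

  statement {
    sid       = "EcrDropRepository"
    actions   = ["ecr:*"]
    resources = ["arn:aws:ecr:eu-west-1:324480209768:repository/drop-web"]
  }

  # ecr:GetAuthorizationToken cannot be scoped to a repository
  statement {
    sid       = "EcrLogin"
    actions   = ["ecr:GetAuthorizationToken"]
    resources = ["*"]
  }

  statement {
    sid       = "ReadDropSecrets"
    actions   = ["secretsmanager:GetSecretValue"]
    resources = ["arn:aws:secretsmanager:eu-west-1:324480209768:secret:drop/production/*"]
  }
}
```

A role that also runs `terraform plan` and `terraform apply` would additionally need access to the state bucket and lock table, which this sketch does not cover.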
6.2 Secret Injection (Not in State)
Rule: Never pass passwords, API keys, or secrets as Terraform variable values.
# CORRECT — use AWS Secrets Manager, pass ARN to App Runner
resource "aws_apprunner_service" "drop_web" {
source_configuration {
image_repository {
image_configuration {
runtime_environment_secrets = {
JWT_SECRET = aws_secretsmanager_secret.jwt.arn
DATABASE_URL = aws_secretsmanager_secret.db_url.arn
}
}
}
}
}
6.3 Policy as Code
Tool: tflint + Checkov (planned for CI integration)
| Policy | Enforcement |
|---|---|
| RDS encryption at rest | Block |
| RDS not publicly accessible | Block |
| App Runner minimum 1 instance in production | Warn |
| All resources must have `Project`, `Environment`, `ManagedBy` tags | Warn |
| Secrets Manager secrets must not be in Terraform variable values | Block |
7. Tagging Strategy
| Tag | Value | Purpose |
|---|---|---|
| `Project` | `drop` | Cost attribution |
| `Environment` | `production` / `staging` | Environment filter |
| `ManagedBy` | `terraform` / `manual` | Identifies IaC vs console-managed |
| `Team` | `alai` | Ownership |

| Tag | Value | Purpose |
|---|---|---|
| `Service` | `web` / `db` / `ecr` | Service-level grouping |
| `Ticket` | `MC-XXXX` | Change tracking |
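The tags that are identical for every resource in an environment could be applied once through the AWS provider's `default_tags` block instead of being repeated per resource. A minimal sketch for production, assuming a single provider configuration per environment:

```hcl
provider "aws" {
  region = "eu-west-1"

  # Applied to every taggable resource created through this provider
  default_tags {
    tags = {
      Project     = "drop"
      Environment = "production"
      ManagedBy   = "terraform"
      Team        = "alai"
    }
  }
}
```

Resource-specific tags such as `Service` or `Ticket` would still be set in each resource's own `tags` argument; the provider merges them with the defaults.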
8. Cost Management
Budget alerts:
- Production: Alert at $150/month (AWS Budgets — TBD setup)
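Once AWS Budgets is set up, the $150/month alert could be expressed as a Terraform resource. This is a sketch: the budget name and the `budget_alert_email` variable are hypothetical, and the choice to notify at 100% of actual spend is an assumption.

```hcl
variable "budget_alert_email" {
  description = "Recipient of budget notifications (hypothetical variable)"
  type        = string
}

resource "aws_budgets_budget" "production_monthly" {
  name         = "drop-production-monthly" # hypothetical name
  budget_type  = "COST"
  limit_amount = "150"
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  # Notify when actual spend exceeds 100% of the limit
  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 100
    threshold_type             = "PERCENTAGE"
    notification_type          = "ACTUAL"
    subscriber_email_addresses = [var.budget_alert_email]
  }
}
```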
Cost optimization built into IaC:
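- App Runner: idle provisioned instances are billed for memory only; vCPU is billed only while an instance is processing requests (the service keeps at least one provisioned instance and does not scale to zero)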
- RDS db.t4g.micro: ARM Graviton (estimated 20% cheaper than the x86 equivalent; to be confirmed against current eu-west-1 pricing)
- ECR lifecycle policy: Delete untagged images after 7 days, keep last 10 tagged images
resource "aws_ecr_lifecycle_policy" "drop_web" {
  repository = aws_ecr_repository.drop_web.name
  policy = jsonencode({
    rules = [
      {
        rulePriority = 1
        description  = "Keep last 10 tagged images"
        # tagStatus "tagged" requires tagPrefixList or tagPatternList; "*" matches every tag
        selection = { tagStatus = "tagged", tagPatternList = ["*"], countType = "imageCountMoreThan", countNumber = 10 }
        action    = { type = "expire" }
      },
      {
        rulePriority = 2
        description  = "Remove untagged images after 7 days"
        selection    = { tagStatus = "untagged", countType = "sinceImagePushed", countUnit = "days", countNumber = 7 }
        action       = { type = "expire" }
      }
    ]
  })
}
9. Disaster Recovery for IaC State
State backup: S3 versioning enabled on drop-terraform-state-324480209768 bucket — all state versions preserved.
Recovery procedure:
- Restore from S3 version history: `aws s3api list-object-versions --bucket drop-terraform-state-324480209768`
- Download specific version: `aws s3api get-object --version-id <version-id> ...`
- Run `terraform plan` and verify no unexpected changes before apply
Existing manually-provisioned resources: If state is lost, import manually:
terraform import aws_apprunner_service.drop_web \
arn:aws:apprunner:eu-west-1:324480209768:service/drop-web/8e45b0d335304487a1880f4e32d6aeec
terraform import aws_db_instance.drop_db drop-db
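The same imports can also be written declaratively so that they go through the PR workflow like any other change. The sketch below assumes Terraform 1.5 or later (the CLI version pin is still TBD) and reuses the resource addresses and IDs from the commands above.

```hcl
# Declarative equivalent of the terraform import commands (Terraform >= 1.5)
import {
  to = aws_apprunner_service.drop_web
  id = "arn:aws:apprunner:eu-west-1:324480209768:service/drop-web/8e45b0d335304487a1880f4e32d6aeec"
}

import {
  to = aws_db_instance.drop_db
  id = "drop-db"
}
```

Each `import` block needs a matching `resource` block. If none has been written yet, `terraform plan -generate-config-out=generated.tf` can draft one for review before the first apply.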
Prevention:
- S3 versioning enabled on state bucket
- MFA delete required on state bucket (planned)
- State bucket access logged to CloudTrail
Related Documents
Approval
| Role | Name | Date | Signature |
|---|---|---|---|
| Author | Platform Architect (AI) | 2026-02-23 | |
| Reviewer | | | |
| Approver | Alem Bašić | | |