Infrastructure as Code

Project: Drop
Version: 0.1.0
Date: 2026-02-23
Author: Platform Architect (AI)
Status: Draft | In Review | Approved
Reviewers: Alem Bašić (CEO)

Document History

Version | Date | Author | Changes
0.1 | 2026-02-23 | Platform Architect (AI) | Initial draft from infrastructure audit

1. Overview

Drop's current production infrastructure (AWS App Runner + RDS) was provisioned manually via AWS Console. IaC tooling exists in the repository (infrastructure/ directory) as a cloud audit and WAF rules reference, but Terraform is not yet wired into the CI/CD pipeline. This document describes the target IaC state and the existing configuration.

IaC Tool: Terraform (target), currently partially implemented
Tool Version: TBD (requires team decision on version pinning)
Provider: AWS (hashicorp/aws)
Provider Version: ~> 5.0

Rationale for tool choice:

Terraform chosen for its mature AWS provider, declarative HCL syntax, and cloud-agnostic design (future multi-cloud flexibility). Team has AWS CLI familiarity.
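
As a minimal sketch, the provider pin listed above could be declared as follows. The Terraform CLI version is still TBD in this document, so required_version is deliberately left out; the file name versions.tf is an assumption.

```hcl
# versions.tf (hypothetical file name)
terraform {
  # required_version is intentionally omitted: the tool version is still TBD.
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # allows any 5.x release, but not 6.0
    }
  }
}
```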

Core Principles:

  • All infrastructure changes should go through code (no manual console changes in staging/prod once IaC is wired)
  • IaC reviewed like application code (PR, review, merge)
  • State is the single source of truth
  • Secrets are never stored in Terraform state (reference AWS Secrets Manager secrets by ARN via data sources)

2. Repository Structure

infrastructure/
├── cloud-audit.md              # Existing AWS resource inventory
├── waf-rules.md                # WAF configuration reference
├── terraform/                  # Target IaC (to be implemented)
│   ├── modules/
│   │   ├── app-runner/         # App Runner service + ECR
│   │   ├── rds/                # RDS PostgreSQL instance
│   │   └── secrets/            # AWS Secrets Manager resources
│   ├── environments/
│   │   ├── production/
│   │   │   ├── main.tf
│   │   │   ├── variables.tf
│   │   │   └── terraform.tfvars
│   │   └── staging/            # Fly.io managed separately (not Terraform)
│   └── shared/
│       └── ecr.tf              # ECR repository (shared across envs)
├── shared/                     # Shared resources (DNS, accounts)
├── scripts/                    # Helper scripts
│   ├── bootstrap.sh            # Initialize S3 state backend
│   └── validate.sh             # Pre-apply validation
├── .terraform-version          # Pin tool version (tfenv)
├── .tflint.hcl                 # Linting config
└── README.md

2.1 Module Organization

Module | Purpose | Key Inputs | Key Outputs
modules/app-runner | App Runner service, IAM roles, ECR image config | service_name, ecr_image_uri, env_vars | service_arn, service_url
modules/rds | RDS PostgreSQL instance, parameter group, subnet group | instance_class, db_name, vpc_id | db_endpoint, db_secret_arn
modules/secrets | AWS Secrets Manager secrets | secret_name, secret_value | secret_arn
modules/ecr | ECR repository with lifecycle policy | repository_name | repository_url

2.2 Environment Separation

  • Production: AWS (eu-west-1) — managed by Terraform
  • Staging: Fly.io — managed by Fly CLI (fly deploy) — NOT Terraform
  • Each environment directory is independently deployable
  • Environments call the same modules with different variable values
  • No cross-environment Terraform dependencies
  • Production has stricter apply controls (see Section 6)
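
One way the "same modules, different variable values" rule could look in environments/production/main.tf is sketched below. The input names follow the module table in Section 2.1; the NODE_ENV entry and the environment-level variable are illustrative assumptions, not existing configuration.

```hcl
# environments/production/main.tf (sketch)
variable "ecr_image_uri" {
  description = "Full ECR image URI including tag"
  type        = string
}

module "app_runner" {
  # Relative path from environments/production/ to the shared module
  source = "../../modules/app-runner"

  service_name  = "drop-web"
  ecr_image_uri = var.ecr_image_uri

  # Non-secret settings only; hypothetical example value
  env_vars = {
    NODE_ENV = "production"
  }
}
```

Another environment would reuse the same module block and change only the values passed in.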

2.3 Shared Modules

Shared module registry: Local modules (no Terraform Registry for private modules yet)

Module | Source | Used By
modules/app-runner | ./modules/app-runner | All environments
modules/rds | ./modules/rds | Production
modules/ecr | ./modules/ecr | Production (shared ECR)

3. State Management

3.1 Remote State Backend

Backend: S3 + DynamoDB (planned, not yet configured)

Environment | State Location | Access
Production | s3://drop-terraform-state-324480209768/production/terraform.tfstate | Alem Bašić + CI deploy role

Bootstrap (first-time setup):

# Create S3 bucket for state storage
aws s3api create-bucket \
  --bucket drop-terraform-state-324480209768 \
  --region eu-west-1 \
  --create-bucket-configuration LocationConstraint=eu-west-1

# Enable versioning
aws s3api put-bucket-versioning \
  --bucket drop-terraform-state-324480209768 \
  --versioning-configuration Status=Enabled

# Create DynamoDB lock table
aws dynamodb create-table \
  --table-name drop-terraform-locks \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST \
  --region eu-west-1

S3 backend configuration:

terraform {
  backend "s3" {
    bucket         = "drop-terraform-state-324480209768"
    key            = "production/terraform.tfstate"
    region         = "eu-west-1"
    dynamodb_table = "drop-terraform-locks"
    encrypt        = true
  }
}

3.2 State Locking

Locking Mechanism: DynamoDB table drop-terraform-locks
Lock timeout: 15 minutes (set with -lock-timeout=15m; Terraform's own default is 0s, which fails immediately if the lock is held)
Force unlock: Only Alem Bašić, after verifying no active apply

Lock table:

  • Table: drop-terraform-locks
  • Key: LockID
  • Billing: On-demand

3.3 State File Organization

Splitting strategy: Single state file per environment (simple, low resource count)

State File | Contains
production/terraform.tfstate | App Runner service, ECR, RDS, IAM roles, Secrets Manager

4. Module Design

4.1 Naming Conventions

Resource naming pattern: drop-{environment}-{component}

Resource | Example
App Runner Service | drop-production-web
ECR Repository | drop-web
RDS Instance | drop-db (existing)
Secrets Manager Secret | drop/production/jwt-secret
IAM Role (App Runner) | drop-production-app-runner-role
IAM Role (Deploy) | drop-production-github-deploy-role
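
A small sketch of how the naming pattern could be derived once and reused rather than typed by hand. The variable and local names here are hypothetical.

```hcl
variable "environment" {
  description = "Deployment environment, e.g. production"
  type        = string
}

variable "component" {
  description = "Component name, e.g. web"
  type        = string
}

locals {
  # Implements the pattern drop-{environment}-{component}
  resource_name = "drop-${var.environment}-${var.component}"
}

output "resource_name" {
  value = local.resource_name
}
```

With environment = "production" and component = "web", this yields drop-production-web, matching the App Runner Service row in the table.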

Current production resources (manually provisioned):

  • App Runner Service: drop-web (ARN: arn:aws:apprunner:eu-west-1:324480209768:service/drop-web/8e45b0d335304487a1880f4e32d6aeec)
  • RDS Instance: drop-db (endpoint: drop-db.czu2qe4quy4v.eu-west-1.rds.amazonaws.com)
  • ECR Repository: drop-web (324480209768.dkr.ecr.eu-west-1.amazonaws.com/drop-web)

4.2 Input / Output Variables

App Runner module example:

variable "environment" {
  description = "Deployment environment (staging or production)"
  type        = string
  validation {
    condition     = contains(["staging", "production"], var.environment)
    error_message = "Environment must be staging or production."
  }
}

variable "ecr_image_uri" {
  description = "Full ECR image URI including tag"
  type        = string
}

output "service_url" {
  description = "App Runner service URL"
  value       = aws_apprunner_service.main.service_url
}

output "service_arn" {
  description = "App Runner service ARN"
  value       = aws_apprunner_service.main.arn
}

Secrets — never in state:

# Look up the secret's metadata only. Unlike aws_secretsmanager_secret_version,
# the aws_secretsmanager_secret data source never reads the secret value,
# so the value is not written to Terraform state.
data "aws_secretsmanager_secret" "jwt" {
  name = "drop/production/jwt-secret"
}

# Pass ARN (not value) to App Runner
resource "aws_apprunner_service" "main" {
  # service_name and the other required arguments are omitted for brevity
  source_configuration {
    image_repository {
      image_configuration {
        runtime_environment_secrets = {
          JWT_SECRET   = data.aws_secretsmanager_secret.jwt.arn
          DATABASE_URL = aws_secretsmanager_secret.db_url.arn # secret resource defined elsewhere
        }
      }
    }
  }
}

4.3 Versioning Strategy

Module versioning: Git tags on IaC repository (format: infra/v1.0.0)
Pin strategy: Reference by git tag in module source
Upgrade policy: Review terraform plan output before applying any module version change
Changelog: Every infra change requires an entry in infrastructure/CHANGELOG.md
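
As a hedged illustration of pinning by git tag, a module call might look like the sketch below. The repository URL is inferred from the ALAI-org/drop name used in Section 6.1 and the subdirectory from Section 2; both are assumptions, as is the tag infra/v1.0.0 actually existing.

```hcl
variable "ecr_image_uri" {
  description = "Full ECR image URI including tag"
  type        = string
}

module "app_runner" {
  # "//" separates the repository from the module subdirectory;
  # "?ref=" selects the git tag to pin to.
  source = "git::https://github.com/ALAI-org/drop.git//infrastructure/terraform/modules/app-runner?ref=infra/v1.0.0"

  service_name  = "drop-web"
  ecr_image_uri = var.ecr_image_uri
  env_vars      = {}
}
```

Changing the ref value is then the only edit needed to move to a new module version, and the resulting terraform plan output shows what that upgrade changes.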


5. Workflow

5.1 Standard Change Process

flowchart LR
    BRANCH[Create branch\ninfra/description] --> CODE[Write/modify Terraform]
    CODE --> VALIDATE[terraform validate\n+ terraform fmt]
    VALIDATE --> PLAN[terraform plan\nattach output to PR]
    PLAN --> PR[Open PR]
    PR --> REVIEW[Review by Alem]
    REVIEW --> APPROVE[Approval]
    APPROVE --> APPLY[terraform apply\nmanual trigger]
    APPLY --> VERIFY[Verify via AWS console\n+ health check]

Steps:

  1. Create feature branch: infra/<ticket>-description
  2. Make changes, run terraform validate && terraform fmt
  3. Run terraform plan and paste the output into the PR description
  4. Open PR; Alem reviews
  5. CI runs terraform plan automatically on PR open (once Terraform is wired into the pipeline)
  6. Merge → manual terraform apply (automated CD for IaC pending)
  7. Verify via App Runner console + curl https://.../api/health

5.2 PR-Based Infrastructure Changes

PR Requirements:

  • Title: [IaC] {environment}: description of change
  • Must include terraform plan output in PR description or CI artifact
  • Must include justification for the change
  • Must reference the related application ticket or justification
  • Must pass terraform validate and terraform fmt -check

5.3 Automated Drift Detection

Schedule: Manual before each production deployment (automated drift detection pending)

Action on drift:

  1. Investigate cause (manual console change or provider drift)
  2. Run terraform import to bring resource under management, or apply IaC to reconcile
  3. Document decision in infrastructure/cloud-audit.md

6. Security

6.1 Least Privilege for IaC Service Account

Environment | Service Account | Permissions
Production | GitHub Actions OIDC role | apprunner:*, ecr:*, secretsmanager:GetSecretValue scoped to Drop resources

# OIDC trust policy for GitHub Actions
data "aws_iam_policy_document" "github_oidc_trust" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    principals {
      type        = "Federated"
      identifiers = ["arn:aws:iam::324480209768:oidc-provider/token.actions.githubusercontent.com"]
    }
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:ALAI-org/drop:*"]
    }
  }
}
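
The trust policy above only says who may assume the role. A minimal sketch of the role itself and of permissions scoped to Drop resources could look like this, assuming it lives in the same configuration as github_oidc_trust. The role name follows the naming table in Section 4.1; the policy name, statement IDs, and the secret path wildcard are illustrative assumptions.

```hcl
data "aws_iam_policy_document" "github_deploy_permissions" {
  statement {
    sid       = "AppRunner"
    actions   = ["apprunner:*"]
    resources = ["arn:aws:apprunner:eu-west-1:324480209768:service/drop-web/*"]
  }

  statement {
    sid       = "EcrRepository"
    actions   = ["ecr:*"]
    resources = ["arn:aws:ecr:eu-west-1:324480209768:repository/drop-web"]
  }

  statement {
    # ecr:GetAuthorizationToken cannot be limited to a single repository
    sid       = "EcrLogin"
    actions   = ["ecr:GetAuthorizationToken"]
    resources = ["*"]
  }

  statement {
    sid       = "ReadSecrets"
    actions   = ["secretsmanager:GetSecretValue"]
    resources = ["arn:aws:secretsmanager:eu-west-1:324480209768:secret:drop/production/*"]
  }
}

resource "aws_iam_role" "github_deploy" {
  name               = "drop-production-github-deploy-role"
  assume_role_policy = data.aws_iam_policy_document.github_oidc_trust.json
}

resource "aws_iam_role_policy" "github_deploy" {
  name   = "drop-deploy" # hypothetical policy name
  role   = aws_iam_role.github_deploy.id
  policy = data.aws_iam_policy_document.github_deploy_permissions.json
}
```

Narrowing apprunner:* and ecr:* to the specific actions the deploy job calls would tighten this further; the wildcards are kept here only to mirror the table.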

6.2 Secret Injection (Not in State)

Rule: Never pass passwords, API keys, or secrets as Terraform variable values.

# CORRECT: use AWS Secrets Manager, pass ARN to App Runner
resource "aws_apprunner_service" "drop_web" {
  source_configuration {
    image_repository {
      image_configuration {
        runtime_environment_secrets = {
          JWT_SECRET   = aws_secretsmanager_secret.jwt.arn
          DATABASE_URL = aws_secretsmanager_secret.db_url.arn
        }
      }
    }
  }
}

6.3 Policy as Code

Tool: tflint + Checkov (planned for CI integration)

Policy | Enforcement
RDS encryption at rest | Block
RDS not publicly accessible | Block
App Runner minimum 1 instance in production | Warn
All resources must have Project, Environment, ManagedBy tags | Warn
Secrets Manager secrets must not be in Terraform variable values | Block

7. Tagging Strategy

Required tags on all AWS resources:

Tag | Value | Purpose
Project | drop | Cost attribution
Environment | production | Environment filter
ManagedBy | terraform / manual | Identifies IaC vs console-managed
Team | alai | Ownership

Optional tags:

Tag | Value | Purpose
Service | web / db / ecr | Service-level grouping
Ticket | MC-XXXX | Change tracking
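
One way to apply the required tags without repeating them on every resource is the AWS provider's default_tags block, sketched here for production. The optional Service and Ticket tags would still be set per resource.

```hcl
provider "aws" {
  region = "eu-west-1"

  # Applied automatically to every taggable resource created by this provider
  default_tags {
    tags = {
      Project     = "drop"
      Environment = "production"
      ManagedBy   = "terraform"
      Team        = "alai"
    }
  }
}
```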

8. Cost Management

Budget alerts:

  • Production: Alert at $150/month (AWS Budgets, setup TBD)
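
Since the AWS Budgets setup is still TBD, the following is only a sketch of what the production alert could look like in Terraform. The budget name and the recipient address are hypothetical.

```hcl
resource "aws_budgets_budget" "production_monthly" {
  name         = "drop-production-monthly" # hypothetical name
  budget_type  = "COST"
  limit_amount = "150"
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  # Notify once actual spend exceeds 100% of the monthly limit
  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 100
    threshold_type             = "PERCENTAGE"
    notification_type          = "ACTUAL"
    subscriber_email_addresses = ["ops@example.com"] # hypothetical recipient
  }
}
```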

Cost optimization built into IaC:

  • App Runner: Reduced cost when idle (idle provisioned instances are billed for memory only; vCPU is billed only while handling requests)
  • RDS db.t4g.micro: ARM Graviton (20% cheaper than x86 equivalent)
  • ECR lifecycle policy: Delete untagged images after 7 days, keep last 10 tagged images

resource "aws_ecr_lifecycle_policy" "drop_web" {
  repository = aws_ecr_repository.drop_web.name
  policy = jsonencode({
    rules = [
      {
        rulePriority = 1
        description  = "Keep last 10 tagged images"
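        # ECR requires tagPrefixList or tagPatternList when tagStatus is "tagged"; "*" matches every tag
        selection = { tagStatus = "tagged", tagPatternList = ["*"], countType = "imageCountMoreThan", countNumber = 10 }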
        action = { type = "expire" }
      },
      {
        rulePriority = 2
        description  = "Remove untagged images after 7 days"
        selection = { tagStatus = "untagged", countType = "sinceImagePushed", countUnit = "days", countNumber = 7 }
        action = { type = "expire" }
      }
    ]
  })
}

9. Disaster Recovery for IaC State

State backup: S3 versioning enabled on the drop-terraform-state-324480209768 bucket; all state versions are preserved.

Recovery procedure:

  1. Restore from S3 version history: aws s3api list-object-versions --bucket drop-terraform-state-324480209768
  2. Download specific version: aws s3api get-object --version-id <version-id> ...
  3. Run terraform plan and verify no unexpected changes before apply

Existing manually-provisioned resources: if state is lost, import manually:

    terraform import aws_apprunner_service.drop_web \
      arn:aws:apprunner:eu-west-1:324480209768:service/drop-web/8e45b0d335304487a1880f4e32d6aeec
    terraform import aws_db_instance.drop_db drop-db

Prevention:

  • S3 versioning enabled on state bucket
  • MFA delete on state bucket (planned)
  • State bucket access logged to CloudTrail
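
As an alternative to running terraform import by hand, the same imports could be declared in code so they go through PR review like any other change. This sketch assumes Terraform 1.5 or later (the tool version is still TBD) and that matching resource blocks named drop_web and drop_db exist in the configuration.

```hcl
# Declarative imports: applied on the next terraform apply

import {
  to = aws_apprunner_service.drop_web
  id = "arn:aws:apprunner:eu-west-1:324480209768:service/drop-web/8e45b0d335304487a1880f4e32d6aeec"
}

import {
  to = aws_db_instance.drop_db
  id = "drop-db"
}
```

Running terraform plan first shows each pending import alongside any attribute differences, so drift between the console-built resources and the code is visible before anything is applied.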


Approval

Role | Name | Date | Signature
Author | Platform Architect (AI) | 2026-02-23 |
Reviewer | | |
Approver | Alem Bašić | |