Data Flow Document


Project: Drop
Version: 1.0
Date: 2026-02-23
Author: Petter Graff, Senior Enterprise Architect
Status: Approved
Reviewers: Alem Bašić (CEO), John (AI Director)
Classification: Confidential

Document History

| Version | Date | Author | Changes |
| --- | --- | --- | --- |
| 0.1 | 2026-02-23 | Petter Graff | Initial draft from real architecture and schema docs |

1. Data Flow Overview

System: Drop Payment Platform (ALAI Holding AS)
Data Owner: Alem Bašić, CEO ([email protected])
DPO Contact: [email protected] (DPO designation TBD; required before production launch per GDPR Art. 37)

Overview: Drop processes personal, financial, and compliance data for Norwegian residents (18+) using a PSD2 pass-through model. Data enters from three primary surfaces: user registration (BankID OIDC), transaction initiation (web and mobile clients), and compliance webhooks (Sumsub KYC). Cached balances are additionally read from Open Banking AISP. Data is stored in 19 PostgreSQL tables; the national ID is kept only as a SHA-256 hash, and other sensitive fields such as IBAN are encrypted at rest. Data exits through the REST API (user-facing), PISP calls (payment rails), regulatory reporting (Finanstilsynet, Økokrim), and PII-masked error events sent to Sentry.

Critical Architectural Principle: Drop never holds customer funds. bank_accounts.balance is a cached AISP read from the user's actual bank — not a Drop balance. No wallet, no top-up.

High-Level Data Flow

flowchart LR
    subgraph Inputs["Data Sources / Ingestion"]
        BID["BankID OIDC\n(Identity + age)"]
        WEB["Web App\n(Next.js)"]
        MOB["Mobile App\n(Expo)"]
        SUM["Sumsub Webhooks\n(KYC results)"]
        OB["Open Banking AISP\n(Balance reads)"]
    end

    subgraph Processing["Drop API Processing"]
        AUTH["Auth Middleware\n(JWT verify, session check)"]
        VAL["Input Validation\n(validateName, sanitizeText,\nvalidateAmount, validatePhone)"]
        BIZ["Business Logic\n(Fee calc, FX, disclosure,\nKYC enforcement)"]
        AUDIT["Audit + Compliance\n(audit_log, AML rules, STR)"]
    end

    subgraph Storage["Drop Database (PostgreSQL 16)"]
        CORE["Core Domain\nusers, sessions,\ntransactions, bank_accounts,\nrecipients, merchants,\nnotifications, settings"]
        COMPLIANCE["Compliance Domain\naudit_log, aml_alerts,\nstr_reports, screening_results,\nconsents, data_access_requests,\ncomplaints"]
        SYSTEM["System\nexchange_rates, rate_limits,\ncards (future), spending_limits (future)"]
    end

    subgraph Outputs["Egress / Data Consumers"]
        API_OUT["REST API\n(Web + Mobile clients)"]
        PISP_OUT["Open Banking PISP\n(Payment initiation to user's bank)"]
        SEPA["SEPA/SWIFT\n(Remittance settlement)"]
        REG["Regulatory Reporting\nFinanstilsynet + Økokrim (STR)"]
        SENTRY["Sentry\n(Errors — PII masked)"]
    end

    BID --> AUTH --> VAL --> BIZ
    WEB & MOB --> AUTH
    SUM --> VAL
    OB --> CORE

    BIZ --> CORE & COMPLIANCE & SYSTEM
    BIZ --> AUDIT
    AUDIT --> COMPLIANCE

    CORE --> API_OUT
    CORE --> PISP_OUT --> SEPA
    COMPLIANCE --> REG
    BIZ --> SENTRY

2. Data Sources & Ingestion

| Source | Type | Protocol | Volume (est. MVP) | Format | PII? | Validation |
| --- | --- | --- | --- | --- | --- | --- |
| BankID OIDC ID token | Real-time (per login) | HTTPS OIDC callback | 100-1000 logins/day | JWT (signed) | YES: pid, name | JWKS signature + issuer + nonce + age >= 18 |
| Web app user actions | Real-time | HTTPS POST | 500-5000 req/day | JSON | YES: IBAN, amount | Schema validation + validateName/validateAmount |
| Mobile app user actions | Real-time | HTTPS POST (Bearer) | 200-2000 req/day | JSON | YES | Same as web |
| Sumsub webhooks | Event-driven | HTTPS POST (inbound) | 10-100 webhooks/day | JSON | YES: KYC result | HMAC-SHA256 signature verification |
| Open Banking AISP | On-demand + scheduled | HTTPS GET | 4 reads/account/day max | JSON (Berlin Group) | YES: IBAN, balance | Response schema validation + consent check |

Ingestion Error Handling

| Error Type | Action | Notification |
| --- | --- | --- |
| BankID signature invalid | Reject: return jwks_verification_failed (502) | Sentry HIGH alert |
| Schema validation failure | Reject: return 400 validation_error with field details | No alert (expected) |
| Sumsub HMAC mismatch | Reject: return 401 | Sentry HIGH alert (possible SUMSUB_SECRET_KEY rotation issue) |
| Open Banking consent expired | Reject AISP call; prompt user to re-link | In-app notification |
| PII field with unexpected format | Reject + log (masked) | Sentry MEDIUM alert |
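
The HMAC-SHA256 check applied to Sumsub webhooks can be sketched as follows. This is a minimal illustration rather than Drop's actual handler: the function name `verifyWebhookSignature`, the hex encoding of the signature, and the way the signature and raw body reach the function are all assumptions. On a `false` result the handler would return 401, as listed in the table.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper (not from the Drop codebase): returns true only when
// signatureHex equals HMAC-SHA256(secret, rawBody).
// rawBody must be the unparsed request body exactly as received.
function verifyWebhookSignature(
  rawBody: string,
  signatureHex: string,
  secret: string
): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  if (received.length !== expected.length) return false;
  // Constant-time comparison avoids leaking how many bytes matched.
  return timingSafeEqual(received, expected);
}
```

Verifying against the raw body matters because re-serializing parsed JSON can change byte order or whitespace and break an otherwise valid signature.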

3. Data Transformations

3.1 Ingestion Transformations (before storage)

| Step | Input | Transformation | Output | Notes |
| --- | --- | --- | --- | --- |
| 1. PID hashing | Raw pid from BankID (11 digits) | SHA-256 hash | users.national_id_hash | Raw pid NEVER stored |
| 2. Name splitting | name claim from BankID (e.g., "Ola Nordmann") | Split on first space | users.first_name, users.last_name | Applied in bankid.ts:findOrCreateUser |
| 3. Birthdate extraction | pid digits DDMMYY + individual number | Parse century + DOB | users.date_of_birth (ISO: YYYY-MM-DD) | parseBirthdateFromPid() in bankid.ts |
| 4. Input sanitization | Raw user text (name, description) | sanitizeText(): strip HTML, trim whitespace | Clean strings | Prevents XSS; applied in lib/middleware/validation.ts |
| 5. Balance conversion | AISP balance in decimal NOK ("45230.00") | Multiply by 100 → integer øre | bank_accounts.balance (integer, øre) | Avoids floating-point money arithmetic |
| 6. Fee calculation | Send amount (NOK) | amount * 0.005 → round 2 decimals | transactions.fee | Applied in transactions route before INSERT |
| 7. Exchange rate application | Send amount + rate from exchange_rates | amount * rate → round to whole units | transactions.receive_amount | |
| 8. JWT issuance | userId, email placeholder, role | Sign with jose HS256 + JWT_SECRET | drop_token cookie (web) / Bearer JWT (mobile) | 24h (web) / 7d (mobile) |
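
Steps 1, 2, 5, 6, and 7 above can be sketched as small pure functions. This is an illustrative sketch under stated assumptions, not the code in bankid.ts or the transactions route: all function names here (`hashPid`, `splitName`, `nokToOre`, `calculateFee`, `receiveAmount`) are hypothetical, and the handling of single-word names and malformed amounts is a guess at reasonable behavior.

```typescript
import { createHash } from "node:crypto";

// Step 1: one-way SHA-256 hash of the 11-digit pid; only the hex digest is kept.
function hashPid(pid: string): string {
  if (!/^\d{11}$/.test(pid)) throw new Error("pid must be 11 digits");
  return createHash("sha256").update(pid).digest("hex");
}

// Step 2: split on the first space; the remainder becomes the last name.
// Assumption: a single-word name yields an empty last name.
function splitName(name: string): { firstName: string; lastName: string } {
  const trimmed = name.trim();
  const i = trimmed.indexOf(" ");
  if (i === -1) return { firstName: trimmed, lastName: "" };
  return { firstName: trimmed.slice(0, i), lastName: trimmed.slice(i + 1).trim() };
}

// Step 5: decimal NOK string ("45230.00") to integer øre.
// Parsed digit-wise so no floating-point multiplication is involved.
function nokToOre(decimalNok: string): number {
  const m = /^(-?)(\d+)(?:\.(\d{1,2}))?$/.exec(decimalNok.trim());
  if (!m) throw new Error(`unexpected amount format: ${decimalNok}`);
  const fraction = ((m[3] ?? "") + "00").slice(0, 2); // "5" -> "50", "" -> "00"
  const ore = Number(m[2]) * 100 + Number(fraction);
  return m[1] === "-" ? -ore : ore;
}

// Step 6: fee = amount * 0.005, rounded to 2 decimals.
// amount * 0.005 * 100 equals amount * 0.5, and 0.5 is exact in binary floating point.
function calculateFee(amountNok: number): number {
  return Math.round(amountNok * 0.5) / 100;
}

// Step 7: amount * rate, rounded to whole units of the receive currency.
function receiveAmount(amountNok: number, rate: number): number {
  return Math.round(amountNok * rate);
}
```

For example, `nokToOre("45230.00")` gives `4523000` and `calculateFee(1000)` gives `5`.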

3.2 ETL Pipeline

Drop does not have a separate data warehouse or ETL pipeline at MVP scale. All analytics queries run directly against the PostgreSQL replica (when available). A data warehouse is planned for Phase 3 (10K+ users).


4. Data Storage

| Storage System | Technology | Purpose | Data Classification | Encryption at Rest |
| --- | --- | --- | --- | --- |
| Primary Database | PostgreSQL 16 (AWS RDS) | All transactional and compliance data (19 tables) | Restricted / Confidential | AES-256 (RDS storage encryption, AWS-managed key) |
| Development Database | SQLite 3 (/app/data/drop.db) | Local dev only; no real PII | Internal | OS-level (developer machine) |
| Secrets | AWS Secrets Manager | JWT_SECRET, BankID credentials, DB credentials, Sentry DSN | Critical | AES-256 (AWS KMS) |
| Backups | RDS automated backups + snapshots | Disaster recovery | Restricted | AES-256 (RDS) |
| Error Logs | Sentry | Error events (PII masked) | Internal | Sentry's encryption |
| App Logs | AWS CloudWatch | Request/response logs (PII excluded) | Internal | CloudWatch encryption |
| Container images | AWS ECR | Docker images (no data, no secrets) | Internal | ECR encryption at rest |

5. Data Access Patterns

5.1 Read Patterns

| Consumer | Data Accessed | Frequency | Access Method | Caching |
| --- | --- | --- | --- | --- |
| Web app (GET /api/auth/me) | User profile, bank accounts, total balance | Per page load | REST API | Balance: bank_accounts.balance_synced_at (6h staleness) |
| Web app (GET /v1/transactions) | Transaction list (paginated) | Per page load | REST API (paginated, limit 20) | No cache; real-time from DB |
| Mobile app (GET /v1/transactions) | Transaction list | Per page load | REST API | No cache |
| Mobile app (GET /v1/recipients) | Saved recipients | Per send-money flow | REST API | No cache |
| Web app (GET /v1/rates/:currency) | Exchange rate | Per amount-entry | REST API | Rate limited to 120/min; DB row read |
| Admin (GET /api/admin/users) | User list with KYC status | On demand | REST API (admin-only) | No cache |
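
The 6-hour staleness rule on the cached balance can be expressed as a simple comparison against bank_accounts.balance_synced_at. The sketch below is an assumption about how such a check might look; `isBalanceStale` and `SIX_HOURS_MS` are hypothetical names, and treating a never-synced account as stale is a design guess.

```typescript
const SIX_HOURS_MS = 6 * 60 * 60 * 1000;

// Hypothetical helper: decides whether the cached AISP balance should be refreshed.
// A null timestamp (never synced) is treated as stale.
function isBalanceStale(
  balanceSyncedAt: Date | null,
  now: Date,
  maxAgeMs: number = SIX_HOURS_MS
): boolean {
  if (balanceSyncedAt === null) return true;
  return now.getTime() - balanceSyncedAt.getTime() > maxAgeMs;
}
```

Passing `now` explicitly keeps the function deterministic, which makes the staleness window easy to test.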

5.2 Write Patterns

| Writer | Data Written | Frequency | Write Method | Consistency |
| --- | --- | --- | --- | --- |
| Auth module (BankID callback) | users, sessions, audit_log | Per login | Atomic DB transaction | Strong |
| Transactions route | transactions, bank_accounts (balance deduct), audit_log, notifications | Per payment | Atomic DB transaction | Strong |
| Sumsub webhook | users.kyc_status, screening_results, audit_log, notifications | Per KYC event | Atomic DB transaction | Strong (within Drop) |
| Rate limit middleware | rate_limits (count + reset_at) | Every rate-limited request | Direct DB write | Eventual (cleanup every 100 calls) |
| AISP balance sync | bank_accounts.balance, bank_accounts.balance_synced_at | Up to 4x/day per account | Direct DB write | Eventual (cached value) |
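
The rate limit write pattern (a count plus reset_at per key, with expired rows cleaned up every 100 calls) resembles a fixed-window counter. The sketch below illustrates that technique with an in-memory object standing in for the rate_limits table; the class name `FixedWindowLimiter` and its methods are hypothetical and do not describe Drop's middleware.

```typescript
interface RateLimitRow {
  count: number;
  resetAt: number; // epoch milliseconds at which the window ends
}

// Hypothetical in-memory stand-in for the rate_limits table.
class FixedWindowLimiter {
  private rows: Record<string, RateLimitRow> = {};
  private calls = 0;
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true when the request identified by key is allowed.
  allow(key: string, nowMs: number): boolean {
    this.calls += 1;
    // Opportunistic cleanup: purge expired rows on every 100th call.
    if (this.calls % 100 === 0) this.cleanup(nowMs);

    const row = this.rows[key];
    if (row === undefined || row.resetAt <= nowMs) {
      // No row yet, or the window has ended: start a new window.
      this.rows[key] = { count: 1, resetAt: nowMs + this.windowMs };
      return true;
    }
    row.count += 1;
    return row.count <= this.limit;
  }

  private cleanup(nowMs: number): void {
    Object.keys(this.rows).forEach((key) => {
      const row = this.rows[key];
      if (row !== undefined && row.resetAt <= nowMs) delete this.rows[key];
    });
  }
}
```

With `new FixedWindowLimiter(120, 60_000)`, a key would be allowed 120 requests per 60-second window, matching the 120/min figure quoted for the rates endpoint.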

6. Data Retention & Archival

| Data Category | Retention Period | Legal Basis | Action at Expiry | Automated? |
| --- | --- | --- | --- | --- |
| User account data | Duration of relationship + 5 years | Contract (Finansavtaleloven) | Soft delete → anonymize (nullify PII fields) | Planned (nightly job) |
| Transaction records | 5 years | Legal obligation (Hvitvaskingsloven § 22) | Archive to cold storage (planned) | Planned |
| AML alerts + STR reports | 5 years | Legal obligation (Hvitvaskingsloven § 22) | Retain; not deletable per AML law | No (legal retention) |
| Audit logs (audit_log) | 5 years | Legitimate interest (security + compliance) | Purge | Planned |
| Session tokens (sessions) | 7 days max | Technical necessity | Rows past expires_at pruned on login | Yes (session cleanup on login) |
| GDPR consents | Until consent withdrawn | Consent (GDPR Art. 6(1)(a)) | Delete within 30 days of withdrawal | Manual + planned job |
| KYC screening results | 5 years | Legal obligation | Archive | Planned |
| Notifications | 90 days | Legitimate interest (UX) | Delete | Planned |
| Rate limit counters | 60 seconds (TTL = reset_at) | Technical necessity | Auto-cleaned every 100 calls | Yes (middleware cleanup) |
| Error logs (Sentry) | 90 days | Legitimate interest | Auto-purged by Sentry | Yes (Sentry retention policy) |
| RDS backups | 30 days (prod), 7 days (staging) | Business continuity | Overwrite (rolling) | Yes (RDS automated) |

AML Override: GDPR Art. 17 (right to erasure) is overridden by Hvitvaskingsloven § 22 — transaction and KYC data must be retained 5 years regardless of user erasure request. Drop implements soft delete + anonymization (nullify email, first_name, last_name, phone) while retaining financial and compliance records.
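
The soft delete plus anonymization step can be pictured as a pure transformation on a user row: the four PII fields are nullified and everything else is left in place. This is a sketch under assumptions, not the planned nightly job: the `UserRow` shape, the `deleted_at` column, and the function name `anonymizeUser` are hypothetical, and retaining `national_id_hash` is inferred only from its absence in the list of nullified fields.

```typescript
// Hypothetical subset of the users table; deleted_at is an assumed soft-delete column.
interface UserRow {
  id: string;
  national_id_hash: string;
  email: string | null;
  first_name: string | null;
  last_name: string | null;
  phone: string | null;
  deleted_at: string | null;
}

// Returns an anonymized copy: email, first_name, last_name, and phone are nullified.
// The row id is kept so retained transactions and AML records still reference it.
function anonymizeUser(user: UserRow, now: Date): UserRow {
  return {
    ...user,
    email: null,
    first_name: null,
    last_name: null,
    phone: null,
    deleted_at: now.toISOString(),
  };
}
```

Because the row id survives, transaction and compliance records keep their foreign key for the 5-year retention period while no longer pointing at directly identifying fields.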


7. Data Quality Rules

7.1 Validation Rules

| Field | Rule | Error Action | Severity |
| --- | --- | --- | --- |
| users.national_id_hash | SHA-256 hex string (64 chars) | Reject login | CRITICAL |
| users.date_of_birth | ISO date, age >= 18 years | Reject login | CRITICAL |
| transactions.amount | 100 ≤ amount ≤ 50000 (NOK) | Reject payment | CRITICAL |
| transactions.idempotency_key | Unique (UNIQUE index) | Reject duplicate, return existing | CRITICAL |
| recipients.country | One of: RS, BA, PL, PK, TR, EU | Reject recipient creation | HIGH |
| bank_accounts.iban | Valid IBAN format (application-layer check) | Reject bank linking | HIGH |
| users.email | Placeholder format [email protected] | N/A (generated) | LOW |
| Input text fields (name, description) | Sanitized (no HTML, max lengths per field) | Reject with 422 | MEDIUM |
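
Several of the rules above reduce to small predicates. The sketch below shows one possible shape for four of them; the function names are hypothetical and are not the `validateAmount` or related helpers referenced elsewhere in this document. The age check compares against midnight UTC on the 18th birthday, which is an assumption about time zone handling.

```typescript
const MIN_AMOUNT_NOK = 100;
const MAX_AMOUNT_NOK = 50000;
const SUPPORTED_COUNTRIES = ["RS", "BA", "PL", "PK", "TR", "EU"];

// users.national_id_hash: 64 lowercase hex characters (assumes lowercase digests).
function isValidNationalIdHash(hash: string): boolean {
  return /^[0-9a-f]{64}$/.test(hash);
}

// users.date_of_birth: ISO YYYY-MM-DD and at least 18 years before `today`.
function isAdult(dateOfBirthIso: string, today: Date): boolean {
  const m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(dateOfBirthIso);
  if (!m) return false;
  const eighteenthBirthday = Date.UTC(Number(m[1]) + 18, Number(m[2]) - 1, Number(m[3]));
  return today.getTime() >= eighteenthBirthday;
}

// transactions.amount: 100 <= amount <= 50000 (NOK), rejecting NaN and Infinity.
function isAmountInRange(amountNok: number): boolean {
  return isFinite(amountNok) && amountNok >= MIN_AMOUNT_NOK && amountNok <= MAX_AMOUNT_NOK;
}

// recipients.country: must be one of the supported codes.
function isSupportedCountry(code: string): boolean {
  return SUPPORTED_COUNTRIES.indexOf(code) !== -1;
}
```

The transactions.idempotency_key rule is not shown because it is enforced by the UNIQUE index in the database rather than by an application-level predicate.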

7.2 Data Quality Metrics

| Metric | Target | Alert Threshold |
| --- | --- | --- |
| Null rate on users.national_id_hash | 0% | Any occurrence |
| Transaction idempotency violations prevented | 100% | Any duplicate slip-through |
| KYC webhook HMAC validation pass rate | 100% | < 100% → alert ops |
| AISP balance staleness > 24h | 0% | > 5% of accounts → alert |

8. PII Data Flow Mapping

8.1 PII Inventory

| PII Category | Fields | Storage Location | Encrypted? | Access Controls | Lawful Basis |
| --- | --- | --- | --- | --- | --- |
| Norwegian national ID | users.national_id_hash | PostgreSQL users table | SHA-256 one-way hash | Auth middleware + admin only | Contract + Legal obligation |
| Full name | users.first_name, users.last_name | PostgreSQL users table | At-rest (RDS encryption) | Auth middleware (own data) + admin | Contract |
| Date of birth | users.date_of_birth | PostgreSQL users table | At-rest | Auth middleware (own data) + admin | Contract + Legal obligation (age verification) |
| Phone number | users.phone | PostgreSQL users table | At-rest | Own data + admin | Contract |
| IBAN / bank account | bank_accounts.iban, bank_accounts.account_number, recipients.bank_account | PostgreSQL | At-rest | Own data only (user_id FK enforced) | Contract |
| Cached bank balance | bank_accounts.balance | PostgreSQL | At-rest | Own data only | Contract (AISP) |
| IP address | audit_log.ip_address, consents.ip_address, rate_limits.key | PostgreSQL | At-rest | Admin only | Legitimate interest (security) |
| KYC documents | Not stored by Drop | Sumsub servers | Sumsub handles | Sumsub dashboard only | Legal obligation |

8.2 PII Flow Diagram

flowchart TD
    USER([Norwegian Resident\nData Subject]) -->|"BankID OIDC\n(pid, name, birthdate)"| INGESTION[Drop Auth Module]
    INGESTION -->|"SHA-256(pid) = national_id_hash\nname → first/last_name\nbirthdate from pid"| DB[(PostgreSQL\nusers table\nPII encrypted at rest)]
    DB -->|"Own profile only\n(JWT-gated)"| API_OUT["REST API\n/api/auth/me"]
    DB -->|"national_id_hash only\n(never raw pid)"| AUDIT[audit_log]

    USER -->|"Document upload"| SUMSUB_SDK[Sumsub SDK Widget]
    SUMSUB_SDK -->|"Applicant data\n+ document images"| SUMSUB[(Sumsub Servers\nKYC documents stored)]
    SUMSUB -->|"kyc_status result\n(no raw doc data)"| DROP_WEBHOOK[Drop Webhook Handler]
    DROP_WEBHOOK -->|"kyc_status update\nscreening_result"| DB

    DB -->|"GDPR Art. 17 erasure\n(soft delete + anonymize)"| ANONYMIZE[Anonymization\nNullify: email, first_name,\nlast_name, phone\nRetain: transactions, AML (5yr)]
    ANONYMIZE --> DB

    DB -->|"PISP: amount + IBAN\n(user's own bank)"| PISP_OUT["Open Banking PISP\n(Neonomics → ASPSP)"]
    PISP_OUT -->|"Execute transfer"| BANK[(User's Bank\nmoney always here)]

    DB -.->|"STR reports\n(Hvitvaskingsloven)"| REG["Økokrim / EFE\n(Regulatory)"]

    style DB fill:#ffcccc
    style SUMSUB fill:#ffffcc
    style BANK fill:#ccffcc
    style REG fill:#ffcccc

9. Cross-Border Data Transfer

| Transfer | From | To | Data Category | Mechanism | DPA Signed? |
| --- | --- | --- | --- | --- | --- |
| KYC applicant data | Norway (Drop) | International (Sumsub) | Name, DOB, document images | Standard Contractual Clauses (SCCs) | TBD (required before production) |
| Error events | Norway (Drop) | USA (Sentry) | Error stack traces (PII masked) | Standard Contractual Clauses (SCCs) | Sentry DPA via ToS |
| BankID auth | Norway (user browser) | Norway (BankID Norge) | OIDC tokens, pid | Norwegian entity; no cross-border transfer | N/A |
| Neonomics AISP/PISP | Norway (Drop API) | Norway/EEA (Neonomics) | IBAN, balance, payment data | EEA entity: adequacy | DPA required in contract |

Third-party processors with data access:

| Processor | Service | Data Accessed | DPA Signed | Location |
| --- | --- | --- | --- | --- |
| Sumsub | KYC/AML verification | applicantId, external user ID, KYC result (documents stored by Sumsub, not Drop) | Required | International |
| Sentry | Error tracking | Error messages (PII must be masked before capture) | Via Sentry ToS DPA | USA (SCCs) |
| AWS | Cloud hosting | All Drop data (encrypted at rest) | AWS DPA | eu-north-1 (Stockholm), EEA |
| Neonomics | Open Banking aggregator | IBAN, balance, payment details | Required in commercial contract | Norway / EEA |

10. Data Lineage Tracking

Lineage tool: audit_log table (custom implementation)
Coverage: All user-triggered data mutations captured

Lineage Events Captured

{
  "id": "audit_<hex16>",
  "user_id": "usr_abc123",
  "action": "transaction.create | kyc.approved | session.create | ...",
  "resource_type": "transaction | user | session | ...",
  "resource_id": "tx_abc123",
  "ip_address": "192.168.1.1",
  "created_at": "2026-02-23T10:00:00.000Z"
}

Actions tracked in audit_log:

  • session.create, session.revoke, session.revoke_all
  • transaction.create, transaction.status_update
  • kyc.initiated, kyc.approved, kyc.rejected
  • qr_payment.create
  • user.delete_account, user.data_export, user.data_export_request
  • complaint.create
  • bankid.login, bankid.callback_error

11. Backup & Recovery for Data

| Storage | Backup Method | Frequency | Retention | RTO | RPO | Test Frequency |
| --- | --- | --- | --- | --- | --- | --- |
| PostgreSQL (RDS prod) | Continuous automated backups (PITR) + daily snapshots | Continuous / Daily | 30 days | 1 hour | 5 minutes | Monthly |
| PostgreSQL (RDS staging) | Automated backups | Daily | 7 days | 2 hours | 24 hours | Quarterly |
| SQLite (dev) | Git-ignored; no backup (dev-only) | N/A | N/A | Rebuild from seed | N/A | N/A |
| Secrets (Secrets Manager) | AWS-managed replication | Continuous | Indefinite (versioned) | < 5 min (create new) | N/A | N/A |

PostgreSQL Point-in-Time Recovery:

aws rds restore-db-instance-to-point-in-time \
  --source-db-instance-identifier drop-prod \
  --target-db-instance-identifier drop-prod-restored \
  --restore-time "2026-02-23T10:00:00Z"

Last backup test: TBD (required before production launch)
Recovery runbook: docs/dr-runbook.md


Approval

| Role | Name | Date | Signature |
| --- | --- | --- | --- |
| Author | Petter Graff | 2026-02-23 | |
| Data Owner | Alem Bašić (CEO) | | |
| DPO / Privacy | TBD (required before production) | | |
| Security | | | |
| Tech Lead | John (AI Director) | | |