Skip to main content

ADR-017: RLS Strategy Forward Decisions

ADR-017 — RLS Strategy: Forward Decisions

Status: Accepted — forward-decision supplement to ADR-017-RLS-MULTITENANCY.md Date: 2026-05-13 Author: Bruce Momjian (Database Architecture, CodeCraft) Architecture Review: Petter Graff (CodeCraft) Decision-maker: CEO Alem Bašić — Phase 2C flip requires explicit CEO sign-off per §5 MC Task: #100586 (Phase 0' ADR-017 RLS strategy forward decisions) Relationship to existing ADR-017: ADR-017-RLS-MULTITENANCY.md (signed 2026-05-11) records the three-phase migration decision and the Phase 2A PERMISSIVE DDL that is now live on disk (V16, V17). This document does NOT re-document what is already live. It defines the forward decisions that were deferred from Phase 0' because V17 was still being executed:

  • Phase 2C RESTRICTIVE flip gate criteria (executable checklist)
  • bilko_app role grant matrix (per-table, per-operation)
  • Service-account bypass policy (Flyway, admin jobs, monitoring)
  • Connection pool integration (OrgScopePlugin + HikariCP + SET LOCAL)
  • CountryPlugin interaction: does country filtering compose with org filtering?
  • Rollback procedure if RESTRICTIVE flip breaks production traffic

Cross-references:

  • ADR-017-RLS-MULTITENANCY.md (Phase 0' decision + Phase 2A DDL)
  • V16__country_constraint.sql (on disk — organizations.country NOT NULL + CHECK)
  • V17__rls_permissive.sql (on disk — PERMISSIVE policies + bilko_app role creation)
  • OrgScopeSessionVariable.kt (on disk — OrgScopePlugin + orgTransaction pattern)
  • ADR-015-FOUR-JURISDICTION-PLUGIN.md (TaxJurisdiction enum driving country_code)
  • ADR-023 §6 (single-DB supersession triggers — not fired as of 2026-05-13)
  • Plan v3 §4c (RLS timing rationale)

1. Current State (tool-verified 2026-05-13)

V16 and V17 are applied on disk. Phase 2A is complete.

Item State
organizations.country NOT NULL, CHECK constraint LIVE — V16 applied
PERMISSIVE RLS on 9 tables (organizations, invoices, invoice_items, expenses, transactions, bank_transactions, bank_accounts, accounts, contacts) LIVE — V17 applied
bilko_app role exists (created idempotently in V17 DO block) LIVE
FORCE ROW LEVEL SECURITY on 8 tables (all except organizations) LIVE
OrgScopePlugin + orgTransaction wrapper LIVE — OrgScopeSessionVariable.kt
Application-layer WHERE org_id clauses (ADR-005) STILL ACTIVE — dual-layer per Phase 2A design
Phase 2C RESTRICTIVE flip NOT YET — gated on this document + Securion audit + 30d soak

The plan v3 §2 stated "V1..V15 applied" — that was the state at 2026-05-11 plan authoring. Martin Kleppmann's panel review (MC #100586 genesis) correctly identified that V16 and V17 have since been applied. This ADR accepts that reality and governs only forward decisions.


2. bilko_app Role Grant Matrix

V17 creates bilko_app via a DO block if it does not exist, but does not grant it any table-level privileges. The Ktor application connects as the bilko database user (Cloud SQL IAM SA), not as bilko_app. The relationship between the two roles must be explicit so that RLS policies targeting TO bilko_app actually fire.

2.1 Decision: Connection User Grants bilko_app

The application runtime connects as bilko (or its Cloud SQL IAM SA equivalent). bilko_app is a marker role used by RLS policies. The connection user must be a member of bilko_app so that policies targeting TO bilko_app apply.

-- V22__bilko_app_grants.sql (Phase 2A cleanup — run once per environment)
-- ZAKON: Run after V17 is confirmed applied. This is idempotent via IF NOT EXISTS.

-- Grant bilko_app membership to the application runtime user.
-- Cloud SQL IAM SA user is 'bilko' (or the IAM SA email shortname).
-- REPLACE 'bilko' with the actual Cloud SQL user if it differs per environment.
GRANT bilko_app TO bilko;

-- Grant DML on all 9 tenant tables to bilko_app.
-- RLS policies will further restrict these grants per session variable.
-- SELECT is required on organizations for org-lookup at login time.
GRANT SELECT, INSERT, UPDATE, DELETE
    ON TABLE organizations         TO bilko_app;

GRANT SELECT, INSERT, UPDATE, DELETE
    ON TABLE invoices              TO bilko_app;

GRANT SELECT, INSERT, UPDATE, DELETE
    ON TABLE invoice_items         TO bilko_app;

GRANT SELECT, INSERT, UPDATE, DELETE
    ON TABLE expenses              TO bilko_app;

GRANT SELECT, INSERT, UPDATE, DELETE
    ON TABLE transactions          TO bilko_app;

GRANT SELECT, INSERT, UPDATE, DELETE
    ON TABLE bank_transactions     TO bilko_app;

GRANT SELECT, INSERT, UPDATE, DELETE
    ON TABLE bank_accounts         TO bilko_app;

GRANT SELECT, INSERT, UPDATE, DELETE
    ON TABLE accounts              TO bilko_app;

GRANT SELECT, INSERT, UPDATE, DELETE
    ON TABLE contacts              TO bilko_app;

-- Sequence grants: required for INSERT on tables with UUID default (gen_random_uuid()
-- requires no sequence; SERIAL tables require USAGE on the sequence).
-- Current schema uses gen_random_uuid() — no sequence grants needed.
-- If any table is changed to BIGSERIAL in future: GRANT USAGE, SELECT ON SEQUENCE <seq> TO bilko_app;

-- Schema USAGE: allows bilko_app to reference objects in the public schema.
GRANT USAGE ON SCHEMA public TO bilko_app;

2.2 Grant Matrix Summary

Table SELECT INSERT UPDATE DELETE Notes
organizations YES YES YES YES Org creation at onboarding; UPDATE for settings; RLS restricts to own row
invoices YES YES YES YES Core workflow; RLS restricts to org's rows
invoice_items YES YES YES YES Sub-rows joined via invoices; RLS via subquery on invoices
expenses YES YES YES YES Core workflow
transactions YES YES NO NO Transactions are immutable once created; UPDATE/DELETE blocked at grant level for defence in depth
bank_transactions YES YES NO NO Imported from bank feed; immutable after import
bank_accounts YES YES YES YES Account management (add/remove accounts)
accounts YES YES YES YES Chart of accounts; per-org customization
contacts YES YES YES YES Customer/supplier management

Immutability enforcement on transactions and bank_transactions: UPDATE and DELETE are revoked at the role grant level. This is defence-in-depth: the application layer also prohibits these operations, but revoking at the DB role level means an application bug cannot accidentally overwrite a ledger entry. Corrections are handled via compensating journal entries, not direct UPDATE.

2.3 What bilko_app Does NOT Get

  • No TRUNCATE on any table (prevents accidental mass deletion)
  • No CREATE TABLE, DROP TABLE, ALTER TABLE (schema changes are Flyway-only)
  • No access to logged_actions table directly — audit inserts go through a trigger function owned by bilko_admin that fires on DML to audited tables
  • No access to flyway_schema_history table
  • No SUPERUSER, CREATEDB, CREATEROLE privileges

3. Service-Account Bypass Policy

Several non-application processes must query the database without being bound by RLS policies. This section defines how each process bypasses RLS and the controls that prevent privilege creep.

3.1 Flyway (Schema Migrations)

Connection user: bilko_admin (Cloud SQL built-in superuser, or the cloudsqlsuperuser role holder provisioned at DB creation).

RLS bypass mechanism: PostgreSQL superusers bypass RLS by default. bilko_admin is a superuser and therefore never subject to RLS policies, regardless of app.current_org_id.

Control: bilko_admin credentials are available ONLY in:

  1. Cloud Build (CI/CD pipeline) — injected as a Secret Manager secret at migration step
  2. Vaultwarden break-glass vault — for emergency direct access only
  3. Local .env.local for developer environments — never committed

bilko_admin credentials are NOT stored in the Cloud Run service's environment. The application process cannot access bilko_admin at runtime. This ensures that even a compromised application binary cannot use Flyway-level privileges.

Verification: SHOW is_superuser; in a bilko_admin session returns on. SET ROLE bilko_app; SHOW is_superuser; returns off — RLS applies.

3.2 Background Jobs (Recurring Invoice Generator, Exchange Rate Fetcher)

Background jobs that process multiple organizations must set app.current_org_id per org iteration. The pattern is:

// apps/api/src/main/kotlin/no/alai/bilko/jobs/RecurringInvoiceJob.kt (example)

fun run() {
    val allOrgs = transaction {
        // Admin query: fetch all org IDs that have due recurring invoices.
        // This query runs WITHOUT app.current_org_id set — bilko_admin connection
        // (or a connection that has been explicitly granted BYPASSRLS).
        // Alternative: use a dedicated job-runner DB user with BYPASSRLS.
        exec("SET LOCAL app.current_org_id = ''")  // explicit empty = no org context
        Organizations.selectAll().map { it[Organizations.id] }
    }

    allOrgs.forEach { orgId ->
        orgTransaction(orgId.toString()) {
            // All queries inside this block are RLS-scoped to orgId.
            // generateDueInvoices() only sees this org's recurring invoices.
            generateDueInvoices(orgId)
        }
    }
}

Decision: Background jobs that require cross-org visibility for the enumeration phase (e.g., "find all orgs with due invoices") use a dedicated role bilko_job with BYPASSRLS privilege. The per-org processing phase uses orgTransaction() (RLS-scoped).

-- V22 (addendum): create bilko_job role for background job enumeration phase
CREATE ROLE bilko_job BYPASSRLS NOINHERIT NOCREATEDB NOCREATEROLE;
GRANT CONNECT ON DATABASE bilko TO bilko_job;
GRANT USAGE ON SCHEMA public TO bilko_job;
GRANT SELECT ON TABLE organizations TO bilko_job;  -- enumeration only
-- bilko_job does NOT get DML grants — enumeration is SELECT-only.
-- Actual DML happens via orgTransaction() under bilko_app RLS scope.

Rationale: Giving bilko_job BYPASSRLS for SELECT-only enumeration is a narrower privilege than making it a superuser. bilko_job has no INSERT/UPDATE/DELETE anywhere. If a job process is compromised, an attacker can enumerate org IDs but cannot modify data.

3.3 Monitoring Queries (Health Checks, Metrics)

Monitoring processes (Cloud SQL insights, Prometheus pg_stat exporter, health check endpoints) do not carry org context. They query system tables, not tenant tables.

Pattern: Monitoring uses a dedicated bilko_monitor role:

-- bilko_monitor: read-only system monitoring
CREATE ROLE bilko_monitor NOINHERIT NOCREATEDB NOCREATEROLE;
GRANT CONNECT ON DATABASE bilko TO bilko_monitor;
GRANT USAGE ON SCHEMA public TO bilko_monitor;
GRANT SELECT ON TABLE pg_stat_activity TO bilko_monitor;
-- bilko_monitor has NO grants on any tenant tables.
-- It cannot access organizations, invoices, etc. regardless of RLS.

Health check queries (SELECT 1) do not require any tenant table grants. The application health endpoint should use a dedicated bilko_health connection that does not set app.current_org_id and only executes SELECT 1 or SELECT NOW().

3.4 Backup Processes (pg_dump, Cloud SQL export)

Cloud SQL automated backups operate at the storage layer, not via SQL connections. They bypass all RLS, role grants, and session variables — backups capture the full database state unconditionally. This is correct and expected.

For application-level backup exports (e.g., per-org data export for GDPR Article 20 data portability), the export service uses orgTransaction() — RLS-scoped, identical to normal application queries. The export service role is bilko_app (not bilko_admin).

3.5 Role Hierarchy Summary

Role BYPASSRLS Grants Purpose
bilko_admin YES (superuser) All Flyway migrations only
bilko_job YES (explicit) SELECT on organizations Background job enumeration
bilko_app NO DML on 9 tenant tables Application runtime
bilko_monitor NO None on tenant tables Health/metrics monitoring
bilko_health NO None Health check SELECT 1

4. Connection Pool Integration (OrgScopePlugin + HikariCP)

4.1 Current Implementation (on disk)

OrgScopeSessionVariable.kt (tool-verified 2026-05-13) implements:

  • OrgScopePlugin — Ktor ApplicationPlugin that stores org ID in ThreadLocal per request
  • orgTransaction(organizationId, block) — wraps Exposed transaction {} with SET LOCAL app.current_org_id
  • setOrgContext(orgId) / clearOrgContext() — utility functions for explicit lifecycle management

Critical design invariant: SET LOCAL scopes the session variable to the current transaction boundary. When the transaction commits or rolls back, the variable is automatically unset. The connection returned to HikariCP does NOT carry the variable.

This means:

  1. Every orgTransaction {} call must set the variable — it cannot be set once at connection checkout and assumed to persist across multiple transactions.
  2. The HikariCP connection pool does not need to be modified or connection-tested for variable state — SET LOCAL ensures clean state on every connection return.
  3. Coroutine-based code must ensure orgTransaction is called on the same coroutine dispatcher as the database access (Dispatchers.IO for Exposed). The ThreadLocal in OrgScopePlugin is a request-lifecycle convenience read — the authoritative value is the one passed explicitly to orgTransaction.

4.2 Failure Modes and Guards

Failure: app.current_org_id not set (PERMISSIVE phase)

If orgTransaction() is not called (e.g., a new route uses plain transaction {} instead), current_setting('app.current_org_id', true) returns NULL or '' (empty string).

The V17 PERMISSIVE policies guard against this:

CASE WHEN current_setting('app.current_org_id', true) IS NULL
          OR current_setting('app.current_org_id', true) = ''
     THEN false   -- deny: returns 0 rows for bilko_app role
     ELSE <org_id_check>
END

In PERMISSIVE phase: the application-layer WHERE org_id clause is still active. If a developer forgets orgTransaction(), the RLS policy returns 0 rows (PERMISSIVE deny), but the application WHERE clause would also filter correctly — the user sees empty results rather than cross-tenant data. This is not a data leak but is a correctness bug (user sees 0 results instead of their own data). Detected by integration tests.

Failure: app.current_org_id not set (RESTRICTIVE phase, post-Phase 2C)

Same missing orgTransaction() call in RESTRICTIVE mode: the policy fails closed — 0 rows returned. This is the intended security-fail-closed behavior. The application sees an empty result set and surfaces a 404 or an empty list to the user. No cross-tenant data exposure. Detected by integration tests that assert non-zero results for authenticated requests.

Failure: malformed UUID in app.current_org_id

current_setting('app.current_org_id', true)::uuid will throw invalid input syntax for type uuid if the value is not a valid UUID. OrgScopePlugin (line 168–170 of OrgScopeSessionVariable.kt) validates UUID format before storing in ThreadLocal:

runCatching { UUID.fromString(orgId) }.onFailure {
    log.error("[OrgScope] BilkoPrincipal.organizationId is not a valid UUID: {}", orgId)
    return@onCall  // request continues without RLS context; app-layer WHERE fires
}

In Phase 2A (PERMISSIVE): malformed UUID means no RLS context → app-layer WHERE clause is the effective guard. In Phase 2C (RESTRICTIVE): same behavior. The policy evaluates NULL::uuid which errors, but the CASE WHEN NULL guard catches it first and returns false (deny).

Additional guard required for Phase 2C: Before flipping to RESTRICTIVE, the USING clause must be hardened for the UUID cast:

-- Phase 2C policy pattern (fail-closed UUID handling):
USING (
    CASE
        WHEN current_setting('app.current_org_id', true) IS NULL
          OR current_setting('app.current_org_id', true) = ''
        THEN false
        WHEN length(current_setting('app.current_org_id', true)) <> 36
        THEN false
        ELSE organization_id = current_setting('app.current_org_id', true)::uuid
    END
)

The length check prevents the UUID cast from throwing on clearly non-UUID values. A correct ::uuid cast validates the format fully; the length pre-check just avoids exceptions for obviously invalid input.

4.3 Thread Safety Invariant

OrgScopePlugin uses ThreadLocal<String?> (line 75 of OrgScopeSessionVariable.kt). Ktor coroutines on Dispatchers.IO may run on different threads for different suspensions within the same request. This means the ThreadLocal value may not be visible after a coroutine suspension that crosses a thread boundary.

Decision: ThreadLocal in OrgScopePlugin is used ONLY for service-layer convenience reads via currentOrgId(). The authoritative RLS enforcement uses the explicit orgId parameter passed to orgTransaction(organizationId, block) — this parameter is captured in the coroutine closure and is not thread-dependent.

Rule: Service-layer code must never call currentOrgId() and pass the result to orgTransaction(). The org ID must come from BilkoPrincipal.organizationId sourced from the JWT, propagated explicitly through the call chain.


5. CountryPlugin Interaction with RLS

5.1 Isolation Layers Are Independent

RLS enforces org-level isolation (which organization owns this row). CountryPlugin enforces jurisdiction-level behavior (which rules apply to this org).

These two concerns are orthogonal and compose without interference:

Request arrives → JWT validated → BilkoPrincipal extracted
  → org_id from principal → orgTransaction(org_id) sets app.current_org_id
  → RLS enforces: only this org's rows are visible
  → org.country from principal → CountryPlugin resolved via Ktor DI
  → CountryPlugin.calculateVat() / generateEInvoiceXml() etc. apply jurisdiction rules
  → All DB queries inside the transaction see only this org's rows (RLS)
  → All jurisdiction logic uses CountryPlugin for this org's country

RLS does NOT filter by country. A Croatian org (country = 'HR') and a Serbian org (country = 'RS') in the same database are isolated from each other by the app.current_org_id session variable — not by country. Country determines behavior (VAT rates, invoice format), not data visibility.

5.2 Cross-Country Filtering Does Not Layer on Top of RLS

There is no "country-aware RLS" layer. The RLS policy for invoices is:

USING (organization_id = current_setting('app.current_org_id', true)::uuid)

It checks org_id only. A request from an HR org will only ever see HR org rows because app.current_org_id contains the HR org's UUID — not because the policy checks country.

This is intentional. Adding AND country_code = 'HR' to the policy would be:

  1. Redundant (org-level isolation already prevents cross-org access)
  2. Dangerous (if country_code is NULL or migrated incorrectly, the policy would deny access to legitimate data)
  3. Incorrect for admin/Flyway queries that legitimately need cross-country access

Decision: RLS policies filter by organization_id only. Country filtering is the responsibility of the application layer (via CountryPlugin) and is not enforced in RLS.

5.3 chart_of_accounts: Cross-Org, Country-Scoped

The chart_of_accounts table (defined in ADR-017-RLS-MULTITENANCY.md §2.3) is a reference table shared across all orgs in the same jurisdiction. It is NOT org-scoped.

Decision: No RLS policy is applied to chart_of_accounts. It is a read-only reference table accessible to all bilko_app connections. Write access is restricted to Flyway migrations and admin tooling (bilko_admin).

-- chart_of_accounts grants (part of V22 or separate V23):
GRANT SELECT ON TABLE chart_of_accounts TO bilko_app;
-- No INSERT/UPDATE/DELETE for bilko_app — CoA entries are seeded by Flyway only.

This means an HR org could in theory query RS CoA entries (if they knew to ask). In practice, the application always calls CountryPlugin.getChartOfAccountsDefaults() which queries CoA filtered by jurisdiction = org.country — the application enforces the filter; RLS does not need to.


6. Phase 2C RESTRICTIVE Flip — Gate Criteria and Executable Checklist

Phase 2C converts PERMISSIVE RLS policies to RESTRICTIVE and retires the ADR-005 application-layer WHERE org_id clauses. This is the highest-risk migration step.

6.1 Gate Criteria (all must be true before Phase 2C begins)

  1. Securion audit complete — no critical findings on Phase 2A policies
  2. 30-day soak metric thresholds met — see §6.2
  3. Fail-closed test suite passing — see §6.3
  4. CEO explicit sign-off — separate from the ADR-017-RLS-MULTITENANCY sign given on 2026-05-11 (that sign covered Phase 2A only; Phase 2C requires a new sign)
  5. Rollback plan confirmed — dry-run on a clone of the stage database before production

6.2 30-Day Soak Metric Thresholds

These metrics are measured against the stage Cloud SQL instance from the day V17 applies until Phase 2C is authorized. All thresholds must be met for a continuous 30-day window.

Metric Threshold How to verify
Cross-tenant RLS bypass incidents 0 Securion rogue-role probe results in CI; psql -c "SET ROLE bilko_app; SET LOCAL app.current_org_id = '<wrong_uuid>'; SELECT COUNT(*) FROM invoices;" must return 0
app.current_org_id unset errors 0 per day Application logs: no [OrgScope] BilkoPrincipal.organizationId is not a valid UUID ERROR lines; no current_setting PG errors in Cloud SQL slow query log
FORCE RLS bypass attempts 0 Check pg_audit or Cloud SQL logs for SET ROLE / RESET ROLE commands outside of expected test sequences
RLS policy evaluation errors (p99) < 0.5ms added latency EXPLAIN ANALYZE on top-10 query patterns before/after V17; p99 query latency per Cloud SQL insights
Application integration test suite 100% passing CI green on every push to main; failing RLS tests = soak clock reset

6.3 Fail-Closed Test Scenarios (Securion Checklist)

These test scenarios MUST be executed and pass before Phase 2C is authorized. Securion (Parisa Tabriz) owns this checklist and provides a signed evidence file.

Core Isolation Tests

  • Rogue-role SELECT: SET ROLE bilko_app; SET LOCAL app.current_org_id = '<org_a_uuid>'; SELECT COUNT(*) FROM invoices WHERE organization_id = '<org_b_uuid>'; Expected: 0 rows (even though the WHERE clause explicitly asks for org B).

  • Rogue-role UPDATE: SET ROLE bilko_app; SET LOCAL app.current_org_id = '<org_a_uuid>'; UPDATE invoices SET status = 'CANCELLED' WHERE organization_id = '<org_b_uuid>'; Expected: 0 rows updated (not an error — UPDATE silently affects 0 rows, which is correct fail-closed behavior).

  • Rogue-role INSERT: SET ROLE bilko_app; SET LOCAL app.current_org_id = '<org_a_uuid>'; INSERT INTO invoices (id, organization_id, ...) VALUES (gen_random_uuid(), '<org_b_uuid>', ...); Expected: WITH CHECK violation / 0 rows inserted (RESTRICTIVE WITH CHECK clause must deny cross-org inserts).

  • Missing context SELECT: SET ROLE bilko_app; SELECT COUNT(*) FROM invoices; (no SET LOCAL app.current_org_id) Expected: 0 rows (null/empty context returns false in USING clause).

  • Empty string context SELECT: SET ROLE bilko_app; SET LOCAL app.current_org_id = ''; SELECT COUNT(*) FROM invoices; Expected: 0 rows.

  • Malformed UUID context: SET ROLE bilko_app; SET LOCAL app.current_org_id = 'not-a-uuid'; SELECT COUNT(*) FROM invoices; Expected: 0 rows (length check in USING clause returns false before UUID cast).

invoice_items Subquery Isolation

  • Cross-org invoice_items via subquery: Org A attempts to read invoice_items for an invoice owned by Org B. SET ROLE bilko_app; SET LOCAL app.current_org_id = '<org_a_uuid>'; SELECT COUNT(*) FROM invoice_items WHERE invoice_id = '<org_b_invoice_uuid>'; Expected: 0 rows (invoice_items USING clause subquery on invoices returns empty set for wrong org).

bank_transactions Subquery Isolation

  • Cross-org bank_transactions via subquery: Same pattern as invoice_items. Expected: 0 rows.

bilko_admin Bypass Verification

  • bilko_admin can see all rows: SET ROLE bilko_admin; SELECT COUNT(*) FROM invoices; Expected: returns actual total row count (superuser bypass confirmed — this is CORRECT behavior for admin).

  • bilko_admin FORCE RLS is NOT set on organizations: SELECT relrowsecurity, relforcerowsecurity FROM pg_class WHERE relname = 'organizations'; Expected: relrowsecurity = true, relforcerowsecurity = false (organizations is RLS-enabled but NOT FORCE — bilko_admin can see all orgs for onboarding/support).

Application Integration Tests (Proveo, not Securion)

  • PUT /api/v1/invoices/{id} with cross-tenant JWT: HTTP 404 (not 500, not 200)
  • GET /api/v1/invoices/{id}/pdf with cross-tenant JWT: HTTP 404
  • GET /api/v1/invoices/{id} with own-tenant JWT: HTTP 200 with correct data
  • Create invoice as HR org → invoice visible to HR org, invisible to RS org querying same ID

6.4 RESTRICTIVE Policy DDL

When all §6.1 gate criteria are met, the flip is executed as Flyway migration V23 (or the next available version at execution time).

-- V23__rls_restrictive.sql (Phase 2C — post-Securion audit + 30d soak + CEO sign)
-- ZAKON: CEO Phase 2C sign required before applying.
-- This migration retires ADR-005 application-layer WHERE org_id clauses.
-- After this migration, app code may remove redundant WHERE org_id = :principalOrgId clauses
-- from service methods. Do NOT remove them before this migration runs on production.
--
-- DOWN: revert to V17 PERMISSIVE state
--   DROP POLICY org_isolation ON invoices; ... (repeat for all tables)
--   CREATE POLICY org_isolation ON invoices AS PERMISSIVE ... (restore V17 text)

-- ─── invoices ────────────────────────────────────────────────────────────────────────────────
DROP POLICY org_isolation ON invoices;
CREATE POLICY org_isolation ON invoices
    AS RESTRICTIVE
    FOR ALL
    TO bilko_app
    USING (
        CASE
            WHEN current_setting('app.current_org_id', true) IS NULL
              OR current_setting('app.current_org_id', true) = ''
              OR length(current_setting('app.current_org_id', true)) <> 36
            THEN false
            ELSE organization_id = current_setting('app.current_org_id', true)::uuid
        END
    )
    WITH CHECK (
        CASE
            WHEN current_setting('app.current_org_id', true) IS NULL
              OR current_setting('app.current_org_id', true) = ''
              OR length(current_setting('app.current_org_id', true)) <> 36
            THEN false
            ELSE organization_id = current_setting('app.current_org_id', true)::uuid
        END
    );

-- ─── expenses ────────────────────────────────────────────────────────────────────────────────
DROP POLICY org_isolation ON expenses;
CREATE POLICY org_isolation ON expenses
    AS RESTRICTIVE FOR ALL TO bilko_app
    USING (
        CASE
            WHEN current_setting('app.current_org_id', true) IS NULL
              OR current_setting('app.current_org_id', true) = ''
              OR length(current_setting('app.current_org_id', true)) <> 36
            THEN false
            ELSE organization_id = current_setting('app.current_org_id', true)::uuid
        END
    )
    WITH CHECK (
        CASE
            WHEN current_setting('app.current_org_id', true) IS NULL
              OR current_setting('app.current_org_id', true) = ''
              OR length(current_setting('app.current_org_id', true)) <> 36
            THEN false
            ELSE organization_id = current_setting('app.current_org_id', true)::uuid
        END
    );

-- ─── transactions ─────────────────────────────────────────────────────────────────────────────
DROP POLICY org_isolation ON transactions;
CREATE POLICY org_isolation ON transactions
    AS RESTRICTIVE FOR ALL TO bilko_app
    USING (
        CASE
            WHEN current_setting('app.current_org_id', true) IS NULL
              OR current_setting('app.current_org_id', true) = ''
              OR length(current_setting('app.current_org_id', true)) <> 36
            THEN false
            ELSE organization_id = current_setting('app.current_org_id', true)::uuid
        END
    )
    WITH CHECK (
        CASE
            WHEN current_setting('app.current_org_id', true) IS NULL
              OR current_setting('app.current_org_id', true) = ''
              OR length(current_setting('app.current_org_id', true)) <> 36
            THEN false
            ELSE organization_id = current_setting('app.current_org_id', true)::uuid
        END
    );

-- ─── bank_accounts ────────────────────────────────────────────────────────────────────────────
DROP POLICY org_isolation ON bank_accounts;
CREATE POLICY org_isolation ON bank_accounts
    AS RESTRICTIVE FOR ALL TO bilko_app
    USING (
        CASE
            WHEN current_setting('app.current_org_id', true) IS NULL
              OR current_setting('app.current_org_id', true) = ''
              OR length(current_setting('app.current_org_id', true)) <> 36
            THEN false
            ELSE organization_id = current_setting('app.current_org_id', true)::uuid
        END
    )
    WITH CHECK (
        CASE
            WHEN current_setting('app.current_org_id', true) IS NULL
              OR current_setting('app.current_org_id', true) = ''
              OR length(current_setting('app.current_org_id', true)) <> 36
            THEN false
            ELSE organization_id = current_setting('app.current_org_id', true)::uuid
        END
    );

-- ─── bank_transactions ────────────────────────────────────────────────────────────────────────
DROP POLICY org_isolation ON bank_transactions;
CREATE POLICY org_isolation ON bank_transactions
    AS RESTRICTIVE FOR ALL TO bilko_app
    USING (
        bank_account_id IN (
            SELECT id FROM bank_accounts
            WHERE organization_id = current_setting('app.current_org_id', true)::uuid
              AND current_setting('app.current_org_id', true) IS NOT NULL
              AND current_setting('app.current_org_id', true) <> ''
              AND length(current_setting('app.current_org_id', true)) = 36
        )
    )
    WITH CHECK (
        bank_account_id IN (
            SELECT id FROM bank_accounts
            WHERE organization_id = current_setting('app.current_org_id', true)::uuid
              AND current_setting('app.current_org_id', true) IS NOT NULL
              AND current_setting('app.current_org_id', true) <> ''
              AND length(current_setting('app.current_org_id', true)) = 36
        )
    );

-- ─── accounts ─────────────────────────────────────────────────────────────────────────────────
DROP POLICY org_isolation ON accounts;
CREATE POLICY org_isolation ON accounts
    AS RESTRICTIVE FOR ALL TO bilko_app
    USING (
        CASE
            WHEN current_setting('app.current_org_id', true) IS NULL
              OR current_setting('app.current_org_id', true) = ''
              OR length(current_setting('app.current_org_id', true)) <> 36
            THEN false
            ELSE organization_id = current_setting('app.current_org_id', true)::uuid
        END
    )
    WITH CHECK (
        CASE
            WHEN current_setting('app.current_org_id', true) IS NULL
              OR current_setting('app.current_org_id', true) = ''
              OR length(current_setting('app.current_org_id', true)) <> 36
            THEN false
            ELSE organization_id = current_setting('app.current_org_id', true)::uuid
        END
    );

-- ─── contacts ─────────────────────────────────────────────────────────────────────────────────
DROP POLICY org_isolation ON contacts;
CREATE POLICY org_isolation ON contacts
    AS RESTRICTIVE FOR ALL TO bilko_app
    USING (
        CASE
            WHEN current_setting('app.current_org_id', true) IS NULL
              OR current_setting('app.current_org_id', true) = ''
              OR length(current_setting('app.current_org_id', true)) <> 36
            THEN false
            ELSE organization_id = current_setting('app.current_org_id', true)::uuid
        END
    )
    WITH CHECK (
        CASE
            WHEN current_setting('app.current_org_id', true) IS NULL
              OR current_setting('app.current_org_id', true) = ''
              OR length(current_setting('app.current_org_id', true)) <> 36
            THEN false
            ELSE organization_id = current_setting('app.current_org_id', true)::uuid
        END
    );

-- ─── invoice_items ────────────────────────────────────────────────────────────────────────────
DROP POLICY org_isolation ON invoice_items;
CREATE POLICY org_isolation ON invoice_items
    AS RESTRICTIVE FOR ALL TO bilko_app
    USING (
        invoice_id IN (
            SELECT id FROM invoices
            WHERE organization_id = current_setting('app.current_org_id', true)::uuid
              AND current_setting('app.current_org_id', true) IS NOT NULL
              AND current_setting('app.current_org_id', true) <> ''
              AND length(current_setting('app.current_org_id', true)) = 36
        )
    )
    WITH CHECK (
        invoice_id IN (
            SELECT id FROM invoices
            WHERE organization_id = current_setting('app.current_org_id', true)::uuid
              AND current_setting('app.current_org_id', true) IS NOT NULL
              AND current_setting('app.current_org_id', true) <> ''
              AND length(current_setting('app.current_org_id', true)) = 36
        )
    );

-- ─── organizations ────────────────────────────────────────────────────────────────────────────
DROP POLICY org_isolation ON organizations;
CREATE POLICY org_isolation ON organizations
    AS RESTRICTIVE FOR ALL TO bilko_app
    USING (
        CASE
            WHEN current_setting('app.current_org_id', true) IS NULL
              OR current_setting('app.current_org_id', true) = ''
              OR length(current_setting('app.current_org_id', true)) <> 36
            THEN false
            ELSE id = current_setting('app.current_org_id', true)::uuid
        END
    )
    WITH CHECK (
        CASE
            WHEN current_setting('app.current_org_id', true) IS NULL
              OR current_setting('app.current_org_id', true) = ''
              OR length(current_setting('app.current_org_id', true)) <> 36
            THEN false
            ELSE id = current_setting('app.current_org_id', true)::uuid
        END
    );

6.5 Application-Layer Cleanup After Phase 2C

After V23 applies on production, ADR-005 is formally superseded. CodeCraft may begin removing redundant WHERE org_id = :principalOrgId clauses from service methods. This is NOT required immediately — removing them is a cleanup task, not a correctness task. The dual-layer was intentional during the migration period.

Do NOT remove application-layer WHERE clauses from:

  • Test code (integration tests should still verify org scoping explicitly)
  • Admin / superuser code paths (these bypass RLS and need the WHERE clause for correctness)
  • Any query that explicitly needs cross-org access (job enumeration phase)

7. Rollback Procedure (Phase 2C)

If RESTRICTIVE policies cause production traffic degradation after Phase 2C applies:

7.1 Immediate Mitigation (< 5 minutes)

Revert RLS to PERMISSIVE without removing policies (fastest path — no Flyway involved):

-- Run as bilko_admin via Cloud SQL console or break-glass psql session:
-- Revert all RESTRICTIVE → PERMISSIVE without dropping policies

DO $$
DECLARE
    t text;
    tables text[] := ARRAY['invoices','expenses','transactions','bank_accounts',
                            'bank_transactions','accounts','contacts',
                            'invoice_items','organizations'];
BEGIN
    FOREACH t IN ARRAY tables LOOP
        EXECUTE format('DROP POLICY org_isolation ON %I', t);
    END LOOP;
END $$;

-- Restore PERMISSIVE policies from V17 DDL.
-- The V17 PERMISSIVE policy text is the source of truth.
-- Copy-paste V17 CREATE POLICY blocks here.
-- NOTE: Do NOT drop and recreate ENABLE ROW LEVEL SECURITY — leave RLS enabled.
-- NOTE: Do NOT drop FORCE ROW LEVEL SECURITY — leave force enabled.

After this is applied: RLS is back in PERMISSIVE mode. Application-layer WHERE clauses have NOT been removed yet (per §6.5 timing). The system is back to the Phase 2A dual-layer state. Services are unaffected.

7.2 Root Cause Analysis Gate

Before re-attempting Phase 2C:

  • Identify the query or code path that failed (application logs, Cloud SQL slow query log)
  • Determine if the failure was: (a) missing orgTransaction() call, (b) incorrect USING clause, (c) background job without bilko_job role, (d) monitoring query touching tenant table
  • Fix the root cause — do NOT re-attempt Phase 2C until the failure is understood and the §6.3 fail-closed test for that scenario passes
  • Run the full §6.3 checklist again from scratch
  • Reset the 30-day soak clock

7.3 Full Rollback (Flyway-based, if immediate mitigation insufficient)

If the PERMISSIVE revert does not stabilize the system (e.g., the issue is in FORCE RLS behavior rather than PERMISSIVE/RESTRICTIVE distinction):

-- Full rollback — disable RLS entirely (emergency only, CEO approval required)
ALTER TABLE invoices            DISABLE ROW LEVEL SECURITY;
ALTER TABLE invoice_items       DISABLE ROW LEVEL SECURITY;
ALTER TABLE expenses            DISABLE ROW LEVEL SECURITY;
ALTER TABLE transactions        DISABLE ROW LEVEL SECURITY;
ALTER TABLE bank_transactions   DISABLE ROW LEVEL SECURITY;
ALTER TABLE bank_accounts       DISABLE ROW LEVEL SECURITY;
ALTER TABLE accounts            DISABLE ROW LEVEL SECURITY;
ALTER TABLE contacts            DISABLE ROW LEVEL SECURITY;
ALTER TABLE organizations       DISABLE ROW LEVEL SECURITY;

DROP POLICY IF EXISTS org_isolation ON invoices;
DROP POLICY IF EXISTS org_isolation ON invoice_items;
DROP POLICY IF EXISTS org_isolation ON expenses;
DROP POLICY IF EXISTS org_isolation ON transactions;
DROP POLICY IF EXISTS org_isolation ON bank_transactions;
DROP POLICY IF EXISTS org_isolation ON bank_accounts;
DROP POLICY IF EXISTS org_isolation ON accounts;
DROP POLICY IF EXISTS org_isolation ON contacts;
DROP POLICY IF EXISTS org_isolation ON organizations;

ZAKON: Full rollback (disabling RLS entirely) requires CEO approval. It returns the system to a state where application-layer WHERE clauses are the sole isolation mechanism — the Phase 2A state is permanently lost. A new Flyway migration (re-applying V17) is required to restore it.

After full rollback, treat the situation as a new Phase 2A and restart the soak clock.


8. Alternatives Considered

Option A: Single RESTRICTIVE flip with no PERMISSIVE phase

Apply RESTRICTIVE policies immediately without Phase 2A (PERMISSIVE). Remove ADR-005 application-layer clauses at the same time.

Rejected: Any missing orgTransaction() call would result in a user seeing 0 rows (silent data loss), which is harder to debug than a cross-tenant leak. The PERMISSIVE phase ensures that missing context causes empty results that are caught in integration tests while the application layer maintains correct behavior.

Option B: FORCE ROW LEVEL SECURITY on all tables including organizations

Apply FORCE on organizations as well, so bilko_admin cannot bypass RLS on that table.

Rejected: Flyway migrations and admin tooling legitimately need cross-org access to organizations for onboarding, support, and migration operations. FORCE on organizations would require all admin queries to set app.current_org_id — creating a coupling between admin tooling and the RLS session variable mechanism. V17 explicitly omits FORCE on organizations for this reason.

Option C: Use connection-level SET instead of SET LOCAL

Set app.current_org_id at connection checkout (once per connection from the pool) rather than per-transaction via SET LOCAL.

Rejected: HikariCP connection pools reuse connections across requests. A connection-level SET would persist across requests, meaning a subsequent request on the same connection would inherit the previous request's org context if the variable is not reset. SET LOCAL scopes automatically to the transaction boundary, which aligns naturally with Exposed's transaction {} block and eliminates the race condition entirely.


9. Consequences

9.1 Positive

  1. Phase 2C can proceed independently of HR GA timeline — all gate criteria are measurable during the Phase 2A soak period.
  2. The bilko_app grant matrix removes the ambiguity around which role has which privileges. The immutability enforcement on transactions/bank_transactions adds a defence-in-depth layer against ledger tampering.
  3. The service-account separation (bilko_admin/bilko_job/bilko_app/bilko_monitor) follows least-privilege. A compromised application process cannot run Flyway migrations.
  4. The fail-closed checklist gives Securion a deterministic audit target.

9.2 Negative

  1. The bilko_job role with BYPASSRLS requires careful access control. If bilko_job credentials leak, an attacker can enumerate all org IDs.
  2. The Phase 2C RESTRICTIVE flip requires re-running the §6.3 checklist if Phase 2C is rolled back — this means a minimum of 30 additional days of soak before re-attempting.
  3. The transactions and bank_transactions UPDATE/DELETE revoke (§2.2) will cause unexpected permission errors if any future feature requires amending a posted transaction. Correction must be via compensating journal entries.

10. Approval

Architecture status: Accepted Phase 2C CEO sign status: REQUIRED SEPARATELY — Phase 2C flip gate requires explicit CEO sign at the time the §6.1 criteria are met. The ADR-017-RLS-MULTITENANCY.md sign (2026-05-11) covers Phase 2A only.

Role Sign Date
Database Architecture (Bruce Momjian) Signed 2026-05-13
Architecture Lead (Petter Graff) Pending
CEO (Alem Bašić) — Phase 2C gate Required at Phase 2C execution

11. Document History

Date Author Change
2026-05-13 Bruce Momjian Initial — MC #100586 Phase 0' forward decisions supplement to ADR-017-RLS-MULTITENANCY.md