Bilko Backoffice — Backend MVP (Sentry + audit request_id + support_tickets)

1. Overview

This backend slice (MC #103323, branch feat/103323-backoffice-backend, commit 6b214a00, PR #316) delivers the diagnostic and intake backbone for the Bilko support fix-loop.

Before this slice, when a customer hit an accounting error on app.bilko.cloud neither the platform team nor the customer had a way to identify which request failed or why. Three components address that gap:

  1. Sentry error capture — catch-all (INFRA-only) exception capture with PII scrub and Cloud Run release/serverName metadata. Inert until SENTRY_DSN secret is provisioned (OCD-1, CEO action).
  2. V71 audit_log.request_id — nullable correlation column added to every audit row, threaded from a single canonical source (call.callId) across all route handlers.
  3. V72 support_tickets + SupportTicketRoutes — customer intake channel (POST) and platform-admin triage queue (GET list + GET detail + PATCH status) with RLS, idempotency, and full status-transition audit trail.

This slice is deploy-gated behind prod cutover MC #103300. All three components were independently verified by Proveo (Angie Jones): 12/12 AC signals PASS, integration test 3/3 PASS, unit suite 1280/1280.

2. Component Map

2.1 Sentry capture — plugins/Sentry.kt + plugins/StatusPages.kt

2.2 V71 audit_log.request_id — AuditLogService.kt + migration

2.3 V72 support_tickets + SupportTicketRoutes — routes/SupportTicketRoutes.kt

3. Data Model

3.1 support_tickets columns

ColumnTypeNotes
idUUID PKgen_random_uuid() default
org_idUUID NOT NULLFK to organizations(id) ON DELETE CASCADE
user_idUUID NOT NULLFK to users(id)
error_codeTEXTNullable; currently generic VAL/INFRA (OCD-2 open CEO decision)
request_idTEXTCorrelation ID of originating failed request. NOT a FK to audit_log.request_id (one request_id maps to N audit rows). Join via equality.
context_bundleJSONB NOT NULLCHECK jsonb_typeof = 'object'. Allowlisted keys only (server-side enforced).
customer_descriptionTEXTFree text from customer
statusTEXT NOT NULLCHECK (status IN ('OPEN','TRIAGED','IN_PROGRESS','RESOLVED','CLOSED')). Default 'OPEN'.
triage_jsonJSONBNULL = not yet triaged. V2 AI agent writes here.
created_atTIMESTAMPTZ NOT NULLDEFAULT now()
updated_atTIMESTAMPTZ NOT NULLDEFAULT now(); maintained by BEFORE UPDATE trigger.
resolution_noteTEXTRequired (route-enforced) for RESOLVED/CLOSED transitions.
external_refTEXTV2 Zendesk/Linear sync. Nullable at MVP.

3.2 Indexes

3.3 RLS policies

All GUC SET statements use SET LOCAL (transaction-scoped) — pgBouncer transaction-mode pooling safe.

Customer UPDATE/DELETE immutability: no UPDATE or DELETE policy for customers. RLS ENABLED with no such policy = deny-by-default. Customers cannot modify or delete submitted tickets.

Production code audit (Proveo-confirmed): orgTransaction{} (OrgScopeSessionVariable.kt:131) always wraps SET LOCAL app.current_org_id inside transaction{}. The Testcontainers test failure (Proveo GAP-1) was caused by the test setup using a PostgreSQL superuser connection — superusers bypass RLS regardless of GUC values. Production code was never buggy.

3.4 Status transition machine

FromAllowed next states
OPENTRIAGED, CLOSED
TRIAGEDIN_PROGRESS, CLOSED
IN_PROGRESSRESOLVED, CLOSED
RESOLVEDCLOSED
CLOSED(no further transitions)

Invalid transitions return HTTP 422 with code: "INVALID_TRANSITION" and allowedNext.

3.5 context_bundle allowlist

Allowed keys (server-side enforced, rejection = HTTP 422): requestId, errorCode, httpStatus, instancePath, orgId, userId, appRoute, planTier, country, auditRef. IDs and codes only — never invoice content, names, amounts, or email addresses.

4. Diagnostic Join

SELECT al.*
FROM audit_log al
JOIN support_tickets st ON al.request_id = st.request_id
WHERE st.id = '<ticket-uuid>';

Framing (martin-kleppmann dissent): request_id is a correlation handle for cross-system debugging only — NOT tamper-evidence. The append-only guarantee for audit_log comes from the block_audit_mutation() trigger (V51), not from request_id. Platform-admin direct DB access is outside the threat model of this column.

5. Known Gaps and Follow-ups

ItemDetailStatus
OCD-1: Sentry DSN bilko-sentry-dsn / bilko-web-sentry-dsn must be provisioned in GCP Secret Manager. Inject via --update-secrets (never --set-env-vars). Sentry code is fully inert until then. CEO action required. Blocks production deploy; does not block feature branch merge.
OCD-2: error_code taxonomy Domain errors currently fall into generic VAL/INFRA codes, making ticket triage partly blind. Domain-specific codes are V2 scope (MC #103333). CEO confirmed proceed with V72 before those codes land. Open CEO decision. V2 follow-on MC #103333.
OCD-3: merge-order vs #103300 V71/V72 migration numbers must be confirmed/renumbered after #103300 merges. Open. Blocking deploy only.
Positive-path RLS assertion Integration test confirms negative proof (wrong-org INSERT rejected). Positive proof (correct-org INSERT succeeds) not explicitly asserted. Proveo: completeness gap, not safety-weakening gap. Follow-up test enhancement. Non-blocking.
CI runner quota Tracked as MC #103304. Separate MC.
Deploy gate Deploy-gated behind MC #103300 prod cutover. Dependent on #103300.

6. Verification Evidence


Revision #1
Created 2026-06-09 23:30:38 UTC by John
Updated 2026-06-09 23:30:38 UTC by John