QODY Architecture
System Context
Three independently deployable micro-frontends (MFE) talk to one Ktor API. The API owns Postgres, emits domain events to an internal bus, fans real-time updates out over WebSocket/SSE, reads feature flags from Unleash, and talks to a payment provider via webhooks.
Component Diagram
graph TB
subgraph Clients
G["Guest MFE<br/>(QR menu, cart, pay)<br/>public, no-login"]
S["Staff/Kitchen MFE<br/>(KDS, order board)<br/>JWT staff"]
A["Admin MFE<br/>(venue dashboard,<br/>menu editor, plans)<br/>JWT admin"]
end
subgraph Edge
CDN["CDN / static host<br/>per-MFE bundles"]
GW["Reverse proxy / API gateway<br/>(TLS, CORS, rate-limit,<br/>public /guest carve-out)"]
end
subgraph Backend["Ktor API (Kotlin)"]
R["Route groups:<br/>/guest /staff /admin /webhooks /health"]
SVC["Domain services<br/>(Order, Menu, Session,<br/>Payment, Tenant)"]
EVT["Event bus<br/>(in-proc -> Postgres outbox<br/>-> upgradeable to Kafka)"]
RT["Real-time hub<br/>(WebSocket + SSE fallback)"]
FF["Unleash client<br/>(per-venue/per-plan flags)"]
end
DB[("PostgreSQL 16<br/>RLS tenant isolation<br/>Flyway migrations")]
PAY["Payment provider(s)<br/>Stripe / market-specific"]
UNL["Unleash server"]
OBS["Sentry + structured logs<br/>+ /health"]
G --> CDN
S --> CDN
A --> CDN
G --> GW
S --> GW
A --> GW
GW --> R
R --> SVC
SVC --> DB
SVC --> EVT
EVT --> RT
EVT --> DB
RT -. "live order/table updates" .-> S
RT -. "table status" .-> G
SVC --> FF
FF --> UNL
SVC --> PAY
PAY -- "webhook (signed)" --> R
SVC --> OBS
Why These Boundaries
- One API, three MFEs. The MFE split is about deploy cadence and blast radius, not about microservices. Guest menu changes ship hourly; the admin dashboard ships weekly. A bug in the menu editor must never take down table ordering.
- Event bus starts in-process with a Postgres transactional outbox. Order state transitions write the state change AND the outbox row in the same DB transaction (no lost events, no dual-write inconsistency). A dispatcher drains the outbox to the real-time hub. When a venue chain needs cross-service scale, the outbox drains to Kafka instead.
- Real-time hub = WebSocket with SSE fallback. Kitchen display systems (KDS) sit on venue Wi-Fi that is hostile (NAT, captive portals, flaky AP roaming). Design for failure: heartbeat + auto-reconnect + on-reconnect state resync.
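The outbox bullet above can be illustrated without a database. This is a minimal in-memory sketch, not the QODY implementation: `OutboxStore`, `OutboxEvent`, and `dispatchPending` are hypothetical names, and the "transaction" is simulated by staging copies that replace live state only on success. In the real system both writes would sit inside one Postgres transaction.

```kotlin
import java.util.UUID

// Hypothetical outbox row; the real outbox schema is not specified in this document.
data class OutboxEvent(
    val id: UUID,
    val orderId: UUID,
    val newStatus: String,
    var dispatched: Boolean = false
)

// In-memory stand-in for Postgres: the order state and the outbox row
// become visible together, or not at all.
class OutboxStore {
    private val orderStatus = mutableMapOf<UUID, String>()
    private val outbox = mutableListOf<OutboxEvent>()

    // Simulates one DB transaction: all work happens on staged copies,
    // which replace the live state only if nothing failed.
    @Synchronized
    fun transition(orderId: UUID, newStatus: String, failBeforeCommit: Boolean = false) {
        val stagedStatus = orderStatus.toMutableMap()
        val stagedOutbox = outbox.toMutableList()
        stagedStatus[orderId] = newStatus
        stagedOutbox.add(OutboxEvent(UUID.randomUUID(), orderId, newStatus))
        if (failBeforeCommit) throw IllegalStateException("simulated failure, nothing committed")
        orderStatus.clear()
        orderStatus.putAll(stagedStatus)
        outbox.clear()
        outbox.addAll(stagedOutbox)
    }

    @Synchronized
    fun statusOf(orderId: UUID): String? = orderStatus[orderId]

    // Dispatcher: publishes undelivered rows in insertion order and marks a row
    // only after publish returns, so a failed publish is retried on the next drain.
    @Synchronized
    fun dispatchPending(publish: (OutboxEvent) -> Unit): Int {
        var sent = 0
        for (event in outbox) {
            if (!event.dispatched) {
                publish(event)
                event.dispatched = true
                sent++
            }
        }
        return sent
    }
}
```

Because a row is marked only after a successful publish, delivery in this sketch is at-least-once, so consumers of the real-time hub should tolerate duplicates.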
Multi-Tenancy Model
Tenant = Venue. For chains, an Organization may own multiple Venues; the RLS scope key is venue_id, with an optional org_id parent for chain-level admin.
Per ALAI database rules DB-05/DB-06: every tenant-scoped table carries venue_id UUID NOT NULL and RLS is ENABLED + FORCED.
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
ALTER TABLE orders FORCE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON orders
USING (venue_id = current_setting('app.current_venue_id', true)::uuid);
CREATE POLICY tenant_insert ON orders
AS RESTRICTIVE FOR INSERT
WITH CHECK (venue_id = current_setting('app.current_venue_id', true)::uuid);
The Ktor layer runs SET app.current_venue_id = '<uuid>' on the connection at checkout (HikariCP), inside the request/transaction scope, and resets it on release. The reset is mandatory: a stale tenant context on a pooled connection is a silent cross-venue data breach.
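The set-then-always-reset discipline can be sketched as a small wrapper. This is a hedged sketch under assumptions: `SqlSession` is a hypothetical seam standing in for a pooled JDBC connection, and `withVenueContext` is an illustrative name. It uses PostgreSQL's `set_config(name, value, is_local)` function because, unlike a plain `SET` statement, it accepts bind parameters.

```kotlin
import java.util.UUID

// Hypothetical minimal seam over a pooled JDBC connection; not a real library type.
interface SqlSession {
    fun execute(sql: String, vararg params: String)
}

// Runs `block` with the RLS tenant context set and always clears it afterwards,
// even when `block` throws, so the next borrower of the connection starts clean.
fun <T> withVenueContext(session: SqlSession, venueId: UUID, block: (SqlSession) -> T): T {
    session.execute("SELECT set_config('app.current_venue_id', ?, false)", venueId.toString())
    try {
        return block(session)
    } finally {
        // Overwrite with an empty value so no venue id survives on the pooled connection.
        session.execute("SELECT set_config('app.current_venue_id', ?, false)", "")
    }
}
```

An alternative, if every request runs in exactly one transaction, would be to pass `true` as the third argument so the setting is scoped to the transaction and discarded when it ends.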
Bilko RLS Lesson — Hard Requirement (Tool-Verified 2026-06-19)
The most expensive Bilko bug was NOT a missing policy. It was that the application DB role had the BYPASSRLS attribute, which silently overrides FORCE ROW LEVEL SECURITY — RLS looked configured but isolated nothing. Mandatory for QODY:
- The app connects as a dedicated role (e.g. qody_app) that MUST NOT have BYPASSRLS and MUST NOT be the table owner.
- Migrations/owner DDL run as a separate privileged role used only by Flyway, never by the running app.
- CI startup-validation query (fail-closed) on every boot:
  SELECT rolname, rolbypassrls FROM pg_roles WHERE rolname = 'qody_app';
  -- must return rolbypassrls = false, or the app refuses to start
- RLS isolation E2E test (Proveo): create two venues, set context to venue A, assert venue B's orders are invisible AND uninsertable.
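The fail-closed startup check above could look roughly like this. It is a sketch, not the project's code: `assertRlsEnforced` and the `lookupBypassRls` function parameter are hypothetical, with the lookup standing in for the `pg_roles` query. A missing role is treated as a failure too, since an unknown state should not be allowed to start.

```kotlin
// `lookupBypassRls` is a hypothetical seam for:
//   SELECT rolbypassrls FROM pg_roles WHERE rolname = ?
// It returns null when the role does not exist.
fun assertRlsEnforced(roleName: String, lookupBypassRls: (String) -> Boolean?) {
    val bypass = lookupBypassRls(roleName)
        ?: throw IllegalStateException("role '$roleName' not found; refusing to start")
    if (bypass) {
        throw IllegalStateException("role '$roleName' has BYPASSRLS; refusing to start")
    }
}
```

Calling this before the server binds its port means a misconfigured role stops the deployment instead of silently serving unisolated data.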
Guest Path Special-Casing
The guest MFE is anonymous (no JWT). The guest still must be scoped to one venue+table. Scoping comes from the signed QR token, not from a login. The API resolves the QR token to venue_id/table_id server-side, sets RLS context from that, and the guest can only ever touch their own table's open session. Guest endpoints are explicitly carved out of auth at the gateway (a tight /guest/* allowlist).
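One way to realise a signed QR token is an HMAC over the venue and table identifiers. The token layout below (`<venueId>.<tableId>.<signature>`), the names `TableRef`, `signQrToken`, and `resolveQrToken`, and the choice of HMAC-SHA256 are all assumptions for illustration; the document does not specify the actual token format.

```kotlin
import java.security.MessageDigest
import java.util.Base64
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Hypothetical token layout: "<venueId>.<tableId>.<base64url(HMAC-SHA256)>".
data class TableRef(val venueId: String, val tableId: String)

fun hmacSha256(secret: ByteArray, payload: String): ByteArray {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(secret, "HmacSHA256"))
    return mac.doFinal(payload.toByteArray(Charsets.UTF_8))
}

fun signQrToken(secret: ByteArray, ref: TableRef): String {
    val payload = "${ref.venueId}.${ref.tableId}"
    val sig = Base64.getUrlEncoder().withoutPadding().encodeToString(hmacSha256(secret, payload))
    return "$payload.$sig"
}

// Returns the table reference only when the signature matches; null otherwise.
fun resolveQrToken(secret: ByteArray, token: String): TableRef? {
    val parts = token.split(".")
    if (parts.size != 3) return null
    val expected = hmacSha256(secret, "${parts[0]}.${parts[1]}")
    val given = try {
        Base64.getUrlDecoder().decode(parts[2])
    } catch (e: IllegalArgumentException) {
        return null // signature part is not valid base64url
    }
    // MessageDigest.isEqual compares in constant time to avoid leaking signature bytes.
    return if (MessageDigest.isEqual(expected, given)) TableRef(parts[0], parts[1]) else null
}
```

Only the server holds the secret, so a guest cannot forge a token for another venue or table; the resolved `venueId` is what would feed the RLS context.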
Core Domain Model
UUID PKs, NUMERIC(19,4) money, TIMESTAMPTZ, deleted_at soft delete, version optimistic lock on mutable entities, venue_id + RLS on all tenant tables.
erDiagram
ORGANIZATION ||--o{ VENUE : owns
VENUE ||--o{ TABLE : has
VENUE ||--o{ MENU : publishes
VENUE ||--o{ STAFF : employs
MENU ||--o{ CATEGORY : contains
CATEGORY ||--o{ MENU_ITEM : lists
MENU_ITEM ||--o{ MODIFIER_GROUP : has
MODIFIER_GROUP ||--o{ MODIFIER : offers
TABLE ||--o{ TABLE_SESSION : hosts
TABLE_SESSION ||--o{ ORDER : groups
ORDER ||--o{ ORDER_LINE : contains
ORDER_LINE ||--o{ ORDER_LINE_MODIFIER : applies
ORDER ||--o{ PAYMENT : settled_by
STAFF }o--|| ROLE : assigned
Key Entities
| Entity | Purpose | Key Fields |
|---|---|---|
| organization | Chain owner (optional parent) | id, name, plan_tier |
| venue | The tenant boundary | id, org_id, name, slug, branding(jsonb), timezone, currency, plan_tier |
| restaurant_table | Physical table | id, venue_id, label, qr_token_id, capacity |
| menu | Versioned menu for a venue | id, venue_id, name, is_active, valid_from/until |
| menu_item | Sellable item | id, category_id, venue_id, name, description, price NUMERIC(19,4), tax_rate, allergens(jsonb) |
| table_session | One sitting at a table | id, venue_id, table_id, status, opened_at, closed_at |
| order | A submission within a session | id, venue_id, table_session_id, status, subtotal, tax_total, tip_amount, total, version |
| order_line | Line in an order | id, order_id, venue_id, menu_item_id, qty, unit_price, line_total, note, status |
| payment | Settlement attempt/record | id, venue_id, order_id, provider, provider_ref, amount, currency, status, idempotency_key |
Money/price snapshotting. order_line.unit_price and order_line_modifier.price_delta_snapshot are copied at order time. The menu price can change tomorrow; what the guest agreed to pay is frozen on the line.
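Snapshotting can be shown in a few lines. This is an illustrative sketch: `MenuItem`, `OrderLine`, `addLine`, and `money` are hypothetical names, scale 4 mirrors NUMERIC(19,4), and the HALF_EVEN rounding mode is an assumption the source does not state.

```kotlin
import java.math.BigDecimal
import java.math.RoundingMode

// Scale 4 mirrors NUMERIC(19,4). The rounding mode is an assumption.
fun money(value: String): BigDecimal = BigDecimal(value).setScale(4, RoundingMode.HALF_EVEN)

// Live catalogue entry: its price may change at any time.
class MenuItem(val name: String, var price: BigDecimal)

// unitPrice is copied when the line is created; it never reads MenuItem.price again.
data class OrderLine(val itemName: String, val qty: Int, val unitPrice: BigDecimal) {
    val lineTotal: BigDecimal =
        unitPrice.multiply(BigDecimal(qty)).setScale(4, RoundingMode.HALF_EVEN)
}

fun addLine(item: MenuItem, qty: Int): OrderLine = OrderLine(item.name, qty, item.price)
```

Since `BigDecimal` is immutable, changing `MenuItem.price` afterwards assigns a new value to the catalogue entry and leaves the order line's copy untouched.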
Branding lives in venue.branding (jsonb: logo, colours, accent) so white-labeling is a data concern, not a build concern.
Order Lifecycle
States are explicit and enforced server-side (a state machine). Illegal transitions are rejected, not silently ignored. Every transition writes a row to the transactional outbox, which the dispatcher drains to the real-time hub.
stateDiagram-v2
[*] --> SESSION_OPEN: QR scan resolves token -> open/attach TableSession
SESSION_OPEN --> CART: guest adds items (client-side draft, server-validated)
CART --> SUBMITTED: guest submits order (server validates availability + price + flags)
SUBMITTED --> ACCEPTED: staff/kitchen accepts (or auto-accept flag)
ACCEPTED --> IN_PREP: kitchen starts
IN_PREP --> READY: kitchen marks ready
READY --> SERVED: waiter serves
SERVED --> PAID: payment captured (pay-now or pay-at-end)
PAID --> CLOSED: session settled, table freed
SUBMITTED --> CANCELLED: staff/guest cancels pre-accept
ACCEPTED --> CANCELLED: staff cancels (with reason)
CLOSED --> [*]
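The diagram above translates directly into a transition table. The following is a minimal sketch with hypothetical names (`OrderState`, `allowedTransitions`, `transition`); it encodes only the edges drawn in the diagram, so any additional edges a venue setting might require are not covered.

```kotlin
enum class OrderState {
    SESSION_OPEN, CART, SUBMITTED, ACCEPTED, IN_PREP, READY, SERVED, PAID, CLOSED, CANCELLED
}

// Allowed transitions, copied edge by edge from the state diagram.
val allowedTransitions: Map<OrderState, Set<OrderState>> = mapOf(
    OrderState.SESSION_OPEN to setOf(OrderState.CART),
    OrderState.CART to setOf(OrderState.SUBMITTED),
    OrderState.SUBMITTED to setOf(OrderState.ACCEPTED, OrderState.CANCELLED),
    OrderState.ACCEPTED to setOf(OrderState.IN_PREP, OrderState.CANCELLED),
    OrderState.IN_PREP to setOf(OrderState.READY),
    OrderState.READY to setOf(OrderState.SERVED),
    OrderState.SERVED to setOf(OrderState.PAID),
    OrderState.PAID to setOf(OrderState.CLOSED)
)

class IllegalTransitionException(from: OrderState, to: OrderState) :
    IllegalStateException("illegal transition $from -> $to")

// Rejects anything not in the table; states without an entry (CLOSED, CANCELLED) are terminal.
fun transition(from: OrderState, to: OrderState): OrderState {
    if (to !in allowedTransitions[from].orEmpty()) throw IllegalTransitionException(from, to)
    return to
}
```

Throwing instead of returning the unchanged state is what makes an illegal transition "rejected, not silently ignored": the caller can map the exception to an error response.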
Real-Time Propagation
- SUBMITTED event → appears instantly on the Kitchen MFE order board (the demo "wow" moment)
- IN_PREP/READY → guest sees their order status on the table; waiter sees "ready for pickup"
- SERVED/PAID/CLOSED → table status flips to free on the Staff MFE floor view
Payment Timing
Payment timing is a venue setting (flag-gated):
- Pay-per-order (fast casual / bar): each order is paid immediately; the SUBMITTED → PAID transition may happen before any kitchen state
- Pay-at-end (table service): orders accumulate on the table_session; one settlement at the end
Idempotency. Payment captures and webhook handlers use payment.idempotency_key. A retried Stripe webhook must never double-charge or double-advance state.
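The idempotency rule can be sketched as "at most one capture per key". This is a hedged in-memory sketch: `IdempotentPaymentHandler` and its `capture` callback are hypothetical, and the map stands in for what would presumably be a unique constraint on payment.idempotency_key in Postgres.

```kotlin
import java.math.BigDecimal
import java.util.concurrent.ConcurrentHashMap

// `capture` is a hypothetical callback that performs the real charge
// and returns a provider reference.
class IdempotentPaymentHandler(
    private val capture: (orderId: String, amount: BigDecimal) -> String
) {
    // In-memory stand-in for a unique index on payment.idempotency_key.
    private val results = ConcurrentHashMap<String, String>()

    // computeIfAbsent runs `capture` at most once per key; a retry with the
    // same key gets the stored result instead of a second charge.
    fun handle(idempotencyKey: String, orderId: String, amount: BigDecimal): String =
        results.computeIfAbsent(idempotencyKey) { capture(orderId, amount) }
}
```

A retried webhook that carries the same key therefore returns the original outcome and cannot advance the order state twice.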
Reconnect resync. On KDS reconnect the client calls GET /staff/orders?status=open and rebuilds its board from authoritative state.
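A client-side sketch of reconnect plus resync follows. The names (`reconnectDelayMs`, `KdsBoard`, `OrderSummary`) and the backoff base and cap values are assumptions for illustration; the real KDS client lives in the Staff MFE and may be written differently.

```kotlin
// Exponential backoff with a cap; base and cap values are illustrative assumptions.
fun reconnectDelayMs(attempt: Int, baseMs: Long = 500, capMs: Long = 30_000): Long {
    val shift = attempt.coerceIn(0, 20) // bound the shift so the product cannot overflow
    return minOf(capMs, baseMs * (1L shl shift))
}

data class OrderSummary(val id: String, val status: String)

class KdsBoard {
    private val orders = linkedMapOf<String, OrderSummary>()

    // Live path: apply one pushed event.
    fun applyEvent(order: OrderSummary) {
        orders[order.id] = order
    }

    // Reconnect path: discard local state and rebuild from the server snapshot
    // (for example the response of GET /staff/orders?status=open).
    fun resync(snapshot: List<OrderSummary>) {
        orders.clear()
        snapshot.forEach { orders[it.id] = it }
    }

    fun view(): List<OrderSummary> = orders.values.toList()
}
```

Replacing the board wholesale, rather than merging, means events missed while offline cannot leave stale cards behind. A production client would likely add random jitter to the delay so that many displays do not reconnect in lockstep.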
API Surface (Ktor Route Groups)
/health GET liveness/readiness (MUST), RLS-role self-check
# ---- GUEST (public, scoped by signed QR token, no JWT) ----
/guest/resolve POST { qrToken } -> { venueId, tableId, sessionId, branding }
/guest/menu GET active menu for resolved venue
/guest/session/{id} GET current session + my orders + live status
/guest/cart/validate POST server-side price/availability/flag re-check
/guest/order POST submit order (idempotency key) -> SUBMITTED
/guest/payment/intent POST create payment intent
/guest/payment/confirm POST confirm/capture
/guest/stream GET SSE: my order/table status updates
# ---- STAFF / KITCHEN (JWT staff, role-gated) ----
/staff/auth/login POST email+password -> JWT
/staff/orders GET open orders board
/staff/orders/{id}/accept POST SUBMITTED -> ACCEPTED
/staff/orders/{id}/prep POST ACCEPTED -> IN_PREP
/staff/orders/{id}/ready POST IN_PREP -> READY
/staff/orders/{id}/serve POST READY -> SERVED
/staff/sessions/{id}/close POST settle + free table -> CLOSED
/staff/stream WS live order events (KDS)
# ---- ADMIN / VENUE DASHBOARD (JWT admin/owner) ----
/admin/venues CRUD venue + branding
/admin/tables CRUD tables + QR token (re)generation
/admin/menus CRUD menu/category/item/modifier
/admin/staff CRUD staff + roles
/admin/reports GET sales/orders summaries
# ---- WEBHOOKS (signature-verified) ----
/webhooks/payment/{provider} POST signed payment events
Feature-Flag Map (Unleash)
Same pattern as Bilko feature-enable (MC #102481): the plan tier drives a set of Unleash flags; flags are evaluated with a venue context so a flag can also be force-toggled for a single venue (pilot, demo, A/B).
| Capability | Flag key | Basic | Pro | Enterprise |
|---|---|---|---|---|
| QR menu + order + pay (core) | always-on | ✓ | ✓ | ✓ |
| Kitchen display (KDS real-time) | kds.realtime | ✓ | ✓ | ✓ |
| Multi-language menu | menu.multilang | – | ✓ | ✓ |
| Tipping at checkout | pay.tipping | – | ✓ | ✓ |
| Split bill | pay.splitbill | – | ✓ | ✓ |
| Pay-at-end (table tab) | pay.payatend | – | ✓ | ✓ |
| AI upsell / recommendations | ai.upsell | – | – | ✓ |
| White-label theming | brand.whitelabel | – | ✓ | ✓ |
| Chain dashboard | chain.dashboard | – | – | ✓ |
Backend gates the capability so a flag is a real security/contract boundary, not just a UI hide. The MFE hides UI; the API enforces.
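The plan-tier defaults and per-venue overrides can be sketched as plain data. This sketch deliberately does not use the Unleash SDK: `Plan`, `Venue`, `isEnabled`, and `requireCapability` are hypothetical names, and the real evaluation would go through the Unleash client with a venue context. The flag sets are transcribed from the table above.

```kotlin
enum class Plan { BASIC, PRO, ENTERPRISE }

val proFlags = setOf(
    "kds.realtime", "menu.multilang", "pay.tipping",
    "pay.splitbill", "pay.payatend", "brand.whitelabel"
)

// Plan defaults, one entry per tier.
val planFlags: Map<Plan, Set<String>> = mapOf(
    Plan.BASIC to setOf("kds.realtime"),
    Plan.PRO to proFlags,
    Plan.ENTERPRISE to proFlags + setOf("ai.upsell", "chain.dashboard")
)

// `overrides` models a flag force-toggled for a single venue (pilot, demo, A/B).
data class Venue(val id: String, val plan: Plan, val overrides: Map<String, Boolean> = emptyMap())

// A venue-level override wins; otherwise the plan default applies.
fun isEnabled(flag: String, venue: Venue): Boolean =
    venue.overrides[flag] ?: (flag in planFlags.getValue(venue.plan))

class CapabilityDisabledException(flag: String) :
    RuntimeException("capability '$flag' is not enabled for this venue")

// Called at the start of an API handler, so the flag is enforced server-side.
fun requireCapability(flag: String, venue: Venue) {
    if (!isEnabled(flag, venue)) throw CapabilityDisabledException(flag)
}
```

For example, a split-bill endpoint would call `requireCapability("pay.splitbill", venue)` before doing any work, so a Basic venue is refused even if its client somehow shows the button.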
Architectural Non-Negotiables
- qody_app DB role MUST NOT have BYPASSRLS and MUST NOT own tables; fail-closed startup check.
- RLS ENABLED + FORCED on every tenant table; app.current_venue_id set at checkout, reset on release.
- Money is NUMERIC(19,4), snapshotted on order lines; never recomputed from live catalogue.
- Order state machine is server-enforced; illegal transitions rejected; transitions emit via transactional outbox.
- Real-time is an optimization over an authoritative DB; clients resync on reconnect.
- Payment webhooks signature-verified + idempotent; never double-charge/double-advance.
- Capabilities enforced at the API (flag = contract boundary), not just hidden in the MFE.
- Deploy verification per ZAKON PI2 — verify the new revision actually serves 100%.
- Distribute only proven seams. Start in-process; earn Kafka/microservices, do not anticipate them.