Test Strategy

Project: Bilko
Version: 0.1
Date: 2026-02-23
Author: Ops Architect
Status: Draft
Reviewers: Tech Lead, Alem Bašić

Document History

| Version | Date | Author | Changes |
| --- | --- | --- | --- |
| 0.1 | 2026-02-23 | Ops Architect | Initial draft |

1. Testing Philosophy & Principles

Financial software has a higher correctness bar than typical web apps. A bug in VAT calculation or double-entry bookkeeping is not a UX inconvenience — it's a compliance failure that could expose Bilko users to tax liability or audit findings.

Core Principles:

  1. Financial logic is P0: VAT calculations, double-entry balance, and NUMERIC precision are covered at ≥ 95% before any feature ships
  2. Tests are first-class code: reviewed, maintained, and refactored alongside production code
  3. Test the behavior, not the implementation: tests must keep passing when internals are refactored
  4. Fast feedback: unit tests run in < 3 min; full suite in < 10 min
  5. No test = no ship: financial logic without a test is a P0 blocker for merging
  6. Isolation: every test cleans up after itself; no test depends on another

Testing philosophy: Bilko follows the test pyramid — heavy unit test coverage of financial calculations and business logic, targeted integration tests for API + database, and E2E tests for the 4 critical user journeys (invoice, expense, report, auth). We do not aim for 100% E2E coverage.


2. Test Pyramid

         /\
        /E2E\        ← ~13%, 12 tests, Playwright
       /------\
      /  Integ \     ← ~38%, 35 tests, Supertest
     /----------\
    /    Unit    \   ← ~49%, 45 tests, Vitest
   /--------------\

Distribution (92 total planned; see TEST-INVENTORY.md):

  • ~49% Unit Tests (45): Financial logic, utilities, auth
  • ~38% Integration Tests (35): API endpoints, database, org-scoping
  • ~13% E2E Tests (12): Invoice, expense, report, auth flows

3. Testing Tools

| Type | Tool | Version | Purpose | Config |
| --- | --- | --- | --- | --- |
| Unit testing | Vitest | Latest | Business logic, utilities | vitest.config.ts |
| Mocking | Vitest built-in | - | Mock external deps (no real DB) | Built-in |
| Integration testing | Supertest | Latest | API endpoint testing with real PostgreSQL | apps/api/src/test/setup.ts |
| Test database | PostgreSQL | 15 | Real database for integration tests | .env.test |
| E2E testing | Playwright | Latest | Browser automation, user flows | apps/e2e/playwright.config.ts |
| Coverage | v8 provider (@vitest/coverage-v8; replaces the older c8 provider) | - | Coverage reports | vitest.config.ts |
| Performance | k6 | Latest | Load testing (PLANNED Phase 2) | apps/e2e/load/ |

Why Vitest (not Jest)

  • ESM native and Vite-based, so test startup and transforms are fast
  • Compatible with Turborepo
  • Watch mode that reruns only affected tests (HMR-like)
  • Jest-compatible API (easy migration)

Why Playwright (not Cypress)

  • Multi-browser: Chromium, Firefox, WebKit (Safari)
  • Auto-waiting on actionability checks, which reduces flaky tests caused by race conditions
  • Parallel execution (workers: 4)
  • Video and trace on failure
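
The points above map onto a handful of Playwright configuration options. The following is a minimal sketch of what apps/e2e/playwright.config.ts might contain, not the project's actual file. The ./tests directory and the E2E_BASE_URL environment variable are assumptions made for illustration.

```typescript
// apps/e2e/playwright.config.ts (sketch; testDir and E2E_BASE_URL are hypothetical)
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  testDir: "./tests",
  fullyParallel: true,
  workers: 4, // parallel execution, as stated above
  use: {
    baseURL: process.env.E2E_BASE_URL,
    trace: "retain-on-failure", // keep traces only for failed tests
    video: "retain-on-failure", // keep videos only for failed tests
  },
  // One project per browser engine: Chromium, Firefox, WebKit (Safari)
  projects: [
    { name: "chromium", use: { ...devices["Desktop Chrome"] } },
    { name: "firefox", use: { ...devices["Desktop Firefox"] } },
    { name: "webkit", use: { ...devices["Desktop Safari"] } },
  ],
});
```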

4. Test Scope by Layer

4.1 Unit Tests (Vitest)

| Attribute | Value |
| --- | --- |
| Scope | Pure functions: VAT calculation, double-entry validation, currency conversion, invoice totals, date utils, number formatting |
| External dependencies | Mocked: no real DB, network, or filesystem |
| Coverage target | ≥ 95% financial logic; ≥ 90% utilities; ≥ 80% services; ≥ 80% overall |
| Execution time | < 3 minutes |
| Runs on | Every pushed commit (CI); the local pre-commit hook runs lint + type-check only |
| Written by | Developer who writes the feature |

What to unit test:

  • calculateVAT(amount, rate, country) — Serbia 20%, BiH 17%, Croatia 25%
  • validateDoubleEntry(debit, credit) — must be equal, error on imbalance
  • convertCurrency(amount, fromCurrency, toCurrency, exchangeRate) — NUMERIC(19,4)
  • calculateInvoiceTotal(items) — subtotal, tax, discount, total
  • lockExchangeRate(date, fromCurrency, toCurrency) — historical rate, not today's
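
To make the targets above concrete, here is a minimal sketch of three of the listed functions. It is an illustration under stated assumptions, not Bilko's implementation: amounts are modeled as integers in minor units (1250 means 12.50), rates as basis points, rounding is half up, and the signatures are simplified relative to the list (for example, calculateVAT looks the rate up from the country). Plain numbers are only safe below Number.MAX_SAFE_INTEGER; a NUMERIC(19,4) column would need a decimal library or bigint in practice.

```typescript
// Hypothetical sketch. Amounts are integers in minor units (1250 = 12.50);
// rates are basis points (2000 = 20%). Names and signatures are assumptions.
type Country = "RS" | "BA" | "HR";

// Standard rates named in the list above: Serbia 20%, BiH 17%, Croatia 25%.
const VAT_RATE_BP: { [C in Country]: number } = { RS: 2000, BA: 1700, HR: 2500 };

function assertMinorUnits(value: number, label: string): void {
  if (value % 1 !== 0 || value < 0) {
    throw new RangeError(label + " must be a non-negative integer in minor units");
  }
}

// VAT on a net amount, rounded half up to the nearest minor unit.
function calculateVAT(netMinor: number, country: Country): number {
  assertMinorUnits(netMinor, "netMinor");
  return Math.floor((netMinor * VAT_RATE_BP[country] + 5000) / 10000);
}

interface JournalLine {
  account: string;
  debitMinor: number;
  creditMinor: number;
}

// Double-entry invariant: total debits must equal total credits.
function validateDoubleEntry(lines: JournalLine[]): void {
  let debit = 0;
  let credit = 0;
  lines.forEach((line) => {
    debit += line.debitMinor;
    credit += line.creditMinor;
  });
  if (debit !== credit) {
    throw new Error("Unbalanced entry: debit " + debit + " != credit " + credit);
  }
}

interface InvoiceItem {
  unitPriceMinor: number;
  quantity: number;
}

interface InvoiceTotal {
  subtotal: number;
  discount: number;
  tax: number;
  total: number;
}

// Subtotal, discount, tax, and total. Assumes VAT is applied after the discount.
function calculateInvoiceTotal(
  items: InvoiceItem[],
  country: Country,
  discountMinor: number = 0
): InvoiceTotal {
  const subtotal = items.reduce((sum, item) => sum + item.unitPriceMinor * item.quantity, 0);
  assertMinorUnits(discountMinor, "discountMinor");
  if (discountMinor > subtotal) {
    throw new RangeError("discount exceeds subtotal");
  }
  const tax = calculateVAT(subtotal - discountMinor, country);
  return { subtotal: subtotal, discount: discountMinor, tax: tax, total: subtotal - discountMinor + tax };
}
```

Unit tests for functions shaped like these would pin the rate per country, the rounding boundary, and the error raised on an imbalanced entry, all without touching a database.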

What NOT to unit test:

  • Prisma ORM internals
  • Express framework boilerplate
  • Simple property getters/setters with no logic

4.2 Integration Tests (Supertest)

| Attribute | Value |
| --- | --- |
| Scope | All API routes with real PostgreSQL 15 database |
| External dependencies | Real PostgreSQL (test container in CI, bilko_test database locally) |
| Coverage target | All service boundaries; > 80% of integration paths |
| Execution time | < 5 minutes |
| Runs on | Every PR, blocking merge |
| Written by | Developer who writes the API endpoint |

What to integration test:

  • Auth flow (register, login, refresh, logout)
  • Invoice CRUD + status transitions (draft → sent → paid)
  • Expense CRUD + approval flow
  • Reports API (P&L, VAT, balance sheet)
  • Organization scoping — org A cannot read org B's data (P0 security test)
  • RBAC enforcement — viewer cannot create, owner can delete
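
The last three items describe rules that the integration tests assert through the HTTP API. As a rough illustration of the rules themselves, the sketch below models them as pure functions. Everything in it is an assumption: the type names, the permission table (only the two roles named above are modeled), and the reading of the status arrow as "no skipping and no going back".

```typescript
// Hypothetical model of rules the integration tests would assert over HTTP.
type InvoiceStatus = "draft" | "sent" | "paid";
type Role = "viewer" | "owner";
type Action = "read" | "create" | "delete";

// Assumed transitions: draft -> sent -> paid; nothing leaves "paid".
const NEXT_STATUS: { [S in InvoiceStatus]: InvoiceStatus[] } = {
  draft: ["sent"],
  sent: ["paid"],
  paid: [],
};

function canTransition(from: InvoiceStatus, to: InvoiceStatus): boolean {
  return NEXT_STATUS[from].indexOf(to) !== -1;
}

// Assumed permission table covering only the roles named in the text.
const PERMISSIONS: { [R in Role]: Action[] } = {
  viewer: ["read"],
  owner: ["read", "create", "delete"],
};

function isAllowed(role: Role, action: Action): boolean {
  return PERMISSIONS[role].indexOf(action) !== -1;
}

interface Invoice {
  id: string;
  orgId: string;
  status: InvoiceStatus;
}

// Org scoping: a lookup only ever sees rows of the caller's organization.
function findInvoice(rows: Invoice[], callerOrgId: string, id: string): Invoice | undefined {
  const matches = rows.filter((row) => row.orgId === callerOrgId && row.id === id);
  return matches.length > 0 ? matches[0] : undefined;
}
```

In an integration test the same expectations would typically surface as status codes, for example a request from org B for org A's invoice returning a not-found or forbidden response rather than the record.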

4.3 E2E Tests (Playwright)

| Attribute | Value |
| --- | --- |
| Scope | 4 critical user journeys through deployed application |
| External dependencies | Real (staging environment or production) |
| Coverage target | 4 critical journeys + 8 sub-scenarios |
| Execution time | < 8 minutes |
| Runs on | Post-staging deploy, pre-production gate |
| Written by | Developer + QA collaboration |

Critical journeys:

  1. Invoice Flow: Create (6-step wizard) → Preview → Send → Mark Paid
  2. Expense Flow: Add → Upload Receipt → Approve → Pay
  3. Report Flow: Generate P&L → Export PDF
  4. Auth Flow: Register → Login → 2FA → Logout

5. Test Data Management

| Approach | Used For | Tool | Cleanup |
| --- | --- | --- | --- |
| Test factories | Unit + integration | apps/api/src/test/factories/ | Per test (reset in beforeEach) |
| Database seeding | E2E tests | packages/database/prisma/seed.ts | Per E2E run |
| PostgreSQL transactions | Integration tests | Prisma $transaction rollback | Per test |

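Isolation rule: beforeEach in integration tests clears all tables with Prisma deleteMany(). Because deleteMany() does not cascade on its own, tables are cleared child-first unless the schema's relations declare onDelete: Cascade.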

Test org pattern: Each integration test creates a fresh organization and user in the bilko_test database to prevent cross-test contamination.
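
A minimal sketch of the factory idea behind the test org pattern follows. The field names are assumptions and do not reflect Bilko's schema, and the objects are built in memory only; real factories under apps/api/src/test/factories/ would presumably persist them through Prisma.

```typescript
// Hypothetical factories; field names are assumptions, not Bilko's schema.
interface TestOrg {
  id: string;
  name: string;
  country: "RS" | "BA" | "HR";
}

interface TestUser {
  id: string;
  orgId: string;
  email: string;
  role: "viewer" | "owner";
}

// Monotonic counter so every generated identifier is unique within a run.
let sequence = 0;

function nextId(prefix: string): string {
  sequence += 1;
  return prefix + "-" + sequence;
}

// Sensible defaults, with per-test overrides for the fields a test cares about.
function buildOrg(overrides: Partial<TestOrg> = {}): TestOrg {
  const id = nextId("org");
  const defaults: TestOrg = { id: id, name: "Test Org " + id, country: "RS" };
  return { ...defaults, ...overrides };
}

function buildUser(org: TestOrg, overrides: Partial<TestUser> = {}): TestUser {
  const id = nextId("user");
  const defaults: TestUser = { id: id, orgId: org.id, email: id + "@example.test", role: "owner" };
  return { ...defaults, ...overrides };
}

// Each test builds its own org and user, so no two tests share identifiers.
function buildTestContext(): { org: TestOrg; user: TestUser } {
  const org = buildOrg();
  return { org: org, user: buildUser(org) };
}
```

A test that needs a restricted user would call something like buildUser(org, { role: "viewer" }) and leave every other field at its default.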


6. Coverage Requirements

| Layer | Lines | Branches | Functions | Enforcement |
| --- | --- | --- | --- | --- |
| Financial logic (VAT, double-entry, currency) | ≥ 95% | ≥ 90% | 100% | CI hard fail |
| Authentication utils | ≥ 95% | ≥ 90% | 100% | CI hard fail |
| API handlers | ≥ 80% | ≥ 75% | ≥ 80% | CI hard fail |
| Utilities | ≥ 90% | ≥ 85% | ≥ 90% | CI hard fail |
| Overall minimum | ≥ 80% | ≥ 75% | ≥ 80% | CI hard fail |

Coverage enforcement: Thresholds are set under coverage.thresholds in vitest.config.ts. The CI pipeline fails when any metric falls below its threshold.
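
As an illustration of how the table could translate into configuration, the fragment below sketches per-layer thresholds. It assumes a recent Vitest release in which the coverage provider is v8 and thresholds accept glob-pattern keys. The source paths are hypothetical placeholders, not the repository's real layout.

```typescript
// vitest.config.ts (sketch; the glob paths are hypothetical)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      provider: "v8",
      reporter: ["text", "lcov"],
      thresholds: {
        // Overall minimum (also covers API handlers, which share these values)
        lines: 80,
        branches: 75,
        functions: 80,
        // Stricter per-layer thresholds, keyed by glob pattern
        "src/financial/**": { lines: 95, branches: 90, functions: 100 },
        "src/auth/**": { lines: 95, branches: 90, functions: 100 },
        "src/utils/**": { lines: 90, branches: 85, functions: 90 },
      },
    },
  },
});
```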


7. Quality Gates

PR Merge Gate

  • All unit tests pass
  • All integration tests pass
  • Coverage ≥ minimum thresholds
  • Linting passes (ESLint + Prettier)
  • Type checking passes (TypeScript strict)
  • No new HIGH/CRITICAL security findings

Staging Deploy Gate

  • All PR gates passed
  • Build artifact created successfully

Production Deploy Gate

  • All E2E tests pass on staging
  • Performance has not degraded by more than 20% against the baseline
  • Manual approval in CI pipeline
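
Two of the gates above are numeric and can be expressed as small checks. The sketch below is one possible formulation, assuming coverage is reported as percentages and that the performance metric is latency-like, where a higher number is worse. The function names are hypothetical.

```typescript
interface CoverageSummary {
  lines: number;
  branches: number;
  functions: number;
}

// Overall minimums from the coverage requirements, in percent.
const MINIMUM: CoverageSummary = { lines: 80, branches: 75, functions: 80 };

// PR merge gate: every metric must meet or exceed its minimum.
function coverageGatePasses(actual: CoverageSummary, minimum: CoverageSummary = MINIMUM): boolean {
  return (
    actual.lines >= minimum.lines &&
    actual.branches >= minimum.branches &&
    actual.functions >= minimum.functions
  );
}

// Production gate: assumes a latency-style metric (e.g. p95 in ms), higher is worse.
// Passes when the relative slowdown is at most maxDegradation (0.2 = 20%).
function performanceGatePasses(
  baselineMs: number,
  currentMs: number,
  maxDegradation: number = 0.2
): boolean {
  if (baselineMs <= 0) {
    throw new RangeError("baseline must be positive");
  }
  return (currentMs - baselineMs) / baselineMs <= maxDegradation;
}
```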

8. Responsibility Matrix

| Test Type | Writes | Reviews | Maintains | Signs Off |
| --- | --- | --- | --- | --- |
| Unit tests | Developer | PR reviewer | Developer | Tech Lead |
| Integration tests | Developer | QA / Tech Lead | Developer | Tech Lead |
| E2E tests | Developer | Tech Lead | Developer | Tech Lead |
| Performance tests | DevOps | Tech Lead | DevOps | Alem Bašić |

9. Test Reporting & Metrics

| Metric | Target |
| --- | --- |
| Test pass rate | ≥ 99% unit, ≥ 95% E2E |
| Flaky test rate | < 2% |
| Full suite execution time | < 10 min |
| Coverage trend | Stable or improving per sprint |
| Financial logic coverage | ≥ 95% at all times |
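
The document does not define how pass rate and flaky rate are computed, so the sketch below picks one plausible definition purely for illustration: a test counts as flaky if it both passed and failed on the same commit. The TestRun record shape is hypothetical.

```typescript
// Hypothetical run record: one entry per test execution in CI.
interface TestRun {
  testId: string;
  commit: string;
  passed: boolean;
}

// Share of executions that passed (1 when there are no runs).
function passRate(runs: TestRun[]): number {
  if (runs.length === 0) {
    return 1;
  }
  return runs.filter((run) => run.passed).length / runs.length;
}

// Assumed definition: a test is flaky if, on the same commit,
// it has at least one passing and one failing execution.
function flakyRate(runs: TestRun[]): number {
  const outcomes: { [key: string]: { pass: boolean; fail: boolean } } = {};
  const flakyByTest: { [testId: string]: boolean } = {};
  runs.forEach((run) => {
    const key = run.testId + "@" + run.commit;
    const seen = outcomes[key] || (outcomes[key] = { pass: false, fail: false });
    if (run.passed) {
      seen.pass = true;
    } else {
      seen.fail = true;
    }
    if (!(run.testId in flakyByTest)) {
      flakyByTest[run.testId] = false;
    }
    if (seen.pass && seen.fail) {
      flakyByTest[run.testId] = true;
    }
  });
  const ids = Object.keys(flakyByTest);
  if (ids.length === 0) {
    return 0;
  }
  return ids.filter((id) => flakyByTest[id]).length / ids.length;
}
```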

10. Continuous Testing in CI/CD

| Stage | Tests Run | Blocking |
| --- | --- | --- |
| Pre-commit (local) | lint + type-check only | Recommended (Husky) |
| PR open/update | unit + integration + lint + type-check | Yes, blocks merge |
| Staging deploy | E2E (Playwright, 3 browsers) | Yes, blocks production |
| Production deploy | Smoke tests | Yes, auto-rollback on failure |
| Nightly (PLANNED) | Full E2E suite + performance | No, alerts only |


Approval

| Role | Name | Date | Signature |
| --- | --- | --- | --- |
| Author | Ops Architect | 2026-02-23 | |
| Reviewer | Tech Lead | | |
| Approver | Alem Bašić | | |