Skip to content

ADR-0001: Application stack and multi-tenancy approach

  • Status: accepted
  • Date: 2026-04-27
  • Decider(s): Theo (SA), with platform constraint from Atlas (PE)

Update note (2026-05-02 — see ADR-0004). The "Tenancy" section's bullet 5 below references Player and player_shares — both superseded by the Person/ClientProfile split and the three-grant sharing model in ADR-0004. The chokepoint mechanism (single SQLAlchemy session-event hook, ContextVar binding, loud-failure on unbound, bypass_tenant admin escape) is unchanged; only the predicate broadens. The body of this ADR is preserved verbatim — read ADR-0004 for the current Player → Person + ClientProfile predicate.

Context

RBO V2 needs a defensible stack pick at the application level. The platform-level decisions (host, ingress, DB engine, auth runtime) are owned by Atlas and locked in platform_design_proposal.md: Caddy ingress, self-hosted Postgres 16 with PgBouncer, self-hosted gotrue, db-per-app + db-per-env. RBO consumes those.

What's still open at the application level:

  1. Web framework + ORM + migration tool for the FastAPI-shaped Python app sized at 256–512 MB on Atlas's Compose stack.
  2. Multi-tenancy approach — RBO is multi-tenant from V2 day one (Topic 1 — Stefan's call), so we have to commit to a tenancy strategy now.

These decisions are co-decided because the ORM choice constrains the tenancy enforcement mechanism.

Options

Stack

  • Option A — FastAPI + SQLAlchemy 2.0 + Alembic + Jinja2. Type-hint-driven request validation; mature ORM with the session-event hook needed for the tenancy enforcement we want; default migration toolkit; server-rendered templates fit the form-and-list domain.
  • Option B — Flask + SQLAlchemy 1.x + Alembic + Jinja2. Simpler to bootstrap. Loses FastAPI's request validation; SQLAlchemy 1.x is going to be more painful to keep current.
  • Option C — Django. Strong batteries-included story (admin, ORM, auth, migrations). But: built-in auth conflicts with gotrue (we'd ignore it); built-in admin doesn't fit the multi-tenant catalogue moderation surface as cleanly as a hand-rolled page; Django's ORM is fine but the session-event-hook story for tenancy enforcement is less idiomatic.
  • Option D — Node/Express + Prisma OR Go + sqlc. No reason to switch ecosystems away from the V1 Python heritage. Stefan has Python smell on V1; the migration ETL is Python-shaped (openpyxl); the keystone platform's CSD/ALJ are Node — RBO being Python keeps language diversity intentional rather than accidental.

Multi-tenancy

  • Row-level tenancy with stringer_id FK on every tenant-scoped table, enforced at the ORM session layer (a single chokepoint). Visibility-scoped on catalogues; share-augmented on Players via player_shares.
  • Schema-per-tenant. Multiplies migration work. Postgres schemas aren't really designed for sub-tenants.
  • DB-per-tenant. Conflicts with Atlas's locked db-per-app policy. Operationally absurd at our 1–5 stringer scale.

Decision

Stack: FastAPI + SQLAlchemy 2.0 + Alembic + Jinja2 + WeasyPrint + Tailwind

Option A. Rationale:

  • Simplest defensible choice given Atlas's platform. Postgres 16 + Python container + JWT-from-gotrue → FastAPI is the lowest-friction binding.
  • Type-hint-driven validation removes a class of bugs that hand-rolled validation in Flask would re-introduce.
  • SQLAlchemy 2.0's do_orm_execute event hook is the literal mechanism we need for the tenancy chokepoint (next section). Django's ORM and Prisma don't expose a comparable single-hook surface as cleanly.
  • Jinja2 + Tailwind + HTMX covers the UI without an SPA framework — domain is form-and-list CRUD with one workflow ("copy last order"), not an interactive SPA.
  • WeasyPrint for receipts — locked separately in ADR-0002.

Tenancy: row-level, enforced at the ORM session layer

The chokepoint:

  1. Every tenant-scoped model declares a __tenant_column__ = "stringer_id" (or inherits from TenantBase).
  2. Request middleware binds current_stringer_id into a ContextVar after JWT validation.
  3. do_orm_execute event hook intercepts every query, appends AND <model>.stringer_id = :current_stringer_id, and refuses queries when current_stringer_id is unbound.
  4. Catalogue tables (Racket, String) get the additional (created_by_stringer_id = :me OR visibility = 'shared') filter layered on top.
  5. Player gets the additional (stringer_id = :me OR id IN (SELECT player_id FROM player_shares WHERE target_stringer_id = :me)) filter.
  6. Admin role can opt into a bypass_tenant=True session attribute; this is logged.

See docs/architecture/auth-and-tenancy.md for the full mechanics.

Consequences

Good

  • Single migration path, single backup, easy admin views — all benefits of row-level over DB-per-tenant survive.
  • stringer_id filtering is automatic; no developer can forget it without dropping out of the ORM.
  • Stack is boring, stable, well-documented; future agents pick it up in one read.
  • Schema design is portable enough that if Stefan ever wants to swap the tenancy strategy (e.g. to Postgres RLS), the application code doesn't change — only the chokepoint.

Costs we accept

  • stringer_id filtering must be enforced universally — every tenant-scoped table gets the FK NOT NULL, and every read goes through the chokepoint. The cross-tenant player_shares is the only deliberate breach. Admin's bypass_tenant is the only deliberate hole, and it's logged.
  • Risk: silent data leak if a query escapes the chokepoint. Mitigated by ADR-0003's integration test asserting every tenant-scoped table refuses unscoped reads.
  • FastAPI + SQLAlchemy is a hand-assembly stack vs. Django's bundled approach. Cost is some boilerplate; benefit is the precise tenancy hook.
  • Type-hint-driven Python is more rigorous but slightly slower to write than Flask's loose style. Acceptable.

Cross-references