Skip to content

ADR-0005: In-app notification model

  • Status: Accepted
  • Date: 2026-05-02
  • Decider(s): Theo (SA), with defaults pre-confirmed by Stefan and OQ-3 closed by Iris
  • Closes: #90

Context

OQ-3 — closed 2026-05-02 in the requirements log (entry CC-2026-05-02-2) — made the in-app notification inbox mandatory in V2. The exact wording: "in-app mandatory + email opt-in via Person.notification_prefs."

Several existing decisions reference this surface but cannot proceed without a schema:

  • ADR-0004 §Notification hooks — names two trigger points (originating-stringer notified when a Rule-3 grant exposes a new job; grantee/granter notifications on grant create/revoke) and explicitly defers delivery to V3 while leaving the V2 in-app surface as the immediate consumer of these triggers.
  • ADR-0007 §Notification hooks — names lifecycle-transition trigger points (strung, returned, paid) wanting in-app notifications.
  • NFR-8 + UC-7 — admin moderation queue with unread-badge counts for catalogue_submissions. Same surface.
  • v3-vision §C2 — the V3 dispatch pipeline derives state from the V2 notification rows; it must read the same table.

Without this ADR, Pax cannot scaffold the inbox / badge / per-trigger write hook; the chokepoint cannot apply a tenancy predicate to a non-existent table; and ADR-0004 / ADR-0007 references dangle.

Stefan's pre-baked defaults (carried in here as locked):

  • Dedicated notifications table with read-state.
  • No worker — write-on-event in the same transaction as the triggering write.
  • One row per (recipient, event); per-stringer-scoped via recipient_stringer_id.

The remaining knobs are decided here with documented defaults; the few that need Stefan-confirm are flagged in Open questions.

Options

Storage shape

  • (S-1) Dedicated notifications table per recipient. (chosen) Structured rows, queryable, read-state column. Aligns with OQ-3 close.
  • (S-2) Reuse share_audit + order_state_audit as the inbox. The audit tables capture what happened; an inbox is what the user has not yet seen. Read-state is per-recipient (the same event might be "unread" for one stringer and "read" for another). Storing read-state on the audit row pollutes the audit's append-only invariant. Rejected.
  • (S-3) JSONB blob per stringer. Worse — mutating one notification rewrites the whole blob; no query path; no concurrent safety. Rejected.

Dispatch model

  • (D-1) Write-on-event in the same transaction as the triggering write. (chosen, Stefan default) Atomic — the order is committed iff the notification is committed. No worker. PgBouncer-friendly (no LISTEN/NOTIFY, no advisory locks, no background process).
  • (D-2) Outbox pattern (write to outbox, async worker dispatches). Adds a worker process; introduces the worker's failure modes; not justified at V2 volume. Defer to V3 dispatcher.
  • (D-3) Postgres LISTEN/NOTIFY. Hostile to PgBouncer transaction-pooling. Rejected.

Recipient model

  • (R-1) Single recipient_stringer_id FK. (chosen for V2) All V2 notifications are for stringers (admin moderation, share-grant alerts, lifecycle alerts to the owning stringer). Simplest; aligns with the chokepoint's current_stringer_id.
  • (R-2) Polymorphic recipient (recipient_kind + recipient_id). Premature; no V2 trigger writes to a Person. V3 will add Person-recipient triggers (client-portal alerts) — the migration is one new column + one new enum value. Defer.
  • (R-3) Multi-recipient via a join table. Over-engineered for a V2 where every event has exactly one stringer recipient. Rejected.

Read-state shape

  • (RS-1) read_at timestamptz, nullable. (chosen) NULL = unread. One column carries both the boolean and the when-was-it-read. Cheap to query; cheap to update.
  • (RS-2) Separate read boolean. Loses the when-was-it-read information for free. Rejected.
  • (RS-3) Separate notification_reads join table. Premature for single-recipient model. Rejected.

Notification kinds

  • (K-1) Locked enum, exhaustive at write time. (chosen for V2) Ten or so kinds; unknown kinds are a bug, not an extension point. Adding a kind in V3 is one Alembic migration adding the enum value.
  • (K-2) Free-text kind string. Loses CHECK-constraint validation; opens the door to typos producing silent dropped notifications. Rejected.
  • (K-3) Polymorphic kind + per-kind table. Over-engineered. Rejected.

Decision

notifications schema

notifications
  id                    BIGSERIAL PRIMARY KEY
  recipient_stringer_id BIGINT NOT NULL REFERENCES stringers(id)
  kind                  notification_kind NOT NULL  -- enum, see below
  subject_kind          notification_subject_kind NOT NULL  -- enum
  subject_id            BIGINT NOT NULL  -- FK by convention to whatever subject_kind names
  actor_kind            notification_actor_kind NOT NULL  -- enum: stringer | person | system | admin
  actor_id              BIGINT NULL  -- nullable for system events
  payload               JSONB NOT NULL  -- kind-specific structured detail (small; bounded)
  dedup_key             TEXT NULL  -- for idempotent re-fires; unique partial index below
  read_at               TIMESTAMPTZ NULL  -- NULL = unread
  created_at            TIMESTAMPTZ NOT NULL DEFAULT now()
  -- V3 hand-over slot (NULL in V2; V3 dispatcher writes here)
  dispatch_attempted_at TIMESTAMPTZ NULL
  dispatch_succeeded_at TIMESTAMPTZ NULL
  dispatch_meta         JSONB NULL

Indexes:

  • notifications(recipient_stringer_id, read_at) WHERE read_at IS NULL — the unread-badge query and the inbox-unread filter both hit this. Partial index keeps it tiny.
  • notifications(recipient_stringer_id, created_at DESC) — supports the inbox listing query.
  • notifications(recipient_stringer_id, dedup_key) WHERE dedup_key IS NOT NULL — partial unique index for idempotent re-fires (see Idempotency).

Enums:

notification_kind (V2 closed set):

Kind Trigger Subject Source ADR / issue
order_strung Order's strung_at set Order ADR-0007
order_returned Order's returned_at set Order ADR-0007
order_paid Order's paid_at set Order ADR-0007
order_admin_overridden Admin bypass-edit on a sealed order Order ADR-0007
share_granted_to_me Someone grants me a Rule-#1 / #2 / #3 share OrderShare or PersonStringerShare ADR-0004
share_revoked_from_me Grant I held is revoked OrderShare or PersonStringerShare ADR-0004
my_order_visible_to_third_party Rule-#3 grant exposes one of my new orders to a third stringer Order ADR-0004
catalogue_submission_pending New CatalogueSubmission enters pending (admin recipient) CatalogueSubmission NFR-8 / UC-7 / integrations.md
catalogue_submission_decided Admin promotes/rejects my submission (stringer recipient) CatalogueSubmission NFR-8 / UC-7
receipt_email_failed Resend send failed; in-app surface so stringer notices Order integrations.md + #44

notification_subject_kind: order | order_share | person_stringer_share | catalogue_submission.

notification_actor_kind: stringer | person | system | admin.

Tenancy: notifications is per-stringer-scoped via recipient_stringer_id. Per auth-and-tenancy.md, the chokepoint applies the predicate recipient_stringer_id = current_stringer_id automatically. No grant-based cross-tenant access — a notification is read by exactly one stringer. (V3 client-portal Person-recipients will add a parallel recipient_person_id column with a CHECK that exactly one of the two recipient FKs is non-null.)

Writes: the chokepoint refuses any UPDATE on a notification where recipient_stringer_id != current_stringer_id. The only legitimate UPDATE is "mark as read" — narrow allow-list (see Inbox API below).

Payload size: soft-cap 4KB per row (warn in CI if templates produce larger). Anything bigger should be a join to the subject table at render time, not embedded.

Inbox query (V2 reference)

-- Inbox listing — most-recent N, unread first then recently-read
SELECT id, kind, subject_kind, subject_id, actor_kind, actor_id,
       payload, read_at, created_at
  FROM notifications
 WHERE recipient_stringer_id = :me
 ORDER BY (read_at IS NULL) DESC, created_at DESC
 LIMIT 50;

Pagination is offset-based for V2 (50/page; the inbox is small). If a stringer ever has > a few thousand notifications, switch to keyset (WHERE created_at < :cursor). Defer.

Unread-badge query

SELECT count(*) AS unread
  FROM notifications
 WHERE recipient_stringer_id = :me
   AND read_at IS NULL;

Hits the partial index (recipient_stringer_id, read_at) WHERE read_at IS NULL — single index lookup; sub-millisecond at any realistic V2 volume. Performance cap: if a stringer's unread count exceeds 999, the badge displays 999+ and the count query is replaced by SELECT 1 FROM ... LIMIT 1000. Avoids unbounded scans.

Mark-as-read API

Two endpoints (the only legitimate UPDATE paths):

  • POST /notifications/{id}/read — sets read_at = now() if NULL, no-op if already set.
  • POST /notifications/read-allUPDATE ... SET read_at = now() WHERE recipient_stringer_id = :me AND read_at IS NULL (single statement, chokepoint-applied).

Marking-unread is not supported in V2 — keep the operation set minimal.

Write-on-event hook surface

Notifications are written in the same transaction as the triggering domain write. Three integration points, all SQLAlchemy session-event-driven:

  1. Order lifecycle hooks (per ADR-0007):
  2. after_flush on Order writes the order_strung / order_returned / order_paid / order_admin_overridden notification with recipient_stringer_id = order.stringer_id. Same transaction as the order commit.
  3. For my_order_visible_to_third_party (Rule-#3 trigger from ADR-0004): the same after_flush queries person_stringer_share for active rows targeting order.client_profile.person_id and writes one notification per match, recipient = order.stringer_id (the originating stringer who needs to know their new job is visible to a third party).

  4. Share lifecycle hooks (per ADR-0004):

  5. after_flush on OrderShare / PersonStringerShare writes share_granted_to_me (recipient = grantee_stringer_id / target_stringer_id).
  6. On UPDATE setting revoked_at, writes share_revoked_from_me (same recipient).

  7. Catalogue moderation hooks (per NFR-8 / UC-7):

  8. after_flush on CatalogueSubmission INSERT (status = 'pending') writes catalogue_submission_pending to every admin (V2: only Stefan; query stringers WHERE role = 'admin').
  9. On admin decision UPDATE, writes catalogue_submission_decided to submitted_by_stringer_id.

  10. Email-fail hook (per integrations.md + #44):

  11. On a Resend SMTP failure during receipt send, the EmailSender writes a receipt_email_failed notification with recipient_stringer_id = order.stringer_id. The receipt itself is still re-renderable; this notification surfaces the email-only failure.

Single-transaction guarantee: all four hooks fire in after_flush (or, for the email-fail case, in the same SQLAlchemy session as the failed-send recording). If the surrounding commit aborts, the notification row is rolled back too. No worker, no outbox, no retry — by design at V2 scale.

Notification of the actor themselves: suppressed by default. Stringer A who sets paid_at on their own order is not notified that they did it. Catalogue submission's _decided event is sent to the original submitter (the actor is the admin, not the recipient — different).

Idempotency (dedup_key)

Some triggers can re-fire for the same logical event (e.g. an after_flush hook running on a transaction that retries due to a concurrency error). The dedup_key column + partial unique index makes the write idempotent.

Convention for dedup_key:

  • order_strung:{order_id} — unique per order.
  • order_paid:{order_id} — unique per order; if PAID is unlocked + re-set, a new row is desired (different read-state semantics) — so the dedup key embeds an attempt suffix: order_paid:{order_id}:{paid_at_epoch}. Each distinct paid_at value gets its own row.
  • share_granted_to_me:{order_share_id} — unique per grant.
  • catalogue_submission_pending:{submission_id}:{admin_recipient_id} — unique per (submission, admin) pair.
  • receipt_email_failed:{order_id}:{strung_at_epoch} — unique per emit attempt.

The INSERT ... ON CONFLICT (recipient_stringer_id, dedup_key) WHERE dedup_key IS NOT NULL DO NOTHING pattern makes the write a safe no-op on re-fire.

For events where dedup is meaningless (e.g. one-off admin override), dedup_key is left NULL.

V3 dispatcher hand-over

The V3 multi-channel dispatcher (per v3-vision §C2) reads from the same notifications table:

  • Polls for rows where dispatch_attempted_at IS NULL AND the recipient's per-channel notification_prefs opts in.
  • For each row, attempts the configured channel(s), records the outcome in dispatch_attempted_at / dispatch_succeeded_at / dispatch_meta.
  • Honors the recipient's per-kind opt-out (notification_prefs.kinds.{kind} = false → mark dispatch_attempted_at = now() with dispatch_meta = {skipped: 'opted_out'}).

The V3 dispatcher is a separate process (likely RQ/arq); ADR-0005 commits only the V2 schema + the V3-ready columns (dispatch_*). V3's dispatcher ADR will spec the worker, retry, and channel-specific delivery. The V2 in-app inbox does not depend on those columns — it serves rows directly from notifications regardless of dispatch state.

No coupling V2 → V3: if V3 never ships, the V2 inbox keeps working forever. The dispatch_* columns sit NULL.

Retention / archival

  • Read notifications: kept for 90 days after read_at, then hard-deleted by a daily cron-style job (small DELETE; runs in-process at app boot, since there's no worker). Default; documented as configurable via RBO_NOTIFICATIONS_READ_RETENTION_DAYS=90.
  • Unread notifications: never auto-deleted. If the inbox grows unboundedly, the recipient is the problem (and the unread-badge cap surfaces it).
  • No soft-delete. A read-then-deleted notification is gone; the audit trail of what happened lives in share_audit / order_state_audit, not in notifications. The two surfaces are intentionally distinct: audit = forensic, notifications = inbox.

Failure modes

  • Notification write fails inside the triggering transaction. The whole transaction rolls back. The triggering domain write (e.g. strung_at set) is also rolled back. The user sees a 500; the order is unchanged. Acceptable: the "the order is strung but I never got the alert" silent-failure is a worse UX than a loud retry.
  • Notification table is hot (concurrent inserts on the same recipient_stringer_id). Postgres handles row-level concurrency; the partial unique index on dedup_key serializes any genuine duplicate attempt. At V2 volume (~hundreds of rows/year per stringer) contention is irrelevant.
  • The 4KB payload soft-cap is exceeded. CI gates on a render-the-template test that asserts size; production write that exceeds it is logged as a warning but accepted.

Required tests (this ADR mandates them)

  1. Schema reflection. Assert the notifications table has the documented columns + enums + partial indexes.
  2. Write-on-event atomicity. A failing INSERT INTO notifications rolls back the triggering Order commit. Verified by mocking a CHECK violation on the notification row.
  3. Tenancy. A stringer fetching /notifications sees only rows with recipient_stringer_id = current_stringer_id. The chokepoint test (per ADR-0004 test #1) is restated against notifications.
  4. Mark-as-read. POST /notifications/{id}/read sets read_at; second call is a no-op (still read_at from first call). POST /notifications/read-all clears all unread for current_stringer_id.
  5. Idempotency. Inserting two rows with the same (recipient_stringer_id, dedup_key) returns one row (ON CONFLICT DO NOTHING).
  6. Unread-badge cap. With 1500 unread rows, the badge query returns 999+ (or returns 1000 + the UI clamps); query plan still uses the partial index.
  7. Lifecycle trigger writes. Setting strung_at on an Order writes exactly one order_strung row; setting paid_at writes exactly one order_paid; admin override writes one order_admin_overridden. Per ADR-0007.
  8. Share trigger writes. INSERT on OrderShare writes share_granted_to_me to grantee; UPDATE setting revoked_at writes share_revoked_from_me to grantee. Per ADR-0004.
  9. Rule-#3 fan-out. INSERT on Order whose ClientProfile.person_id has an active person_stringer_share writes my_order_visible_to_third_party to order.stringer_id for each active grant.
  10. Self-action suppression. Stringer A's setting strung_at on their own order does NOT produce a notification to Stringer A. (The order_state_audit row is still written; that's separate.)
  11. Retention. A notification with read_at = now() - 91 days is hard-deleted by the next retention pass.
  12. V3-ready columns are NULL by default. New rows have dispatch_attempted_at = NULL etc. — no V2 code writes them.

Consequences

Good

  • One table, one query, one chokepoint. The inbox is a single SELECT against a partial-indexed table. The badge is a single count(). Both stay sub-ms at V2 scale.
  • Atomic with the triggering write. "The order is strung but I never got the alert" is structurally impossible — they commit together or both roll back.
  • PgBouncer-friendly. No worker, no LISTEN/NOTIFY, no advisory locks. Plain INSERT in the request transaction.
  • V3 dispatcher slot is a column-add, not a re-architecture. The V3 worker reads the same table; V2 keeps working unchanged.
  • OQ-3 is fully realized: in-app surface is mandatory and live in V2; per-channel email opt-in is honored by the V3 dispatcher reading Person.notification_prefs. The V2 in-app surface does not need to consult notification_prefs (in-app is the unconditional fallback).
  • Read-state lives on the row — clean, queryable, no separate join table.
  • Idempotency via dedup_key — re-fires don't duplicate. The paid_at-set-twice case is handled (epoch-suffixed key).
  • Each ADR-0004 / ADR-0007 trigger has a documented write site. No dangling references.

Costs we accept

  • Synchronous write fattens the request transaction. A "string this racket" POST now writes Order + audit + 1+ notifications in one commit. At V2 scale this is microseconds; at V3 client-portal scale (Rule-#3 fan-out for a Person granting to many stringers) the fan-out could be larger — but that's still bounded by the number of stringers a Person grants to (small number).
  • Notification write failure aborts the domain write. Loud, but means a transient DB hiccup during notification INSERT can fail an order save. Acceptable; the alternative (silent-skip) is worse.
  • Per-stringer table will grow at ~hundreds of rows/year per stringer + the retention cleanup. At 5 stringers × 5 years × ~500 notifications/yr × 90-day retention = small. The partial indexes keep query cost bounded by unread count, not total.
  • payload JSONB is loosely typed. Mitigated by a per-kind Pydantic schema for write-time validation (the writer enforces; the reader trusts). Adds discipline, not infrastructure.
  • No multi-recipient notifications in V2. The catalogue-submission-pending fan-out writes one row per admin. At V2's one-admin reality this is one row; at multi-admin (V2.x) it's still cheap.
  • Hard-delete on retention loses the inbox-history. Acceptable: audit lives in share_audit / order_state_audit, not here.

Open questions (Stefan-confirm — all defaulted)

  1. Read-retention window — defaulted to 90 days. Could be 30 (faster reclaim) or 365 (longer history). 90 is the comfortable default for "I want to scroll back and remember last quarter's grants."
  2. Unread-badge cap — defaulted to 999+. Could be 99+ (more compact) or unlimited (heavy). 999 is a generous cap that almost no V2 stringer will hit.
  3. Self-action suppression — defaulted to on. Could be off (every action notifies the actor, with a was_self_action flag for the UI to render differently). On is the standard inbox UX.
  4. Inbox sort order — defaulted to unread-first then created_at DESC. Could be pure created_at DESC. Unread-first matches the "what needs my attention" mental model.
  5. notifications page size — defaulted to 50. UX-influenced; flagging because it changes the listing query shape if Stefan wants infinite scroll.

All five default to the values above.

Cross-references