
ADR-0004a: R-FADP-4 — Consent-by-proxy provenance on Person

  • Status: Proposed
  • Date: 2026-05-06
  • Decider(s): Theo (SA); subject to Iris override on review
  • Amends: ADR-0004
  • Closes: #123

In-place vs. amendment file. This amendment lives in a separate file to avoid a merge conflict with Iris's parallel work on fadp-implementation-asks.md. When Round 5 lands and the dust settles, a future cleanup MR may roll this into ADR-0004 directly; until then, ADR-0004 + ADR-0004a are read as one document.

Context

ADR-0004 split client identity into Person (platform-level) + ClientProfile (per-stringer). It documented the Stringer-created draft Person flow: a stringer enters a new client without an email, a Person row is created, and the row may sit unclaimed for months until the human magic-links in (V3 portal).

Iris's fadp-implementation-asks.md § R-FADP-4 requires every Person row to carry provenance — who created the row and via what flow — to back the consent-by-proxy legal basis under FADP Art. 19–20 (transparency about how data was collected). For a draft Person created by a stringer on behalf of a not-yet-onboarded client, the legal basis at creation is "consent-by-proxy under legitimate interest" (per fadp-posture.md § Stringer-created draft Persons). The provenance — which stringer, when — is the audit trail backing that legal basis.

Without provenance: a Person who later asks "how did you get my data?" gets a non-answer; the V3 magic-link "we already had a record for you, created by Stringer X on <date>" ratification screen (A-CONS-2) cannot be rendered.

Iris flagged the choice between two shapes for Theo:

  • Option A. Enum Person.created_by_kind + scalar Person.created_by_id (FK by convention; nullable for kinds with no FK target).
  • Option B. Polymorphic with explicit per-kind FK columns (created_by_stringer_id, created_by_import_batch_id, …).

Iris's acceptance criteria (A-FADP-4.1 through A-FADP-4.6) cover four created_by_kind values: stringer, self (V3 portal magic-link self-create), migration (Vera's M15 ETL), and the implicit system (placeholder for future system-emitted Persons; not used in V2). Provenance is immutable post-creation; survives stringer offboarding (the FK resolves to a [scrubbed] display name but the chain is intact); and is not cleared by Person scrub (it's metadata about the platform's relationship with the Person, not PII).

Options

Schema shape

  • (SA-A) Enum + FK-by-convention. created_by_kind enum (stringer | self | migration | system); created_by_id UUID nullable (FK by convention to stringers.id when kind=stringer, to persons.id when kind=self, NULL when kind=migration or kind=system). Two columns total. The FK-by-convention pattern is already used in this codebase: share_audit.actor_kind + share_audit.actor_id and share_audit.target_kind + share_audit.target_id (per ADR-0004 §"Audit") use the same shape; order_shares.granter_kind + granter_id is the closely-related polymorphic-FK case (per ADR-0004 §"order_shares").
  • (SA-B) Polymorphic with per-kind FKs. created_by_stringer_id (FK NULLABLE), created_by_self_person_id (FK NULLABLE), created_by_import_batch_id (FK NULLABLE if a future import_batches table lands; else NULL+CHECK). Plus a discriminator (created_by_kind) to disambiguate. Three or four columns; per-kind referential integrity is enforced by Postgres FKs; CHECK constraint ensures exactly one created_by_*_id is non-null.

Where to store

  • (WS-1) Columns on Person (chosen for both options). Provenance is a per-row property; the column-on-row pattern is consistent with Person.deleted_at, Person.scrubbed_at, Person.email_verified_at. One read, no JOIN.
  • (WS-2) Surface via share_audit event-kind only (Iris's alternative — "either two columns OR a share_audit row"). Avoids new columns; provenance is reconstructable via share_audit lookup. Costs one query per Person to reconstruct provenance; loses the per-row read affordance for the DSAR generator and the merge-tool surface.

Decision

Pick Option A (created_by_kind enum + created_by_id FK-by-convention) on Person.

Schema delta

Add two columns to persons:

| Column | Type | Constraints | Notes |
| --- | --- | --- | --- |
| created_by_kind | created_by_kind enum | NOT NULL | Enum values: stringer, self, migration, system. New Postgres enum type owned by this migration. |
| created_by_id | uuid | NULL allowed | FK by convention. Resolves to stringers.id when kind=stringer; to persons.id (the row's own id, post-INSERT) when kind=self; NULL when kind=migration (no per-row import-batch ID in V2, because Vera's M15 ETL is one batch) or kind=system (no actor). |

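No FK constraint at the DB level, the same convention as share_audit.actor_id (per ADR-0004 §"Audit"). Justification: the FK target depends on created_by_kind, and Postgres does not support polymorphic FKs natively. Two checks stand in for it: the CHECK constraint below enforces that created_by_id is NULL or non-NULL as the kind requires, and an application-level write-time check enforces that a non-NULL created_by_id resolves to an existing row of the right table.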
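A minimal sketch of what the application-level write-time check could look like. The function name check_provenance, its parameters, and the in-memory set of stringer ids are hypothetical stand-ins; a real implementation would query stringers instead of receiving a set.

```python
from uuid import uuid4

# Kinds that carry no actor and therefore must store created_by_id = NULL.
NULL_ID_KINDS = {"migration", "system"}


def check_provenance(kind, created_by_id, own_id, stringer_ids):
    """Raise ValueError unless (kind, created_by_id) is consistent.

    own_id is the id of the Person row being written; stringer_ids is a
    hypothetical stand-in for a lookup against the stringers table.
    """
    if kind == "stringer":
        if created_by_id is None or created_by_id not in stringer_ids:
            raise ValueError("kind=stringer requires an existing stringers.id")
    elif kind == "self":
        if created_by_id is None or created_by_id != own_id:
            raise ValueError("kind=self requires the row's own persons.id")
    elif kind in NULL_ID_KINDS:
        if created_by_id is not None:
            raise ValueError(f"kind={kind} requires created_by_id to be NULL")
    else:
        raise ValueError(f"unknown created_by_kind: {kind!r}")


# Usage: consistent tuples pass silently, inconsistent ones raise.
stringer_id, person_id = uuid4(), uuid4()
check_provenance("stringer", stringer_id, person_id, {stringer_id})
check_provenance("self", person_id, person_id, {stringer_id})
check_provenance("migration", None, person_id, {stringer_id})
```

This complements rather than replaces the DB-level CHECK: the CHECK can only see NULL-ness, while this function also verifies that the id points at the right target.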

CHECK constraint (kind ↔ id consistency):

CHECK (
  (created_by_kind = 'stringer' AND created_by_id IS NOT NULL)
  OR (created_by_kind = 'self' AND created_by_id IS NOT NULL)
  OR (created_by_kind = 'migration' AND created_by_id IS NULL)
  OR (created_by_kind = 'system' AND created_by_id IS NULL)
)
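The behaviour of this CHECK can be exercised without a Postgres instance. The sketch below uses SQLite from the Python standard library as a stand-in, which is an assumption made for illustration only: SQLite has no enum type, so the enum is emulated with a second CHECK on a TEXT column, and ids are stored as text. The helper try_insert is hypothetical.

```python
import sqlite3
import uuid

# SQLite stand-in for the persons delta. The IN (...) CHECK emulates the
# Postgres enum; the table-level CHECK is the one from this ADR, verbatim.
DDL = """
CREATE TABLE persons (
    id TEXT PRIMARY KEY,
    created_by_kind TEXT NOT NULL
        CHECK (created_by_kind IN ('stringer', 'self', 'migration', 'system')),
    created_by_id TEXT,
    CHECK (
      (created_by_kind = 'stringer' AND created_by_id IS NOT NULL)
      OR (created_by_kind = 'self' AND created_by_id IS NOT NULL)
      OR (created_by_kind = 'migration' AND created_by_id IS NULL)
      OR (created_by_kind = 'system' AND created_by_id IS NULL)
    )
)
"""


def try_insert(conn, kind, created_by_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO persons (id, created_by_kind, created_by_id) "
                "VALUES (?, ?, ?)",
                (str(uuid.uuid4()), kind, created_by_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(DDL)

accepted = try_insert(conn, "stringer", str(uuid.uuid4()))  # id present
rejected = try_insert(conn, "stringer", None)               # id missing
```

Note that the constraint only checks NULL-ness per kind; it says nothing about whether a non-NULL id resolves to a real row.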

Index: none required for V2. Provenance lookup is per-row (SELECT created_by_kind, created_by_id FROM persons WHERE id = ?) and is already served by the primary key. A future "all Persons created by Stringer X" admin query would benefit from a partial index on (created_by_kind, created_by_id); defer until that query exists.

Immutability is enforced at the ORM layer, not at the DB layer. A SQLAlchemy validates() decorator on Person.created_by_kind and Person.created_by_id raises if either is set after the row's INSERT (__init__ is the only path that may set them; the post-INSERT self-reference pass for kind=self, described under Population rules, is the one write this guard has to permit). This matches A-FADP-4.5 ("Provenance is immutable post-creation").
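The write-once rule can be illustrated in plain Python. This is a sketch of the idea only, not the SQLAlchemy implementation: it uses __setattr__ on an ordinary class rather than the validates() decorator, and the class body and the ImmutableProvenanceError name are hypothetical.

```python
class ImmutableProvenanceError(AttributeError):
    """Raised when a provenance field is reassigned after construction."""


class Person:
    # Fields that may be assigned exactly once, during __init__.
    _WRITE_ONCE = frozenset({"created_by_kind", "created_by_id"})

    def __init__(self, created_by_kind, created_by_id=None):
        self.created_by_kind = created_by_kind
        self.created_by_id = created_by_id
        self.display_name = None  # an ordinary, mutable field

    def __setattr__(self, name, value):
        # A write-once field is accepted only while it is still absent from
        # the instance, which is true only for the first assignment.
        if name in self._WRITE_ONCE and name in self.__dict__:
            raise ImmutableProvenanceError(f"{name} is immutable post-creation")
        super().__setattr__(name, value)


person = Person("stringer", "hypothetical-stringer-id")
person.display_name = "editable"       # ordinary fields stay mutable
try:
    person.created_by_kind = "self"    # second assignment is refused
    reassigned = True
except ImmutableProvenanceError:
    reassigned = False
```

A guard at this layer does not stop a direct SQL UPDATE, which is why the rule is described as ORM-level.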

Why Option A over Option B

Two reasons, both about consistency.

First, the codebase already commits to FK-by-convention for polymorphic-actor cases: share_audit.actor_kind + actor_id, share_audit.target_kind + target_id, and order_shares.granter_kind + granter_id (the last with a CHECK enforcing that exactly one of two FK columns is non-null, a hybrid of (SA-A) and (SA-B)). Adding a fourth pattern (per-kind FK columns on Person) for the same shape would be a deliberate divergence from the convention this codebase has already accepted in three places. What the convention gives up is exactly what (SA-B) would buy, a Postgres-enforced FK, and we already accept that loss in share_audit at scale.

Second, per-kind FK columns scale poorly with future provenance kinds. A system Person (a future "platform created this row in response to event X") has no FK target. A future api_token provenance kind (a future API client creating a Person) would need yet another FK column. The two-column shape absorbs every future kind by adding an enum value and a branch in the CHECK; the per-kind shape needs a new column, a new FK, and a rewritten exactly-one CHECK per kind. At this domain's evolution rate (provenance kinds are added every few quarters, not every release), the two-column shape is the durable answer.

The trade-off — losing DB-level referential integrity on created_by_id — is acceptable because (a) created_by_* is immutable post-creation, so the FK can only break via a stringer hard-delete (which is forbidden under the stringer-tombstone rule per stringer-lifecycle.md § Cascade), and (b) the test suite enforces the CHECK constraint shape and the application-level write-time consistency.

Population rules (per A-FADP-4.2 through A-FADP-4.4)

| Creation flow | created_by_kind | created_by_id |
| --- | --- | --- |
| Stringer adds a client (with or without email) | stringer | <stringer.id> of the creating stringer |
| V3 portal self-create via magic-link (no prior draft) | self | <person.id> of the row itself (set in a flush() post-INSERT pass) |
| Self-Profile lazy-create (is_self_for_stringer = TRUE, per stringer-lifecycle.md § Self-Profile slot) | self | <person.id> of the row itself |
| Vera's M15 ETL | migration | NULL |
| Future system-emitted Person | system | NULL |
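The population rules above can be sketched as one factory function. The flow names, the NewPerson dataclass, and new_person are hypothetical. The sketch also assumes the primary key is a UUID generated before INSERT, so the kind=self self-reference is available immediately; that differs from the flush() post-INSERT pass named above, but yields the same (kind, id) tuple on the row. One reason to consider generating the id up front is that Postgres CHECK constraints are not deferrable, so a kind=self row inserted with a NULL created_by_id would be rejected by the kind ↔ id CHECK.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NewPerson:
    id: uuid.UUID
    created_by_kind: str
    created_by_id: Optional[uuid.UUID]


def new_person(flow: str, stringer_id: Optional[uuid.UUID] = None) -> NewPerson:
    """Build the provenance tuple for a creation flow (flow names are hypothetical)."""
    person_id = uuid.uuid4()  # assumption: id is generated client-side
    if flow == "stringer_adds_client":
        if stringer_id is None:
            raise ValueError("stringer flow requires the creating stringer's id")
        return NewPerson(person_id, "stringer", stringer_id)
    if flow in ("portal_self_create", "self_profile_lazy_create"):
        # Self-reference: the actor is the Person row itself.
        return NewPerson(person_id, "self", person_id)
    if flow == "m15_etl":
        return NewPerson(person_id, "migration", None)
    if flow == "system":
        return NewPerson(person_id, "system", None)
    raise ValueError(f"unknown creation flow: {flow!r}")


creating_stringer = uuid.uuid4()
draft = new_person("stringer_adds_client", stringer_id=creating_stringer)
portal = new_person("portal_self_create")
```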

Note on kind=self. The created_by_id = <person.id> pattern (a self-reference) is unusual but intentional: it matches A-FADP-4.3 verbatim and makes the DSAR response uniform ("created by <actor>" → for self-create, the actor is the Person itself, which is what the ratification screen says). The alternative (NULL for kind=self) would require the DSAR template to special-case the rendering; the self-reference is one read either way.

Migration impact (for Pax / Vera)

  • One Alembic migration adds the enum type, the two columns, and the CHECK constraint.
  • Backfill for existing V2 rows:
      • Vera's M15 ETL output: every row gets created_by_kind = migration, created_by_id = NULL. No backfill needed if the ETL writes the new columns at INSERT time (the migration must land before the ETL runs).
      • Stefan's pre-ETL self-Person and any test fixtures: created_by_kind = self, created_by_id = <self.id> (one-time UPDATE in the migration's data step, idempotent).
  • No backfill for share_audit. R-FADP-4 is independent of share_audit; no event-kind extension is required by this ADR. (Iris's A-CONS-1 had "OR surface via share_audit" as an alternative; we chose columns-on-Person, so the audit table stays as-is.)
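A sketch of the idempotent data step for the pre-existing self-Person rows, again using SQLite as a stand-in for Postgres. It assumes the columns are added as nullable first, backfilled, and only then tightened to NOT NULL plus the CHECK; that ordering is an assumption, not something this ADR specifies. The backfill helper and the sample ids are hypothetical.

```python
import sqlite3

# Only rows that have no provenance yet are touched, so a rerun is a no-op.
BACKFILL_SQL = """
UPDATE persons
SET created_by_kind = 'self', created_by_id = id
WHERE created_by_kind IS NULL
"""


def backfill(conn):
    """Run the data step and return the number of rows it changed."""
    with conn:
        return conn.execute(BACKFILL_SQL).rowcount


conn = sqlite3.connect(":memory:")
# Columns are nullable at this point; constraints would be added afterwards.
conn.execute(
    "CREATE TABLE persons ("
    "id TEXT PRIMARY KEY, created_by_kind TEXT, created_by_id TEXT)"
)
conn.executemany("INSERT INTO persons (id) VALUES (?)", [("p1",), ("p2",)])
conn.commit()

first_run = backfill(conn)   # touches both pre-existing rows
second_run = backfill(conn)  # nothing left to update
```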

Person merge interaction (per A-FADP-4.5)

Provenance is unchanged by a Person merge:

  • The surviving_person.created_by_* is unchanged (it reflects who created the surviving row, which is still true).
  • The merged_person.created_by_* is unchanged (the row is soft-marked merged, not deleted; its provenance is still queryable for forensics).
  • The merge itself is a separate share_audit row (event_kind = person_merge per Iris R-FADP-7) plus an admin_audit_log row per ADR-0011.

This is the correct shape: provenance is "who created this row?" — a fact about the row that does not change when the row is merged. The merge is a separate event in a separate log.

DSAR exposure (per A-FADP-4.1)

The DSAR response (per Iris's R-FADP-3) renders provenance human-readable:

  • kind=stringer → "Your record was created by <stringer.display_first_name> <stringer.display_last_name> on <created_at>."
  • kind=self → "You created your record on <created_at> via the magic-link signup."
  • kind=migration → "Your record was migrated from the V1 spreadsheet on <created_at>."
  • kind=system → "Your record was created by the platform on <created_at>."

Stringer offboarded? The display name resolves to "[scrubbed]" per stringer-lifecycle.md § Cascade; the audit chain is intact (the created_by_id still points at the tombstone row).
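A minimal sketch of the per-kind rendering, using the four strings listed above. The function render_provenance, the ISO date format, and the fallback to "[scrubbed]" when no display name is supplied are assumptions for illustration; how the display name actually resolves for an offboarded stringer is defined in stringer-lifecycle.md.

```python
from datetime import date

# One template per created_by_kind, taken from the list in this section.
TEMPLATES = {
    "stringer": "Your record was created by {name} on {created_at}.",
    "self": "You created your record on {created_at} via the magic-link signup.",
    "migration": "Your record was migrated from the V1 spreadsheet on {created_at}.",
    "system": "Your record was created by the platform on {created_at}.",
}


def render_provenance(kind, created_at, stringer_name=None):
    """Return the human-readable provenance sentence for a DSAR response."""
    if kind not in TEMPLATES:
        raise ValueError(f"unknown created_by_kind: {kind!r}")
    # Assumption: a missing display name means the stringer was scrubbed.
    name = stringer_name or "[scrubbed]"
    # str.format ignores keyword arguments a template does not use.
    return TEMPLATES[kind].format(name=name, created_at=created_at.isoformat())


sentence = render_provenance("stringer", date(2026, 5, 6), "Stringer X")
# → "Your record was created by Stringer X on 2026-05-06."
```

Keeping the templates in one mapping means a new kind cannot be added to the enum without the renderer failing loudly until a template exists.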

What this ADR does NOT cover

  • The share_audit event-kind extensions (R-FADP-7 — person_erasure, person_merge, consent_change) — owned by Iris's R-FADP-7 implementation; this ADR commits to not using share_audit for Person-creation provenance, so no extension is needed for R-FADP-4 itself.
  • The dsar_log table schema — owned by Iris's R-FADP-3.
  • The Person scrub field-list — owned by Iris's docs/requirements/fadp-posture.md. This ADR commits to "scrub does NOT clear created_by_*" (per A-FADP-4.6); the scrub field-list must reflect that.
  • The admin person-merge UX surface — owned by docs/design/admin-person-merge.md; this ADR confirms the design's reading of the provenance fields (it surfaces them in the side-by-side comparison).

Required tests (this ADR mandates them)

  1. CHECK constraint test. INSERT a Person with created_by_kind=stringer and created_by_id=NULL → DB rejects. Same for kind=migration with a non-NULL created_by_id.
  2. Population test (per A-FADP-4.2 / 4.3 / 4.4). Each Person-creation code path (stringer-adds-client; V3 portal magic-link; self-Profile lazy-create; M15 ETL) writes the documented (kind, id) tuple.
  3. Immutability test (per A-FADP-4.5). Attempting to UPDATE created_by_kind or created_by_id post-creation raises at the ORM layer.
  4. Scrub-survival test (per A-FADP-4.6). A scrubbed Person retains its created_by_* values (assert directly on the row post-scrub).
  5. Merge-survival test. Surviving and merged Persons both retain their original created_by_* post-merge.
  6. DSAR rendering test. Each kind value renders the documented human-readable string in the DSAR JSON payload.

Consequences

Good

  • Provenance is one read, on the row that needs it. No JOIN, no share_audit lookup, no special case for the V3 ratification screen.
  • FK-by-convention is the codebase's existing pattern. Three other places (share_audit.actor_*, share_audit.target_*, order_shares.granter_*) already use it. Convention-aligned.
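  • Future provenance kinds are an enum-value addition. No new column per kind; the only schema change is ALTER TYPE plus a CHECK update.
  • Immutability is enforced in one place. ORM-layer validates() makes the rule local and testable; the CHECK constraint separately catches DB-direct writes whose kind and id disagree, though it cannot detect a post-creation change to a still-consistent pair.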
  • DSAR generator is trivial. One per-kind template; same shape as Iris's other DSAR sections.

Costs we accept

  • DB-level FK constraint is absent on created_by_id. Same cost the codebase already pays in share_audit. Mitigated by immutability and by the stringer-tombstone rule (the FK target never disappears).
  • Self-reference for kind=self. One unusual but documented pattern. The DSAR generator handles it uniformly.
  • One enum type to maintain. A future kind addition is ALTER TYPE created_by_kind ADD VALUE 'X' plus a CHECK update. Cheap.
  • Provenance is metadata, not PII. Iris's A-FADP-4.6 commits us to NOT scrubbing the created_by_* fields on Person scrub. This is correct under FADP (provenance is about the platform, not the data subject) but worth flagging — a future legal review may push back, and we'd revisit then.

Alternatives considered (and why not)

  • Option B (per-kind FK columns). Adds DB-level FK at the cost of N columns per future kind. Loses convention-alignment with share_audit. Rejected.
  • Surface via share_audit event-kind only (WS-2). Loses the per-row read affordance; costs a JOIN on every DSAR generation; doesn't match the row-is-the-provenance mental model for the merge-tool surface. Rejected.
  • Single created_by_actor polymorphic JSONB column. Defeats the enum + FK-by-convention shape; opaque to indexed queries; harder to migrate. Rejected.
  • Use only created_at + a separate person_creation_events table. Adds a table for what is two columns. Over-engineered. Rejected.

Cross-references