Developer guide
How to extend the compliance module. For background read overview. For architectural shape read architecture.
Procedures
- Adding a new event type
- Audit-event tier semantics
- Cross-runtime emit path (Strands Python)
- Adding a new compliance Lambda
- Where the tests live
Adding a new event type
When to use this: A new platform action needs auditing, for example when adding to the SOC2 starter slate or wiring a Phase 6 policy event.
1. Append the dotted-lowercase name to COMPLIANCE_EVENT_TYPES in packages/database-pg/src/schema/compliance.ts. Convention: <domain>.<action> (e.g., agent.created, auth.signin.success). Keep the array order stable; append rather than insert.
2. Add a redaction allow-list entry in packages/api/src/lib/compliance/redaction.ts. The allow-list says which payload fields persist into the audit row verbatim; everything else is redacted to <REDACTED> at write time. Reviewer guidance: be conservative; fields you don’t list never reach the audit log.
3. Identify call sites that should emit. Each call site picks a tier:
   - Inside the originating db.transaction: control-evidence (audit failure rolls back the originating action).
   - Wrapped in try { await emitAuditEvent(...) } catch { logger.warn(...) }: telemetry (audit failure does not block the action). The await matters: without it, a rejected emit never reaches the catch block.

   See audit-event tier semantics below for the decision rule.
4. Add an integration test. The cross-cutting tests live in packages/api/test/integration/compliance-event-writers/ (e.g., cross-cutting.integration.test.ts). Each test fires the originating mutation, asserts the outbox row landed with the right event_type, and asserts the drainer wrote it to audit_events with a valid hash chain link.
5. Regenerate GraphQL codegen if the new type appears in the GraphQL ComplianceEventType enum (mirror in packages/database-pg/graphql/types/compliance.graphql):

       pnpm --filter @thinkwork/api codegen
       pnpm --filter @thinkwork/admin codegen

6. Verify the drift snapshot test still passes. packages/api/src/__tests__/compliance-event-type-drift.test.ts asserts the GraphQL enum values match the runtime COMPLIANCE_EVENT_TYPES slate exactly. Adding a new type without updating the GraphQL schema fails here.
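The first two steps can be sketched as follows. This is a minimal illustration, not the module's actual code: the allow-list shape, the `redactPayload` function, the `agent.archived` event type, and the payload field names are all assumptions made for the example. Only the `COMPLIANCE_EVENT_TYPES` name, the two example event types, and the `<REDACTED>` marker come from this guide.

```typescript
// Hypothetical stand-in for the slate in packages/database-pg/src/schema/compliance.ts.
const COMPLIANCE_EVENT_TYPES = [
  "agent.created",
  "auth.signin.success",
  "agent.archived", // hypothetical new type: appended at the end, never inserted mid-array
] as const;

type ComplianceEventType = (typeof COMPLIANCE_EVENT_TYPES)[number];

// Assumed allow-list shape: event type -> payload fields that persist verbatim.
// Typing it as a Record over the slate makes a missing entry a compile error.
const REDACTION_ALLOW_LIST: Record<ComplianceEventType, readonly string[]> = {
  "agent.created": ["agentId", "name"],
  "auth.signin.success": ["method"],
  "agent.archived": ["agentId"],
};

const REDACTED = "<REDACTED>";

// Keeps allow-listed fields as-is and replaces every other value with the marker.
function redactPayload(
  eventType: ComplianceEventType,
  payload: Record<string, unknown>,
): Record<string, unknown> {
  const allowed = new Set(REDACTION_ALLOW_LIST[eventType]);
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(payload)) {
    out[key] = allowed.has(key) ? value : REDACTED;
  }
  return out;
}

const redactedExample = redactPayload("agent.archived", {
  agentId: "agt_123",
  reason: "free text that may contain customer data",
});
```

In this sketch `redactedExample` keeps `agentId` and carries the marker in place of `reason`, which is the conservative default the reviewer guidance asks for.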
PR template: see #903 (U5) for an example of wiring emitAuditEvent at multiple call sites.
Audit-event tier semantics
The emitAuditEvent helper is tier-agnostic; the call site picks how to handle a write failure.
| Tier | Pattern | Failure mode | Use for |
|---|---|---|---|
| Control-evidence | await emitAuditEvent(tx, {...}) inside an existing db.transaction(async tx => {...}) | Audit write failure throws; the transaction rolls back; originating mutation fails with a user-visible error | Security-relevant events (auth, data export, agent CRUD, MCP CRUD, governance file edits) |
| Telemetry | try { await emitAuditEvent(db, {...}) } catch (err) { logger.warn({err}, "audit-emit-failed") } | Audit write failure logs + fires an operator alert; originating action proceeds | High-volume informational events that must not block production traffic |
Default to control-evidence for any event whose absence from the audit log would mean an auditor can’t reconstruct what happened. Drop to telemetry only when the action’s value to the user clearly outweighs the audit gap (e.g., a high-throughput read-only operation that would degrade UX if audit DB pressure causes the action to fail).
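The difference between the two tiers is easiest to see side by side. The sketch below uses hypothetical minimal types (`Writer`, `Db`, `makeDb`), a simplified `emitAuditEvent`, and an invented `agent.listed` event type; the real helper's signature is in packages/api/src/lib/compliance/emit.ts and may differ.

```typescript
// Hypothetical minimal shapes, for illustration only.
interface AuditEvent {
  eventType: string;
  payload: Record<string, unknown>;
}
interface Writer {
  insertOutbox(event: AuditEvent): Promise<void>;
}
interface Db extends Writer {
  transaction<T>(fn: (tx: Writer) => Promise<T>): Promise<T>;
}

// Simplified stand-in for the tier-agnostic helper: it only writes.
async function emitAuditEvent(writer: Writer, event: AuditEvent): Promise<void> {
  await writer.insertOutbox(event);
}

// Control-evidence: the emit runs on the transaction handle, so a failed
// audit write rejects the transaction callback and the mutation fails too.
async function createAgent(db: Db, name: string): Promise<string> {
  return db.transaction(async (tx) => {
    const agentId = `agt_${name}`; // the originating write would also use tx
    await emitAuditEvent(tx, { eventType: "agent.created", payload: { agentId } });
    return agentId;
  });
}

// Telemetry: the emit is awaited inside try/catch, so a failed audit write
// is logged and the action still returns its result.
async function listAgents(
  db: Db,
  warn: (msg: string, err: unknown) => void,
): Promise<string[]> {
  try {
    await emitAuditEvent(db, { eventType: "agent.listed", payload: {} });
  } catch (err) {
    warn("audit-emit-failed", err);
  }
  return ["agt_a", "agt_b"];
}

// Toy in-memory Db whose audit writes can be forced to fail. It has no real
// rollback; it only propagates the error the way a transaction would.
function makeDb(failAudit: boolean): Db {
  const insertOutbox = async (_event: AuditEvent): Promise<void> => {
    if (failAudit) throw new Error("audit DB unavailable");
  };
  return {
    insertOutbox,
    transaction<T>(fn: (tx: Writer) => Promise<T>): Promise<T> {
      return fn({ insertOutbox });
    },
  };
}
```

With `makeDb(true)`, `createAgent` rejects while `listAgents` still resolves and calls `warn` once. That is the whole behavioural difference a call site is choosing between.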
Master plan reference: R6 in docs/plans/2026-05-06-011-feat-compliance-audit-event-log-plan.md.
Helper signature + behavior: packages/api/src/lib/compliance/emit.ts.
Cross-runtime emit path (Strands Python)
The Python Strands runtime can’t share TypeScript code, so it emits via a REST round-trip into the graphql-http API. The path is wired so retries are idempotent.
Strands side:
- The container constructs a ComplianceClient at boot from THINKWORK_API_URL + API_AUTH_SECRET (resolved from Secrets Manager). Source: packages/agentcore-strands/agent-container/ (compliance_client.py).
- On a relevant runtime event, the client generates a fresh UUIDv7 event_id locally (deterministic-prefix timestamp + random suffix; see packages/agentcore-strands/agent-container/helper).
- The client POSTs /api/compliance/events with bearer API_AUTH_SECRET, body {event_id, tenant_id, actor_id, event_type, payload, ...}.
- Snapshot env at coroutine entry; never re-read os.environ mid-handler. This is a load-bearing rule per feedback_completion_callback_snapshot_pattern: env vars can be re-injected mid-process by Lambda’s warm-container update path, and a re-read mid-handler can pick up stale values.
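For readers unfamiliar with the UUIDv7 layout, the sketch below shows how a timestamp prefix and random suffix combine into an event_id. It is written in TypeScript only to match the other examples in this guide; the real client is Python, and the `uuidv7` function here is a hypothetical illustration of the RFC 9562 version 7 layout, not a port of the container's helper.

```typescript
import { randomBytes } from "node:crypto";

// Builds a UUIDv7 string: 48-bit Unix-millisecond timestamp, then version and
// variant bits, with the remaining bits random.
function uuidv7(nowMs: number = Date.now()): string {
  const bytes = randomBytes(16);
  // Bytes 0..5: big-endian millisecond timestamp (fits in 48 bits).
  bytes.writeUIntBE(nowMs, 0, 6);
  // Byte 6: high nibble is the version (7); low nibble stays random.
  bytes.writeUInt8((bytes.readUInt8(6) & 0x0f) | 0x70, 6);
  // Byte 8: top two bits are the RFC variant (binary 10); the rest stays random.
  bytes.writeUInt8((bytes.readUInt8(8) & 0x3f) | 0x80, 8);
  const hex = bytes.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}

// A fixed timestamp makes the prefix predictable; the suffix is still random.
const exampleEventId = uuidv7(0x0123456789ab);
```

Because the ID is generated once on the client and reused on every retry of the same event, the server can deduplicate on it without coordinating with the runtime.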
API side:
- The compliance-events REST handler (packages/api/src/handlers/compliance.ts) validates the bearer, validates the body schema, and INSERTs into audit_outbox.
- The outbox table has uq_audit_outbox_event_id (unique on event_id). A retry with the same event_id is a no-op at the DB layer: the second INSERT raises a unique-constraint violation that the handler catches and treats as success. This is the idempotency guarantee.
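The catch-and-treat-as-success behaviour can be sketched with an in-memory stand-in for the outbox table. Everything here is hypothetical (the `InMemoryOutbox` class, the `handleComplianceEvent` function, the returned status shape); the one real-world detail is that PostgreSQL reports unique-constraint violations with SQLSTATE 23505.

```typescript
interface OutboxRow {
  eventId: string;
  eventType: string;
}

// PostgreSQL SQLSTATE for unique_violation.
const PG_UNIQUE_VIOLATION = "23505";

// In-memory stand-in for audit_outbox with a unique constraint on eventId.
class InMemoryOutbox {
  private readonly rows = new Map<string, OutboxRow>();

  insert(row: OutboxRow): void {
    if (this.rows.has(row.eventId)) {
      // Mimic the error shape a Postgres driver would surface.
      throw Object.assign(new Error("duplicate key value"), { code: PG_UNIQUE_VIOLATION });
    }
    this.rows.set(row.eventId, row);
  }

  get size(): number {
    return this.rows.size;
  }
}

function isUniqueViolation(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === PG_UNIQUE_VIOLATION
  );
}

// Assumed handler core: a duplicate event_id counts as success, any other
// error propagates to the caller.
function handleComplianceEvent(
  outbox: InMemoryOutbox,
  body: OutboxRow,
): { status: number; duplicate: boolean } {
  try {
    outbox.insert(body);
    return { status: 200, duplicate: false };
  } catch (err) {
    if (isUniqueViolation(err)) return { status: 200, duplicate: true };
    throw err;
  }
}
```

Checking the error code rather than the message keeps the duplicate path narrow, so unrelated insert failures still surface as errors instead of being swallowed as retries.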
Drainer side:
- The single-writer drainer Lambda picks up the row in the next 5-second poll cycle, computes the chain hash, and INSERTs into audit_events. Strands events are indistinguishable from Yoga events at this point; they share the same chain.
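To make the chain hash concrete, here is a minimal hash-chain sketch. This guide does not specify the drainer's hash function, field set, or canonical encoding, so the SHA-256 choice, the all-zero genesis value, the JSON serialization, and every name below are assumptions chosen for illustration.

```typescript
import { createHash } from "node:crypto";

interface OutboxRow {
  eventId: string;
  eventType: string;
  payload: Record<string, unknown>;
}
interface AuditRow extends OutboxRow {
  prevHash: string;
  hash: string;
}

// Assumed starting value for an empty chain.
const GENESIS_HASH = "0".repeat(64);

// Assumed link function: SHA-256 over the previous hash plus a JSON encoding
// of the row. A real implementation needs a canonical payload encoding.
function chainHash(prevHash: string, row: OutboxRow): string {
  const encoded = JSON.stringify([row.eventId, row.eventType, row.payload]);
  return createHash("sha256").update(prevHash).update(encoded).digest("hex");
}

// Appends pending rows in order, linking each one to the hash before it.
function drain(chain: readonly AuditRow[], pending: readonly OutboxRow[]): AuditRow[] {
  const out = [...chain];
  let prevHash = GENESIS_HASH;
  for (const row of chain) prevHash = row.hash; // ends at the current tail
  for (const row of pending) {
    const hash = chainHash(prevHash, row);
    out.push({ ...row, prevHash, hash });
    prevHash = hash;
  }
  return out;
}

// Recomputes every link; any edited, removed, or reordered row breaks it.
function verifyChain(chain: readonly AuditRow[]): boolean {
  let prevHash = GENESIS_HASH;
  for (const row of chain) {
    if (row.prevHash !== prevHash || row.hash !== chainHash(prevHash, row)) return false;
    prevHash = row.hash;
  }
  return true;
}
```

The sketch also shows why a single writer matters: each link depends on the tail hash at the moment of the write, so two concurrent writers would both extend the same tail and fork the chain.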
Strands runtime emit shipped in U6 (#911).
Adding a new compliance Lambda
When to use this: A new background process is needed, e.g., a daily verifier scheduled run, an aggregation job, or a Phase 6 policy evaluator.
1. Lambda body at packages/lambda/compliance-<name>.ts:
   - Module-load env snapshot via getXxxEnv() helper (mirror packages/lambda/compliance-anchor.ts getAnchorEnv).
   - Lazy clients (S3 / Secrets Manager / pg) cached at module scope; reset on connection error.
   - For SQS-triggered Lambdas: implement the ReportBatchItemFailures partial-failure protocol; CAS-guard on the row state to make re-deliveries no-ops.
   - For scheduled Lambdas: use ReservedConcurrentExecutions=1 if the work is single-writer-sensitive (drainer, anchor).
2. Build entry in scripts/build-lambdas.sh:

       build_handler "compliance-<name>" \
         "$REPO_ROOT/packages/lambda/compliance-<name>.ts"

   If the Lambda imports @aws-sdk/lib-storage, @aws-sdk/s3-request-presigner, @aws-sdk/client-bedrock-agentcore, or other clients not in the Lambda runtime SDK, add the handler name to the conditional that flips the build to BUNDLED_AGENTCORE_ESBUILD_FLAGS; otherwise the import will fail at cold start.
3. Terraform handler resource in terraform/modules/app/lambda-api/handlers.tf:
   - Standalone resource (NOT in the for_each pool) when the Lambda needs a per-key IAM role / env / source_code_hash. The compliance-anchor and compliance-export-runner Lambdas use this pattern, which isolates blast radius from the 60+ shared handlers.
   - for_each pool when the Lambda fits the shared aws_iam_role.lambda permissions and needs no special env. The drainer used this pattern in U4.
4. Post-deploy smoke at packages/api/src/__smoke__/compliance-<name>-smoke.ts + scripts/post-deploy-smoke-compliance-<name>.sh. Pin on dispatch status in the response payload, not on log filtering (feedback_smoke_pin_dispatch_status_in_response). CloudWatch log grep is fragile; downstream-state pinning balloons smoke runtime.
5. GHA workflow gate in .github/workflows/deploy.yml: add a new job after terraform-apply that runs the smoke. Mirror the compliance-anchor-smoke and compliance-export-runner-smoke jobs in shape.
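The env-snapshot and lazy-client conventions for the Lambda body might look like the sketch below. The names `getExampleEnv`, `lazyClient`, `withClient`, and the env keys are hypothetical; consult getAnchorEnv in packages/lambda/compliance-anchor.ts for the real pattern.

```typescript
interface ExampleEnv {
  readonly stage: string;
  readonly bucket: string;
}

// Reads the required variables once and freezes the result. A real Lambda
// would call this at module load, e.g. `const ENV = getExampleEnv(process.env)`,
// and never touch process.env again inside the handler.
function getExampleEnv(source: Record<string, string | undefined>): ExampleEnv {
  const stage = source["STAGE"];
  const bucket = source["EXAMPLE_BUCKET"];
  if (!stage || !bucket) {
    throw new Error("missing required env: STAGE, EXAMPLE_BUCKET");
  }
  return Object.freeze({ stage, bucket });
}

// Creates the client on first use, keeps it for warm invocations, and lets
// the caller drop it so the next invocation builds a fresh one.
function lazyClient<C>(create: () => C): { get(): C; reset(): void } {
  let cached: C | undefined;
  return {
    get(): C {
      if (cached === undefined) cached = create();
      return cached;
    },
    reset(): void {
      cached = undefined;
    },
  };
}

// Runs work against the cached client and resets the cache on failure. For
// brevity this resets on any error; real code would check that the error is
// a connection error before discarding the client.
async function withClient<C, R>(
  client: { get(): C; reset(): void },
  run: (c: C) => Promise<R>,
): Promise<R> {
  try {
    return await run(client.get());
  } catch (err) {
    client.reset();
    throw err;
  }
}
```

Passing the env source in as a parameter keeps the helper testable without mutating process.env, while still giving the handler a single immutable snapshot.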
The U11.U2 + U11.U3 PRs (#948, #950) are a recent end-to-end example of all five steps for the export runner.
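For SQS-triggered Lambdas, the partial-failure protocol and the CAS guard fit together roughly as follows. The `batchItemFailures` / `itemIdentifier` response shape is the one AWS documents for ReportBatchItemFailures; the `JobStore` class, the job states, and the `{ jobId }` message body are hypothetical stand-ins for a row-state column in Postgres.

```typescript
// Local copies of the SQS event and batch-response fields used here.
interface SqsRecord {
  messageId: string;
  body: string;
}
interface SqsEvent {
  Records: SqsRecord[];
}
interface SqsBatchResponse {
  batchItemFailures: { itemIdentifier: string }[];
}

type JobState = "pending" | "running" | "done";

// In-memory stand-in for a table with a state column.
class JobStore {
  private readonly states = new Map<string, JobState>();

  set(id: string, state: JobState): void {
    this.states.set(id, state);
  }
  get(id: string): JobState | undefined {
    return this.states.get(id);
  }
  // Compare-and-set: transition only if the current state is the expected one.
  compareAndSet(id: string, expected: JobState, next: JobState): boolean {
    if (this.states.get(id) !== expected) return false;
    this.states.set(id, next);
    return true;
  }
}

async function handler(
  event: SqsEvent,
  store: JobStore,
  work: (jobId: string) => Promise<void>,
): Promise<SqsBatchResponse> {
  const batchItemFailures: { itemIdentifier: string }[] = [];
  for (const record of event.Records) {
    let claimed: string | undefined;
    try {
      const { jobId } = JSON.parse(record.body) as { jobId: string };
      // Re-delivery of a job that is already claimed or finished is a no-op.
      if (!store.compareAndSet(jobId, "pending", "running")) continue;
      claimed = jobId;
      await work(jobId);
      store.compareAndSet(jobId, "running", "done");
    } catch {
      // Release the claim so a retry can pick the job up again, and report
      // only this message as failed; the rest of the batch is acknowledged.
      if (claimed !== undefined) store.compareAndSet(claimed, "running", "pending");
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }
  return { batchItemFailures };
}
```

Note that SQS only honours this response when the event source mapping lists ReportBatchItemFailures in its function response types; without that setting, any thrown error retries the whole batch.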
Where the tests live
When adding a new emit site, the integration test in packages/api/test/integration/compliance-event-writers/ is the gate that proves the cross-cutting path works (originating mutation → outbox row → drainer → audit_events row). Unit tests of the emit helper alone don’t catch wiring regressions.