Guardrails
Bedrock Guardrails applied at model invocation time — topic filters, content filters, PII redaction, grounding checks — referenced from templates so the boundary applies to every agent on the template.
Control is the harness’s governance layer — the component that turns ThinkWork from a demo into a production system, and the load-bearing implementation of the four operating guarantees (Reliability, Efficiency, Security, Traceability). It answers two questions that every real deployment has to answer:
Both matter equally. A system that only enforces boundaries is a black box; a system that only logs is a landmine waiting for a safety question. ThinkWork treats Control as one first-class concept instead of scattering it across three unrelated surfaces.
The five governance controls below map 1:N to the four operating guarantees. The guarantees are the language a buyer or operator can grab onto; the controls are the implementation that ships in the admin web. Every guarantee has at least one shipping control behind it.
| Guarantee | What it commits to | Shipping controls |
|---|---|---|
| Reliability | Fault recovery, idempotent writes, behavior consistent under the same inputs. | Approved agent capabilities (templates pin model + guardrails so behavior is reproducible); Security + accuracy evaluations (regression detection per template version). |
| Efficiency | Token budgets, Space spend caps, low-latency interactive paths. | Cost control and analysis (per-turn cost capture, Space/tenant budget enforcement that pauses at threshold). |
| Security | Per-agent capability grants, sandboxed execution, I/O filtering for prompt injection and PII. | Runs in your AWS (the harness deploys into customer VPC; data, IAM, and network stay in the customer’s account); Approved agent capabilities (Bedrock Guardrails attached at the template level — content filters, topic filters, PII redaction, prompt-injection detection via Strands safety.py); Security + accuracy evaluations. |
| Traceability | End-to-end traces per turn, explainable decisions, auditable state. | Runs in your AWS (audit log lives in customer’s S3, partitioned by tenant, no delete permission on the Lambda role); Centralized management (one admin console aggregates threads, turns, cost, guardrail activations, and audit events under the same threadId/turnId keys so incident reconstruction is one query). |
The mapping is 1:N — most controls implement more than one guarantee — because the guarantees are qualities the harness commits to, and the controls are mechanisms that often serve multiple qualities at once. “Runs in your AWS” is simultaneously a Security boundary (your IAM, not ours) and a Traceability mechanism (your audit log, your retention policy). That’s intentional.
A single agent answering a single user’s questions rarely needs guardrails, rarely needs a budget, and rarely needs an audit trail — because a human is watching. A fleet of agents running across multiple tenants, handling integration events while their operator is asleep, doesn’t have that luxury.
The problems that emerge with scale:
Control is the concept that bundles those three concerns together. One place to configure boundaries. One place to see what happened. One set of primitives that work across every agent, template, and tenant in the deployment.
Guardrails
Bedrock Guardrails applied at model invocation time — topic filters, content filters, PII redaction, grounding checks — referenced from templates so the boundary applies to every agent on the template.
Budgets, Usage, and Audit
Per-turn cost capture from OTel spans, tenant-level budget enforcement that can suspend agents when thresholds hit, and an append-only audit log for every invocation.
Control doesn’t own agents, threads, or memory. It wraps them:
PAUSED rather than silently continuing.The fact that guardrail activations land in the thread timeline, cost lands on the turn, and audit lands in S3 — all addressable by the same threadId + turnId — is what makes incident reconstruction tractable. You’re not stitching three disjoint systems together.
Control is configured per tenant. A single ThinkWork deployment typically has multiple tenants (customers, internal business units, or environments), and their control surfaces are independent:
This means one deployment can host production work, internal eval runs, and external customer workloads without their Control posture bleeding across. Row-level security on every relevant table enforces the boundary at the database level; IAM policies enforce it at the S3 level.