
Architecture

ThinkWork has two useful views of the system:

  1. a conceptual model for reasoning about how work flows through the product
  2. an infrastructure model for understanding what gets deployed in AWS

The conceptual model is:

  • Threads are the record of work
  • Memory surfaces useful context into a turn, including document retrieval and long-term memory
  • Agents are the execution layer that decide and act
  • Connectors connect ThinkWork to outside systems, including MCP-based tool connectors

Those concepts map onto a three-tier AWS deployment. Every resource lives in your AWS account. There is no shared infrastructure, no callbacks to external control planes, and no telemetry sent outside your account.

That matters because ThinkWork is not just trying to run agents. It is trying to give customers an open, customer-owned harness for AI work.

Before getting into the deployment tiers, it helps to frame the runtime in product terms:

External systems and users → Threads → Memory → Agents → Connectors and responses back out

Threads are the durable record of work. User chats, connector events, emails, and automations all become threads with history, status, metadata, and auditability.
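A thread can be pictured as a small record carrying identity, channel, status, and metadata. A minimal sketch; the field names are illustrative, not the actual Aurora schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Thread:
    """Illustrative shape of a thread record (not the real schema)."""
    id: str
    channel: str                  # e.g. "CHAT", "SLACK", "EMAIL", "AUTO"
    status: str                   # e.g. "open", "closed", "failed"
    metadata: dict = field(default_factory=dict)
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

t = Thread(id="t-1", channel="SLACK", status="open", metadata={"team": "T123"})
```

Whatever the inbound source, the point is that every unit of work lands in one uniform, auditable record type.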

See Threads.

Memory is the context layer. It determines what gets surfaced into the current turn beyond the latest message, and over time it should be portable through a ThinkWork-owned contract rather than defined by any one backend.

In the current open source app, that mainly means:

  • thread history selected for the context window
  • document retrieval through Bedrock Knowledge Bases
  • long-term memory recall through Bedrock AgentCore Memory by default, or Hindsight when configured
  • context assembly before model invocation

The default long-term memory setup includes semantic, summarization, user-preference, and episodic strategies.
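Put together, context assembly is a fold of these sources into a single model request. A minimal sketch with invented names for the inputs; the real runtime's interfaces may differ:

```python
from dataclasses import dataclass

@dataclass
class TurnContext:
    """Inputs gathered before model invocation (names are illustrative)."""
    system_prompt: str
    history: list[dict]            # prior thread messages, oldest first
    retrieved_chunks: list[str]    # Bedrock Knowledge Base results
    recalled_memories: list[str]   # long-term memory hits

def assemble_context(ctx: TurnContext, user_message: str, max_history: int = 20) -> dict:
    """Fold memory and retrieval into one Bedrock-style request payload."""
    blocks = []
    if ctx.recalled_memories:
        blocks.append("Relevant memories:\n" + "\n".join(f"- {m}" for m in ctx.recalled_memories))
    if ctx.retrieved_chunks:
        blocks.append("Retrieved documents:\n" + "\n".join(ctx.retrieved_chunks))
    system = ctx.system_prompt + ("\n\n" + "\n\n".join(blocks) if blocks else "")
    # Trim history to the most recent turns that fit the context window
    messages = ctx.history[-max_history:] + [{"role": "user", "content": user_message}]
    return {"system": system, "messages": messages}
```

The key property is that the backends (Knowledge Bases, AgentCore Memory, Hindsight) can vary while the assembled shape stays stable.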

See Memory.

Agents are the execution layer. They receive a thread plus assembled context, decide what to do, call tools, and produce a response.

In managed mode, this execution happens in AgentCore, but the surrounding harness remains ThinkWork’s. That distinction is important: managed does not mean vendor-hosted.

See Agents.

Connectors are the integration boundary. Some connectors bring inbound events into threads, and some expose external tools for agents to call.

This includes:

  • channel and event connectors such as Slack, GitHub, and Google Workspace
  • tool connectors, including MCP Tools

See Connectors.

┌───────────────────────────────────────────────────────────┐
│ App Tier                                                  │
│ AppSync · API Gateway · AgentCore Lambda · Crons · SES    │
│ CloudFront · Connector Lambdas · Step Functions           │
├───────────────────────────────────────────────────────────┤
│ Data Tier                                                 │
│ Aurora Postgres (pgvector) · S3 (skills, KB, logs)        │
│ Bedrock KB · Secrets Manager                              │
├───────────────────────────────────────────────────────────┤
│ Foundation Tier                                           │
│ VPC · Subnets · Cognito · KMS · Route53 · ACM · SES Setup │
└───────────────────────────────────────────────────────────┘

The foundation tier provides identity, networking, and encryption. It changes rarely and is the most stable part of the deployment.

| Resource | Purpose |
|---|---|
| VPC + subnets | Isolated network with public and private subnets across 2 AZs |
| NAT Gateway | Outbound internet access for private subnet resources |
| Cognito User Pool | User authentication and JWT issuance |
| Cognito Identity Pool | Maps Cognito users to IAM roles for direct AWS resource access |
| KMS keys | Encryption at rest for app data, audit logs, and credential vault |
| Route53 records | DNS for admin app, API, and email |
| ACM certificates | TLS for CloudFront and API Gateway custom domains |
| SES domain identity | Verified sending domain for outbound email |

The data tier holds all persistent state. It depends on the foundation tier for network access and KMS encryption.

| Resource | Purpose |
|---|---|
| Aurora Postgres | Primary data store: agents, threads, messages, automations, connectors, users. Also hosts the pgvector index used by Bedrock Knowledge Bases (no separate vector DB) |
| S3 (skill catalog) | `skills/catalog/*.md`, skill packs loaded at invoke time |
| S3 (knowledge docs) | Source documents for Bedrock Knowledge Bases |
| S3 (audit logs) | Append-only log of every agent invoke (NDJSON, partitioned by date) |
| S3 (assets) | Admin and end-user app static files |
| Bedrock Knowledge Base | Vector-indexed document store for inline RAG, backed by Aurora pgvector |
| Secrets Manager | DB credentials, OAuth client secrets |
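As an illustration of the audit-log layout, here is a sketch of building a date-partitioned NDJSON object for S3. The key scheme and field shapes are assumptions, not the actual bucket layout:

```python
import json
from datetime import datetime, timezone

def audit_object_key(invoke_id: str, ts: datetime, prefix: str = "audit") -> str:
    """Date-partitioned key so logs can be listed and lifecycled by day
    (the exact layout in the real bucket may differ)."""
    return f"{prefix}/{ts:%Y/%m/%d}/{invoke_id}.ndjson"

def to_ndjson(events: list[dict]) -> str:
    """One JSON object per line, the NDJSON form used for audit records."""
    return "\n".join(json.dumps(e, separators=(",", ":"), sort_keys=True) for e in events) + "\n"

key = audit_object_key("inv-123", datetime(2025, 6, 1, tzinfo=timezone.utc))
# key == "audit/2025/06/01/inv-123.ndjson"
```

In the real deployment the resulting object would be written with a boto3 `put_object` call under an IAM role that has no delete permission, preserving the append-only property.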

The app tier is where computation happens. It depends on both lower tiers.

| Resource | Purpose |
|---|---|
| AppSync GraphQL API | Real-time subscriptions (WebSocket), used for streaming responses |
| API Gateway v2 | HTTP queries and mutations, connector webhook ingress |
| AgentCore Lambda | Container-based agent runtime (Python/Strands + Bedrock) |
| Connector Lambdas | One per connector (Slack, GitHub, Google); handles inbound events |
| Step Functions | Automation runner, routine executor |
| EventBridge | Triggers for scheduled automations |
| Bedrock AgentCore Memory | Always on; automatic per-turn retention into four strategies (semantic, preferences, summaries, episodes) |
| ECS Fargate (optional) | Hindsight memory add-on (if `enable_hindsight = true`) |
| CloudFront | CDN for admin app and end-user app static files |
| SES (sending) | Outbound email from agent responses |

A full round trip from user message to agent response:

1. User sends message
└─ POST /graphql (API Gateway)
└─ JWT validated by Cognito authorizer
└─ createMessage resolver → Aurora (writes message record)
└─ Triggers AgentCore Lambda invocation (async via SQS)
2. AgentCore Lambda receives event
└─ Reads thread history from Aurora
└─ Downloads assigned skill packs from S3
└─ Queries Bedrock Knowledge Base (if assigned) → retrieves relevant chunks
└─ Agent tools read long-term memories from AgentCore Memory (always on) via the `recall()` tool, and optionally from Hindsight (ECS) when `enable_hindsight = true`
└─ After the turn completes, the container auto-emits a CreateEvent into AgentCore Memory so background strategies extract facts for future recall
└─ Builds context: system prompt + selected history + retrieved knowledge + recalled memory + tool config
3. Bedrock inference
└─ Strands sends context + message to Bedrock (Claude)
└─ Model may request tool calls
└─ Tools execute (SQL, S3, HTTP, skill-defined functions)
└─ Tool results injected, model generates final response
4. Response delivery
└─ AgentCore writes response message to Aurora
└─ Publishes AppSync mutation → NewMessageEvent subscription
└─ Stream chunks published in real time via AppSync
└─ Thread status updated in Aurora
5. Client receives response
└─ AppSync WebSocket delivers StreamChunkEvent chunks
└─ Final NewMessageEvent marks completion
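The tool-calling portion of steps 2–4 can be sketched as a loop. Here `invoke_model` stands in for the Strands/Bedrock call and the message shapes are simplified:

```python
def run_turn(invoke_model, tools: dict, request: dict, max_steps: int = 8) -> str:
    """Minimal agent loop: call the model, execute any requested tool,
    inject the result, and repeat until a final text response."""
    messages = list(request["messages"])
    for _ in range(max_steps):
        reply = invoke_model(request["system"], messages)
        if reply.get("tool_call") is None:
            return reply["text"]                       # final answer
        call = reply["tool_call"]
        result = tools[call["name"]](**call["args"])   # execute tool locally
        messages.append({"role": "assistant", "tool_call": call})
        messages.append({"role": "tool", "name": call["name"], "content": result})
    raise RuntimeError("tool-call loop exceeded max_steps")
```

The real runtime adds streaming, audit logging, and memory writes around this loop, but the decide/act/inject cycle is the core of it.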

An inbound connector event follows a similar path:

External service (Slack, GitHub, etc.)
→ POST /connectors/<id>/webhook (API Gateway)
→ Connector Lambda
└─ Validates signature
└─ Writes thread record to Aurora (channel=SLACK, metadata={...})
└─ Invokes AgentCore (same path as user message above)
→ Agent response
→ Outbound connector or tool call posts reply back to Slack/GitHub/etc.
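Signature validation depends on the provider. For Slack, the Connector Lambda would check the documented v0 HMAC-SHA256 request signature, roughly:

```python
import hashlib
import hmac
import time

def verify_slack_signature(signing_secret: str, timestamp: str, body: bytes,
                           signature: str, tolerance: int = 300) -> bool:
    """Verify Slack's v0 request signature (per Slack's documented scheme).
    Stale timestamps are rejected to limit replay attacks."""
    if abs(time.time() - int(timestamp)) > tolerance:
        return False
    basestring = f"v0:{timestamp}:".encode() + body
    expected = "v0=" + hmac.new(signing_secret.encode(), basestring, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

GitHub and Google use different schemes (e.g. GitHub's `X-Hub-Signature-256` header), so each Connector Lambda carries its own verification step before anything is written to Aurora.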

When an agent uses an MCP tool connector:

AgentCore receives thread + assembled context
→ Resolves enabled tool connectors from template/agent config
→ Connects to MCP server over HTTP streaming or SSE
→ Discovers available tools for this invocation
→ Model calls MCP tool when needed
→ Tool result returns into the same turn
→ Final response written back to the thread

Scheduled automations run on a cron trigger:

EventBridge scheduled rule (cron)
→ Step Functions state machine starts
→ Creates AUTO- thread in Aurora
→ Invokes AgentCore with configured prompt
→ (same agent loop as above)
→ Step Functions records execution result
→ Thread marked closed (or failed)

This split is useful to keep in mind:

  • Threads preserve the canonical record of work in Aurora
  • Memory combines persisted sources like documents and memories with retrieval-time assembly
  • Agents are mostly stateless between invocations aside from their configuration
  • Connectors store credentials and integration configuration, but the resulting work still lands in threads

Where each kind of state lives, and how it is backed up:

| What | Where | Backup |
|---|---|---|
| Agents, threads, messages | Aurora Postgres | Automated daily snapshots (7-day retention) |
| User accounts | Cognito User Pool | Cognito-managed, multi-AZ |
| Skill packs | S3 (skill catalog bucket) | S3 versioning enabled |
| Knowledge documents | S3 (knowledge bucket) | S3 versioning enabled |
| Audit logs | S3 (audit log bucket) | S3 versioning + lifecycle to Glacier after 90d |
| Memories (managed) | Aurora Postgres | Same as Aurora above |
| Memories (Hindsight) | Aurora Postgres + ECS in-flight processing | Same as Aurora above |
| OAuth tokens / API keys | SSM Parameter Store (SecureString, KMS) | SSM-managed |
| Terraform state | S3 (tfstate bucket) + DynamoDB (lock table) | S3 versioning enabled |
| Vector index | Aurora Postgres (pgvector) | Same as Aurora above |

ThinkWork is multi-tenant within a single deployment. Every Aurora table has a tenant_id column and all queries are tenant-scoped. The Cognito identity pool maps users to tenants at login time.

Row-level security (Postgres RLS) enforces tenant isolation at the database level — even if application code has a bug, a query cannot return data from another tenant.

```sql
-- RLS policy (applied automatically by ThinkWork migrations)
CREATE POLICY tenant_isolation ON threads
  USING (tenant_id = current_setting('app.current_tenant_id')::uuid);
```
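Because the policy reads a session setting, the application has to set `app.current_tenant_id` before querying. A sketch of the pattern; `execute` stands in for a psycopg3 `cursor.execute`, and the table names are illustrative:

```python
def tenant_scoped_fetch(execute, tenant_id: str):
    """Run a query under RLS by setting the tenant for the current
    transaction first. set_config(..., true) is transaction-local,
    equivalent to SET LOCAL, so the setting cannot leak across
    pooled connections."""
    execute("SELECT set_config('app.current_tenant_id', %s, true)", (tenant_id,))
    return execute("SELECT id, title FROM threads", ())
```

With this in place, the RLS policy filters every row server-side: even a query with no `WHERE tenant_id = ...` clause returns only the current tenant's data.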

| Boundary | Enforcement |
|---|---|
| External → API | Cognito JWT (API Gateway authorizer) |
| API → Aurora | IAM auth + VPC security groups |
| AgentCore → Bedrock | IAM role (least privilege) |
| AgentCore → S3 | IAM role (read skills, read KB docs, write audit logs) |
| AgentCore → SSM | IAM role (read credentials for assigned connectors) |
| Tenant isolation | Postgres RLS on all tables |
| Secrets at rest | KMS encryption (dedicated key per secret category) |
| Audit trail | Append-only S3 bucket (no delete permissions on Lambda role) |

AgentCore is deployed as a Lambda container image stored in ECR. The image is built from the ThinkWork base image and includes:

  • Python 3.12
  • Strands agent framework
  • Boto3 (Bedrock, S3, SSM, Secrets Manager clients)
  • httpx (for skill HTTP tools)
  • psycopg3 (Aurora connection)
  • The ThinkWork runtime library (tool registration, memory read/write, context assembly)
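The tool-registration piece of the runtime library might look roughly like this. This is a hypothetical shape for illustration, not the actual ThinkWork API:

```python
from typing import Callable

# Registry the agent loop would consult when exposing tools to the model
TOOLS: dict[str, Callable] = {}

def tool(name: str):
    """Decorator that registers a function as an agent-callable tool."""
    def register(fn: Callable) -> Callable:
        TOOLS[name] = fn
        return fn
    return register

# A skill-defined tool (name and return shape are invented)
@tool("lookup_order")
def lookup_order(order_id: str) -> dict:
    return {"order_id": order_id, "status": "shipped"}
```

A registry like this is what lets skill packs downloaded from S3 contribute tools at invoke time without changes to the container image.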

The container image is 512 MB compressed. Lambda allocates 3 GB of memory (configurable), which maps to approximately 2 vCPUs. Cold starts take 3–5 seconds for the first invocation after a period of inactivity; warm invocations start in under 100 ms.