Roadmap

ThinkWork v1 is a working, production-deployable Agent Harness. It is not a finished product. This page is honest about what’s in scope for v1 across all three deployment tiers (ThinkWork open / ThinkWork for Business / ThinkWork Enterprise), what’s explicitly out of scope, and what the roadmap looks like. The harness mechanics — PPAF agent loop, operating guarantees, six-component model — are stable; what continues to evolve is the depth of each component.

These features ship in v1 at the status shown:

| Feature | Status |
| --- | --- |
| Managed agents (AgentCore + Bedrock) | Stable |
| Spaces (per-space rendered workspace, scoped to tenant platform agent) | Beta |
| Pi managed runtime substrate (parallel AgentCore runtime) | Beta |
| Connected agents (BYO runtime webhook) | Stable |
| Agent Templates | Stable |
| Skill packs (SKILL.md → S3 → invoke-time loading) | Stable |
| Threads + channels (CHAT, AUTO, EMAIL, GITHUB) | Stable |
| AgentCore managed memory with automatic per-turn retention (always on) | Stable |
| Hindsight memory add-on (enable_hindsight = true, ECS Fargate) | Beta |
| Knowledge Bases (Bedrock KB + document upload) | Stable |
| GitHub integration | Stable |
| Google Workspace integration (Gmail + Calendar) | Beta |
| private extension tracker integration (pickup + PR writeback) | Beta |
| Tenant credential vault (KMS-backed provider credentials) | Stable |
| Scheduled automations (EventBridge + Step Functions) | Stable |
| Event-driven automations | Beta |
| Guardrails (Bedrock Guardrails integration) | Stable |
| Budgets and usage tracking (turn-level) | Stable |
| Audit logging (S3 NDJSON) | Stable |
| Admin web app (React/Vite) | Beta |
| CLI (thinkwork-cli) | Beta |
| Terraform module (thinkwork-ai/thinkwork/aws) | Beta |
| BYO VPC, database, Cognito | Stable |
| Evaluations (in-app authoring, runs, scoring) | Beta |
| Multi-tenancy (Postgres RLS) | Stable |

These features are explicitly not in scope for v1. They’re on the roadmap but don’t have committed timelines yet.

A Postgres-backed knowledge graph with entity extraction, relationship modeling, and an ontology editor UI. This would complement Knowledge Bases (unstructured document RAG) with structured entity graphs for precise, relationship-aware retrieval.

Not in v1 because: The RAG + Knowledge Base path covers 90% of use cases. The graph data model needs more design work before we commit to a schema.

AutoResearch is a multi-step research automation that uses Step Functions loops to iteratively search, synthesize, and refine findings. It includes GitHub workspace integration for persisting research artifacts and pluggable measurement functions for evaluating research quality.

Not in v1 because: The single-turn agent loop in v1 handles most research tasks adequately. AutoResearch adds orchestration complexity that requires more production testing.
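As a rough illustration of the iterative loop described above, the sketch below repeats search, synthesis, and scoring until a quality threshold or an iteration cap is reached. It is not ThinkWork code: `auto_research`, its parameters, and the `target_score` stopping rule are hypothetical, and a plain Python loop stands in for the Step Functions orchestration. The `measure` callable plays the role of a pluggable measurement function.

```python
from typing import Callable, List, Tuple


def auto_research(
    question: str,
    search: Callable[[str], List[str]],
    synthesize: Callable[[str, List[str]], str],
    measure: Callable[[str], float],
    refine: Callable[[str, str], str],
    target_score: float = 0.8,   # hypothetical quality threshold
    max_iterations: int = 5,     # hypothetical cap on loop iterations
) -> Tuple[str, float, int]:
    """Return (best_draft, best_score, iterations_used)."""
    query = question
    findings: List[str] = []
    best_draft, best_score = "", float("-inf")

    for iteration in range(1, max_iterations + 1):
        # Search with the current query and accumulate the findings.
        findings.extend(search(query))
        # Synthesize a draft from everything gathered so far.
        draft = synthesize(question, findings)
        # Score the draft with the pluggable measurement function.
        score = measure(draft)
        if score > best_score:
            best_draft, best_score = draft, score
        if best_score >= target_score:
            return best_draft, best_score, iteration
        # Not good enough yet: derive the next query from the draft.
        query = refine(question, draft)

    return best_draft, best_score, max_iterations
```

Passing the four steps in as callables keeps the loop itself independent of any particular search backend or scoring method, which is one way to read "pluggable measurement functions".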

Per-tenant, per-Space, per-platform-agent, and per-thread cost tracking with dashboards, cost attribution, and chargeback reports. The current implementation tracks token counts and estimates costs per turn but doesn’t fully aggregate them into a reporting-friendly data model.

Not in v1 because: Turn-level tracking is sufficient for most v1 use cases. The reporting layer requires Aurora schema additions and an analytics query layer.
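To make the gap concrete, here is a minimal sketch of the kind of roll-up a reporting layer would perform over turn-level records. The `TurnUsage` record and its field names are assumptions for illustration, not the actual Aurora schema, and an in-memory aggregation stands in for an analytics query layer.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class TurnUsage:
    """One turn's usage; a hypothetical shape, not the real table."""
    tenant_id: str
    space_id: str
    agent_id: str
    thread_id: str
    input_tokens: int
    output_tokens: int
    estimated_cost_usd: float


def rollup(
    turns: Iterable[TurnUsage], *dimensions: str
) -> Dict[Tuple[str, ...], Dict[str, float]]:
    """Aggregate turn-level usage by the given TurnUsage field names."""
    totals = defaultdict(lambda: {"turns": 0, "tokens": 0, "cost_usd": 0.0})
    for turn in turns:
        # Build the grouping key, e.g. ("tenant-a", "space-1").
        key = tuple(getattr(turn, name) for name in dimensions)
        bucket = totals[key]
        bucket["turns"] += 1
        bucket["tokens"] += turn.input_tokens + turn.output_tokens
        bucket["cost_usd"] += turn.estimated_cost_usd
    return dict(totals)
```

Calling `rollup(turns, "tenant_id")` gives per-tenant totals, while `rollup(turns, "tenant_id", "space_id")` gives the per-Space breakdown that a chargeback report would need.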

The Places module is a location intelligence service with Aurora pgvector for geospatial embeddings, Amazon Titan for location embeddings, and a combined SQL + vector search API. It is useful for building agents that reason about physical locations, travel, or logistics.

Not in v1 because: It’s a specialized capability that doesn’t belong in the core platform. It’ll ship as an optional module.
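The combined SQL + vector search idea can be sketched in plain Python: apply a structured filter first, then rank the survivors by embedding similarity. Everything here is hypothetical (`search_places`, the record shape, the toy two-dimensional embeddings); a real module would presumably push both steps into a single Aurora pgvector query rather than filter in application code.

```python
import math
from typing import Any, Callable, Dict, List, Sequence, Tuple

Place = Dict[str, Any]  # hypothetical record: name, city, embedding


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_places(
    places: List[Place],
    query_embedding: Sequence[float],
    where: Callable[[Place], bool],
    limit: int = 3,
) -> List[Tuple[float, str]]:
    """Filter by a structured predicate, then rank by similarity."""
    # Structured step: the stand-in for a SQL WHERE clause.
    candidates = [p for p in places if where(p)]
    # Vector step: score each remaining place against the query.
    scored = [(cosine(p["embedding"], query_embedding), p["name"]) for p in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]
```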

A consumer-facing web client for end users to interact with agents (as opposed to the current admin app, which is for platform operators). This would be a white-label React app deployed to CloudFront.

Not in v1 because: Most v1 users are integrating ThinkWork into their own apps via the GraphQL API or the Slack/GitHub integrations. The end-user client is a future UX layer.

An evaluation agent that can perform end-to-end tests using a browser (Nova Act or similar) to test agent-driven workflows that involve web interfaces.

Not in v1 because: Requires browser automation infrastructure (headless Chrome on ECS) and a more sophisticated eval harness. Post-v1 priority.

Browser/computer-use automation (Nova Act or similar) running inside a managed runtime is still future work.

Not in v1 because: It requires hardened browser isolation, profile/credential handling, per-tenant runtime policy, and production evaluation against real web workflows.

Slack ingress (events, slash commands, message actions) is currently disabled while the Spaces-based ingestion path is being designed. Inbound Slack endpoints return 200 OK and drop events without routing — a follow-up brainstorm will define how Slack messages flow into Spaces and the tenant platform agent.

Not in v1 because: The previous Computer-routed Slack dispatch was retired in PR #1666. The replacement design is open.
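The "acknowledge and drop" behaviour described above amounts to a handler that answers 200 without looking at the payload. The sketch below shows that shape only. The function name, the API Gateway-style event and response dictionaries, and the response body are assumptions, not the actual ThinkWork endpoint code.

```python
import json
from typing import Any, Dict


def handle_slack_ingress(event: Dict[str, Any]) -> Dict[str, Any]:
    """Acknowledge an inbound Slack request and drop it without routing."""
    # The payload is deliberately ignored: no parsing, no dispatch to a
    # Space or agent. Only the acknowledgement is sent back.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"ok": True}),  # assumed body; only the 200 is from the docs
    }
```

Answering with a 2xx matters because Slack's Events API retries deliveries that are not acknowledged, so a disabled endpoint that returned an error would generate repeated inbound traffic.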

These are the highest-priority items being actively worked on:

  1. Spaces ingestion — Email, Slack, and webhook ingress flowing into per-Space rendered workspaces
  2. Integration platform expansion — Linear is the current proof; future integration ordering is still being planned
  3. Cost tracking — Aggregated cost reports by tenant, Space, Agent, and time period
  4. Places module — Location intelligence as an optional Terraform module
  5. AutoResearch — Step Functions-based iterative research loop
  6. End-user web client — White-label React app for end users
  7. Eval agent with browser automation — Nova Act or similar for UI-driven agent tests

ThinkWork is Apache 2.0 licensed and open source. Contributions are welcome, especially for:

  • New integration implementations
  • Skill pack library additions
  • Terraform module improvements
  • Test coverage
  • Documentation fixes

See CONTRIBUTING.md in the GitHub repository for development setup and contribution guidelines.

ThinkWork follows Semantic Versioning for the CLI and Terraform module.

  • Patch releases (1.0.x) — Bug fixes, documentation updates, no breaking changes
  • Minor releases (1.x.0) — New features, backwards-compatible. GraphQL schema additions only (no field removals).
  • Major releases (x.0.0) — Breaking changes, with a migration guide
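The three release types map directly onto which part of the version number changes. A minimal sketch follows, assuming plain `MAJOR.MINOR.PATCH` strings with an optional leading `v`; `parse` and `release_type` are illustrative helpers, not part of thinkwork-cli.

```python
import re
from typing import Tuple

_VERSION = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse(version: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' (optionally prefixed with 'v')."""
    match = _VERSION.match(version.strip())
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def release_type(previous: str, current: str) -> str:
    """Classify an upgrade as 'major', 'minor', or 'patch'."""
    old, new = parse(previous), parse(current)
    if new <= old:
        raise ValueError(f"{current} is not newer than {previous}")
    if new[0] != old[0]:
        return "major"  # breaking changes, migration guide expected
    if new[1] != old[1]:
        return "minor"  # new, backwards-compatible features
    return "patch"      # bug fixes and documentation updates
```

For example, `release_type("1.0.3", "1.0.4")` returns `"patch"` and `release_type("1.4.2", "2.0.0")` returns `"major"`.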

The Terraform module’s input variable interface is considered stable after v1.0. Variables will not be renamed or removed in minor releases — only added.

The GraphQL schema follows the same policy: fields may be added in minor releases but not removed or renamed without a major version bump and a deprecation period of at least 2 minor releases.
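The additions-only rule for Terraform variables and GraphQL fields can be checked mechanically by comparing the set of names before and after a change. The helper below is a hypothetical sketch of that check; it treats a rename as a removal plus an addition, and it does not model the deprecation period.

```python
from typing import Set


def required_release(old_names: Set[str], new_names: Set[str]) -> str:
    """Smallest release type that may carry this interface change."""
    removed = old_names - new_names
    added = new_names - old_names
    if removed:
        # Removals and renames break existing callers.
        return "major"
    if added:
        # Pure additions are backwards-compatible.
        return "minor"
    return "patch"
```

Under this reading, adding an `email` field to a type with `id` and `name` needs only a minor release, while renaming `name` to `title` needs a major one.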