Sources and Knowledge Bases

Sources and Knowledge Bases are the authored-source facet of Memory: policy documents, product FAQs, runbooks, API references, and other authoritative material that should serve as the source of truth when the agent answers a question, rather than a hallucinated guess. The Reliability operating guarantee takes a specific form here: grounding the agent on a corpus you control means responses can be checked against a known source instead of being synthesized from training-set memory the operator can’t audit.

An agent with a knowledge base attached can retrieve the most relevant chunks and include them in Memory context before the model generates a response. The agent does not memorize the documents; it grounds on them.

When to reach for it

Document knowledge is the right fit when:

The canonical answer lives in a document. Product documentation, standard operating procedures, compliance policies — places where “what does the document say?” is the right question.
You want citations. Every retrieved chunk is traceable back to a specific source file, so agents can cite what they used.
The content changes occasionally, not continuously. A runbook updated monthly is a good fit. A feed of live events is not.
You don’t want the agent inventing facts. Grounding reduces hallucination on questions where ground truth exists.

And it’s the wrong fit when:

The useful context is what this agent has learned across work — that’s Memory and compiled pages, not documents.
The content is massive and sparsely relevant — a million-line codebase isn’t a knowledge base; it’s a search target for a different tool.
The question needs reasoning over a graph of entities — document chunks won’t reliably answer “which engineers owned which services in Q3.”

What happens at turn time

When an agent with a knowledge base attached receives a message, the retrieval step runs before model invocation:

User message → embed query → query Bedrock KB → top-K chunks → assemble into context → model call

Concretely:

The user’s message (plus optionally the last few turns for context) is embedded.
Bedrock Knowledge Base retrieves the top-K most-similar chunks from its vector index. K is configurable per knowledge base; the default is 5.
Each returned chunk includes its source file, an excerpt, and a similarity score.
AgentCore assembles those chunks into a structured <knowledge> block in the model’s context.
The model generates a response, and its skill packs can cite chunks back to specific source files.
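
The steps above can be sketched roughly as follows. This is a minimal illustration, not ThinkWork’s actual integration: it assumes a boto3-style `bedrock-agent-runtime` client is passed in, and the function names and the inner layout of the `<knowledge>` block are hypothetical. The Retrieve API accepts the query as text, so the embedding step happens on the Bedrock side.

```python
from html import escape
from typing import Any


def retrieve_chunks(client: Any, kb_id: str, query: str, k: int = 5) -> list[dict]:
    """Call the Bedrock Retrieve API and flatten each result to source, excerpt, score."""
    response = client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": k}
        },
    )
    chunks = []
    for result in response.get("retrievalResults", []):
        location = result.get("location", {})
        chunks.append({
            "source": location.get("s3Location", {}).get("uri", "unknown"),
            "excerpt": result.get("content", {}).get("text", ""),
            "score": result.get("score", 0.0),
        })
    return chunks


def build_knowledge_block(chunks: list[dict]) -> str:
    """Render chunks as a <knowledge> block (the inner layout is an assumption)."""
    lines = ["<knowledge>"]
    for chunk in chunks:
        lines.append(
            f'  <chunk source="{escape(chunk["source"], quote=True)}" '
            f'score="{chunk["score"]:.2f}">{escape(chunk["excerpt"])}</chunk>'
        )
    lines.append("</knowledge>")
    return "\n".join(lines)
```

With real credentials, `client` would be `boto3.client("bedrock-agent-runtime")`. Passing the client in as a parameter keeps the sketch testable with a stub.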

Retrieval runs on every turn, so documents updated this morning show up in afternoon responses once the KB sync has run. No agent redeployment is required.

How ingestion works

A knowledge base is populated by syncing from an S3 bucket. You upload source documents (or point to an existing bucket the agent should read), and Bedrock’s ingestion pipeline:

Chunks each document (configurable: fixed-size, semantic, hierarchical).
Embeds each chunk using the KB’s configured embedding model.
Writes the chunks + embeddings into the vector index.
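
To make the chunking step concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an illustration only: it counts whitespace-separated words, whereas a managed pipeline would normally measure size in tokens, and the function name and default sizes are hypothetical.

```python
def chunk_fixed_size(text: str, max_words: int = 300, overlap_words: int = 30) -> list[str]:
    """Split text into word windows of at most max_words, overlapping by overlap_words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap_words < max_words:
        raise ValueError("overlap_words must be in [0, max_words)")

    words = text.split()
    step = max_words - overlap_words  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a boundary appears in both neighboring chunks, which is one reason changing these settings later implies re-embedding everything.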

ThinkWork configures the vector index to live in the same Aurora Postgres cluster as the rest of the system, via the pgvector extension. That means:

No separate vector database to run.
An Aurora snapshot restores both your agents’ state and their KB vector index in one shot.
Query latency is low because the KB index is colocated with everything else the turn needs.
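
For intuition about what the vector index computes when it ranks chunks, the following sketch does a brute-force top-K by cosine similarity over in-memory vectors. It is not how the index is implemented: pgvector exposes cosine distance through its `<=>` operator, and a production index would typically use an approximate nearest-neighbor structure rather than a full scan. The function names and the dictionary-based index are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int = 5) -> list[tuple[str, float]]:
    """Return the k chunk ids most similar to the query, best first."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```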

Assigning a KB to an agent

Knowledge bases are assigned to the relevant runtime scope in the admin app or via the GraphQL API. In current Space-aware flows, attach focused KBs to the Space that needs them; retrieval queries the allowed KBs and merges the top-K results. Tenant-level inspection and upload flows live in Memory → Knowledge Bases.
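
One plausible way to merge per-KB results into a single top-K list is sketched below. This is an assumption about the merge, not a description of ThinkWork’s implementation: it ranks purely by similarity score, and the function name and chunk fields are hypothetical. Scores are only directly comparable when the KBs share an embedding model.

```python
def merge_top_k(results_by_kb: dict[str, list[dict]], k: int = 5) -> list[dict]:
    """Combine chunks from several KBs and keep the k highest-scoring ones."""
    merged = []
    for kb_id, chunks in results_by_kb.items():
        for chunk in chunks:
            # Tag each chunk with its KB so citations stay traceable after merging.
            merged.append({**chunk, "kb_id": kb_id})
    merged.sort(key=lambda chunk: chunk["score"], reverse=True)
    return merged[:k]
```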

The agent template can specify a default KB assignment — new agents created from that template inherit it. That makes fleet-wide KB rotation a single update to the template.

Known limits

KB sync is not real-time. If you upload a new document and immediately ask the agent about it, retrieval will not find it until the next sync runs, so the agent will typically answer that it doesn’t know. Syncs run on a configurable schedule (often daily for production KBs) or on demand (common for dev).
Chunking is effectively a one-time decision. Changing the chunk size or strategy requires a full re-ingestion, so pick a chunking strategy before loading a large corpus.
No cross-KB semantic deduplication. If two KBs both contain the same policy document, retrieval will return duplicates and the model is asked to reconcile.
Retrieval is lexical-and-semantic, not graph-aware. Knowledge-base retrieval can’t answer “which documents cite this one” — that’s a graph question, better served by compiled pages over authored content.
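
As an illustration of the duplicate problem, a caller could post-filter retrieved chunks by exact text before they reach the model. The sketch below is hypothetical and is not something the platform is described as doing. It only catches identical copies (after case and whitespace normalization), so it is not semantic deduplication.

```python
import hashlib


def dedupe_chunks(chunks: list[dict]) -> list[dict]:
    """Drop chunks whose normalized excerpt was already seen, keeping the best-scored copy."""
    seen = set()
    unique = []
    for chunk in sorted(chunks, key=lambda c: c["score"], reverse=True):
        normalized = " ".join(chunk["excerpt"].lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # same text already kept from a higher-scoring result
        seen.add(digest)
        unique.append(chunk)
    return unique
```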

Related pages

Memory: where authored sources sit relative to memory and compiled pages
Retained Memory: for agent-learned context, distinct from authored documents
Source Routing: how retrieved chunks combine with thread history and recalled memories at turn time
Admin: Knowledge Bases: the operator surface for creating, syncing, and inspecting KBs
Architecture: where KBs fit in the data tier

Under the hood

Backing service. Amazon Bedrock Knowledge Bases. Managed service, provisioned by ThinkWork’s Terraform module.
Vector index. Aurora Postgres + pgvector — no external vector DB. Index rows live in the bedrock_knowledge_base_* tables.
Source storage. S3 bucket per tenant (s3://thinkwork-<stage>-kb-source/), with lifecycle policies per knowledge base.
Embedding model. Configurable per KB; defaults to amazon.titan-embed-text-v2:0 unless overridden in the KB spec.
Retrieval API. bedrock-agent-runtime:Retrieve called from AgentCore during context assembly; results merged into the model’s system prompt via a <knowledge> block.
Sync triggers. Scheduled (via EventBridge) or on-demand (admin “Sync Now” button fires an ingestion job).
Code path. ThinkWork-side orchestration lives under packages/api/src/; the AgentCore-side retrieval integration lives in packages/agentcore-strands/agent-container/.
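
An on-demand sync corresponds to starting a Bedrock ingestion job. The sketch below shows roughly what that could look like, assuming a boto3-style `bedrock-agent` client is passed in. The function names are hypothetical and this is not the code behind the “Sync Now” button.

```python
from typing import Any


def start_sync(client: Any, kb_id: str, data_source_id: str) -> dict:
    """Start an ingestion job for one data source and return its id and initial status."""
    response = client.start_ingestion_job(
        knowledgeBaseId=kb_id,
        dataSourceId=data_source_id,
    )
    job = response["ingestionJob"]
    return {"job_id": job["ingestionJobId"], "status": job["status"]}


def sync_status(client: Any, kb_id: str, data_source_id: str, job_id: str) -> str:
    """Look up the current status of a previously started ingestion job."""
    response = client.get_ingestion_job(
        knowledgeBaseId=kb_id,
        dataSourceId=data_source_id,
        ingestionJobId=job_id,
    )
    return response["ingestionJob"]["status"]
```

With real credentials, `client` would be `boto3.client("bedrock-agent")`. New documents become retrievable only after the job reports a completed status.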