Memory Graph Direction

When people talk about “agent knowledge graphs,” they usually mean a typed-relationship graph: entities connected by named edges with semantics — Person worksAt Company, Dish servedAt Restaurant. ThinkWork now approaches that through two layers:

  • the existing compiled-page graph: page links, aliases, co-mentions, parent/child page relationships, and memory graph views;
  • the governed Business Ontology: approved entity types, relationship types, facet templates, mappings, and reprocess jobs that shape tenant-shared Brain materialization.

Those layers are related, but not identical. The page graph gives traceability and navigation. The ontology gives the tenant a reviewed business vocabulary before typed semantics affect page templates or agent retrieval.

The compounding-memory pipeline produces a real graph. Every entity or topic page is a node; every link between pages is an edge; and the pipeline writes two kinds of edges:

  • reference edges — “Taberna dos Mercadores references Lisbon.” Written by the leaf planner when a memory mentions both pages, and by two deterministic emitters that run after the leaf pass:
    • emitDeterministicParentLinks — city and journal clusters get references from every leaf page sourced by that cluster.
    • emitCoMentionLinks — entity pages that share a source memory get reciprocal reference edges (capped at 10 directed edges per memory).
  • parent_of / child_of edges — the parent hierarchy, written when a section on one page gets promoted into its own page. “Austin” becomes parent of “Austin Activities,” which can later become parent of “Austin Kid-Friendly Activities.”
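
As a rough illustration of the co-mention rule above, the sketch below pairs up entity pages that share one source memory and emits reciprocal reference edges until the cap of 10 directed edges is reached. The function name, the tuple layout, and the choice to stop at whole pairs in sorted order are assumptions made for this example; the real emitCoMentionLinks may select or order pairs differently.

```python
from itertools import combinations

# Cap stated in the text: at most 10 directed edges per source memory.
MAX_DIRECTED_EDGES_PER_MEMORY = 10


def emit_co_mention_links(memory_unit_id, entity_page_ids):
    """Return reciprocal reference edges for pages that share one memory.

    Hypothetical helper: each edge is (from_page, to_page, kind, context).
    """
    context = f"co_mention:{memory_unit_id}"
    edges = []
    # Deduplicate and sort so a replay of the same memory yields the same edges.
    for a, b in combinations(sorted(set(entity_page_ids)), 2):
        # Each pair adds two directed edges; stop before exceeding the cap.
        if len(edges) + 2 > MAX_DIRECTED_EDGES_PER_MEMORY:
            break
        edges.append((a, b, "reference", context))
        edges.append((b, a, "reference", context))
    return edges
```

With three entity pages this produces three pairs, so six directed edges. With five pages the ten possible pairs would yield twenty edges, so the cap cuts the output off at ten.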

Every edge is stored in wiki_page_links with a context tag recording what caused it (deterministic:city:<slug>, co_mention:<memory_unit_id>, or the planner’s rationale). That tag means operators can roll back any class of edges without touching the others.

The page graph is queryable today: wiki_page_links is indexed on both endpoints, and the admin app’s wiki browser renders it as a graph visualization.

The legacy page-link graph still uses compact link kinds: reference, parent_of, and child_of. A reference edge knows that page A references page B; it does not necessarily know how. “Taberna dos Mercadores → Lisbon” and “Chef João → Taberna dos Mercadores” can both be references even though one reads like “located in” and the other like “works at.”

Typed meaning now starts in the ontology rather than in ad hoc link rows. Approved relationship types define the tenant’s business predicates, and approved facet templates tell materialization how to shape entity pages. Reprocess jobs then apply those definitions to the tenant-shared Brain, while the existing wiki_page_links table remains compatible with owner-scoped compiled pages and current graph navigation.

The direction from here is additive:

  • Edge type column. wiki_page_links.edge_type becomes a richer enum with values like located_in, works_at, owns, authored, member_of, mentions, derived_from.
  • Structured properties on edges. (A located_in B, since=2024-Q3, confidence=0.8) — not every edge needs this, but for dated relationships and hedged claims it matters.
  • Planner-written types. Future planners can emit ontology-backed relationship slugs alongside source memory citations instead of inventing predicates freely.
  • Retrieval surfaces that use types. The Brain Context provider and split query_brain_context tool already give agents an ontology-shaped business retrieval surface. Future graph-specific retrieval can build on approved relationship definitions.
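
To make the additive direction more concrete, here is a minimal sketch of what an edge with a type and optional structured properties could look like, and how a planner-proposed predicate might be checked against the approved vocabulary. Everything here is hypothetical: TypedEdge, make_edge, and the fallback to a plain reference edge for unapproved predicates are assumptions for illustration, not the shipped schema or planner behavior.

```python
from dataclasses import dataclass
from typing import Optional

# Link kinds that the text says are already shipped.
SHIPPED_KINDS = {"reference", "parent_of", "child_of"}


@dataclass(frozen=True)
class TypedEdge:
    """Hypothetical edge shape: a type plus optional structured properties."""
    from_page: str
    to_page: str
    edge_type: str
    since: Optional[str] = None         # e.g. "2024-Q3" for dated relationships
    confidence: Optional[float] = None  # e.g. 0.8 for hedged claims


def make_edge(from_page, to_page, proposed_type, approved_types,
              since=None, confidence=None):
    """Keep the proposed type only if it is shipped or ontology-approved."""
    if proposed_type in SHIPPED_KINDS or proposed_type in approved_types:
        return TypedEdge(from_page, to_page, proposed_type, since, confidence)
    # Assumed policy: an unapproved predicate degrades to an untyped reference,
    # and the properties that described that predicate are dropped.
    return TypedEdge(from_page, to_page, "reference")
```

Under this assumed policy, a proposal such as located_in with since=2024-Q3 and confidence=0.8 survives intact when located_in is approved, while an invented predicate still produces a usable reference edge. That keeps the typed layer additive rather than blocking.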

None of this rewrites what’s already shipped. Current reference edges stay; governed typed semantics layer on top.

Shipping the graph before the ontology wasn’t accidental. Typed edges in an agent system are hard to get right in isolation — the model invents predicates that look useful on paper but don’t hold up in recall, or it commits to an ontology that doesn’t match the data’s real shape. Shipping untyped edges first gave us the corpus to measure against before committing to a type vocabulary. The deterministic emitters, in particular, were built so we could measure link density and recall quality before asking the planner to commit to semantics.

Today’s graph, concretely:

  • Edge storage. wiki_page_links(from_page_id, to_page_id, kind, context, created_at) with a unique index on (from_page_id, to_page_id, kind), so replaying the same write does not create duplicate rows.
  • Edge kinds shipped. reference, parent_of, child_of.
  • Edge context tags. deterministic:city:<slug>, deterministic:journal:<slug>, co_mention:<memory_unit_id>, plus planner-emitted free-text rationale.
  • Ontology storage. ontology_entity_types, ontology_relationship_types, ontology_facet_templates, ontology_external_mappings, ontology_change_sets, and ontology_reprocess_jobs carry the governed typed vocabulary and application ledger.
  • Density metrics. links_written_deterministic, links_written_co_mention, and duplicate_candidates_count on every compile job — see docs/metrics/wiki-link-density.md for the baseline snapshots.
  • Rollback path. A single DELETE ... WHERE context LIKE 'deterministic:%' drops every deterministic edge without touching planner-written references. The same pattern applies to co-mention edges via their co_mention: prefix.
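
The storage and rollback items above can be tried end to end with a small in-memory sketch. It uses SQLite purely to stay self-contained, which is an assumption: the column types, the index name, and the write_edge and rollback_edges helpers are hypothetical, and the production database may use a different conflict clause than SQLite's INSERT OR IGNORE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wiki_page_links (
    from_page_id TEXT NOT NULL,
    to_page_id   TEXT NOT NULL,
    kind         TEXT NOT NULL,
    context      TEXT,
    created_at   TEXT DEFAULT CURRENT_TIMESTAMP
);
-- The unique index is what makes replayed writes idempotent.
CREATE UNIQUE INDEX wiki_page_links_uniq
    ON wiki_page_links (from_page_id, to_page_id, kind);
""")


def write_edge(from_id, to_id, kind, context):
    """Insert an edge; a replay of the same (from, to, kind) is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO wiki_page_links "
        "(from_page_id, to_page_id, kind, context) VALUES (?, ?, ?, ?)",
        (from_id, to_id, kind, context),
    )


def rollback_edges(context_prefix):
    """Delete every edge whose context starts with the prefix; return the count."""
    cur = conn.execute(
        "DELETE FROM wiki_page_links WHERE context LIKE ?",
        (context_prefix + "%",),
    )
    return cur.rowcount
```

Calling rollback_edges("deterministic:") removes only the deterministic class and leaves co-mention and planner-written rows in place. One detail to keep in mind when reusing the pattern for co_mention: the underscore is a single-character wildcard in LIKE, so the prefix still matches as intended but is slightly broader than a literal comparison.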