TencentDB Agent Memory — Four-Tier Hierarchical Memory for AI Agents

URL: https://github.com/Tencent/TencentDB-Agent-Memory
Maintainer: Tencent · License: MIT
Stack: TypeScript (83.6%), Python, Shell
Backend: SQLite + sqlite-vec (zero-config default); Tencent Cloud Vector Database (TCVDB) optional

Core Argument

Brute-force history accumulation bloats context and burns tokens. Irreversible summarization loses traceability. TencentDB Agent Memory rejects both: it uses a four-tier hierarchical pipeline that progressively abstracts raw data upward — while preserving a deterministic path back to ground-truth evidence via node_id and result_ref links at every level.

Four-Tier Memory Pipeline

Tier | Name         | Contents                                          | Storage
L0   | Conversation | Raw dialogue and execution traces                 | Database
L1   | Atom         | Extracted atomic facts and key points             | Database + vector embeddings
L2   | Scenario     | Scene blocks grouping related patterns across L1 atoms | Markdown
L3   | Persona      | Synthesized user profile and long-term preferences | Markdown

Agents consult L3/L2 during normal operation; they can drill down to L0 raw evidence when needed. Every abstraction links back to its source — no information is ever permanently collapsed.
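The drill-down path can be sketched as a recursive walk over source links. This is an illustrative sketch, not the project's actual schema: the `MemoryNode` type, `sourceIds` field, and store contents are assumptions standing in for the node_id links the document describes.

```typescript
// Assumed shape: every abstraction keeps pointers (node_ids) to the
// evidence one tier down, so nothing is permanently collapsed.
interface MemoryNode {
  nodeId: string;
  tier: 0 | 1 | 2 | 3;
  content: string;
  sourceIds: string[]; // node_ids one tier down (empty at L0)
}

// Resolve any abstraction back to its L0 ground-truth evidence.
function traceToGroundTruth(id: string, store: Map<string, MemoryNode>): MemoryNode[] {
  const node = store.get(id);
  if (!node) return [];
  if (node.tier === 0) return [node];
  return node.sourceIds.flatMap((src) => traceToGroundTruth(src, store));
}

// Hypothetical three-tier chain: raw turn -> atomic fact -> persona claim.
const store = new Map<string, MemoryNode>([
  ["turn-1", { nodeId: "turn-1", tier: 0, content: "user: I prefer dark mode", sourceIds: [] }],
  ["fact-1", { nodeId: "fact-1", tier: 1, content: "prefers dark mode", sourceIds: ["turn-1"] }],
  ["persona-1", { nodeId: "persona-1", tier: 3, content: "UI: dark themes", sourceIds: ["fact-1"] }],
]);

const evidence = traceToGroundTruth("persona-1", store);
// evidence resolves to the single tier-0 node "turn-1"
```

The point of the sketch: an L3 persona claim is never a dead end; it is always a finite chain of node_ids terminating in raw L0 turns.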

Two Pillar Technologies

1. Memory Layering

Heterogeneous storage strategy: raw logs in databases for retrieval robustness; persona and scenario layers in human-readable Markdown for inspectability and token efficiency. Every layer maintains node_id / result_ref pointers for lossless recovery.

2. Symbolic Memory — Mermaid Canvas

Task state is encoded as Mermaid syntax diagrams rather than verbose prose:

  • Only lightweight task maps stay in the context window
  • Verbose intermediate logs are offloaded to external files, referenced by node_id
  • LLMs can parse Mermaid precisely; token cost is a fraction of equivalent natural language

This is the distinctive technical contribution: using a structured diagram language as a compact, LLM-parseable short-term context representation.
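A minimal sketch of that idea: serialize a task graph to Mermaid so only the compact diagram occupies the context window, with each node's verbose logs offloaded under its node_id. The `TaskNode` type and `toMermaid` helper are hypothetical, not the project's API.

```typescript
// Assumed task-state shape; nodeId doubles as the key into an
// external log store where verbose intermediate output lives.
interface TaskNode {
  nodeId: string;
  label: string;
  status: "done" | "active" | "pending";
  next: string[]; // nodeIds of downstream tasks
}

// Emit a Mermaid flowchart: one declaration per node, one edge per link.
function toMermaid(nodes: TaskNode[]): string {
  const lines = ["flowchart TD"];
  for (const n of nodes) {
    lines.push(`  ${n.nodeId}["${n.label} (${n.status})"]`);
    for (const next of n.next) lines.push(`  ${n.nodeId} --> ${next}`);
  }
  return lines.join("\n");
}

const graph = toMermaid([
  { nodeId: "t1", label: "clone repo", status: "done", next: ["t2"] },
  { nodeId: "t2", label: "reproduce bug", status: "active", next: [] },
]);
// graph is a few-line Mermaid diagram; full execution logs stay on disk,
// retrievable by nodeId when the agent needs to drill down.
```

The design choice: Mermaid's node/edge grammar is rigid enough for an LLM to parse reliably, yet far cheaper in tokens than narrating the same task state in prose.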

Retrieval

Hybrid retrieval: BM25 keyword search and vector-embedding similarity run in parallel, and their ranked lists are merged with RRF (Reciprocal Rank Fusion). Tools exposed: tdai_memory_search, tdai_conversation_search.
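RRF itself is simple enough to show in a few lines. This sketch uses the conventional damping constant k = 60 from the original RRF formulation; the document does not say which value TencentDB uses, so treat it as an assumption.

```typescript
// Reciprocal Rank Fusion: each ranked list contributes 1 / (k + rank)
// per document; summing across lists rewards documents that rank well
// under both BM25 and vector search.
function rrfFuse(rankings: string[][], k = 60): [string, number][] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((docId, rank) => {
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]);
}

// Hypothetical result lists: "b" places high in both, so fusion ranks it first.
const bm25Hits = ["a", "b", "c"];
const vectorHits = ["b", "d", "a"];
const fused = rrfFuse([bm25Hits, vectorHits]);
// fused[0][0] === "b"
```

Note that fusion operates purely on ranks, not raw scores, which is why RRF needs no calibration between BM25 scores and cosine similarities.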

Lifecycle

  1. Capture — conversations and tool outputs logged automatically
  2. Extraction (L1) — every N turns: atomic facts extracted, embedded, stored
  3. Aggregation (L2) — periodic: scenario patterns identified across L1 atoms
  4. Personalization (L3) — every N new memories: user persona synthesized/updated
  5. Recall — before each turn: relevant memories injected via hybrid search
  6. Compression (optional) — verbose logs offloaded; task state encoded as Mermaid graphs
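The cadence above can be sketched as a small scheduling function. The trigger values are placeholders: the document only says "every N turns" and "every N new memories", so the specific numbers here are assumptions.

```typescript
// Hypothetical trigger thresholds (the real values of N are unspecified).
const lifecycle = {
  extractEveryTurns: 10,     // step 2: L1 atomic-fact extraction
  personaEveryMemories: 50,  // step 4: L3 persona refresh
};

// Decide which pipeline steps are due on a given turn.
function dueSteps(turn: number, newMemoriesSinceL3: number): string[] {
  const steps: string[] = ["recall"]; // step 5 runs before every turn
  if (turn % lifecycle.extractEveryTurns === 0) steps.push("extract-L1");
  if (newMemoriesSinceL3 >= lifecycle.personaEveryMemories) steps.push("update-L3");
  return steps;
}

const steps = dueSteps(10, 3);
// On turn 10 with few new memories: ["recall", "extract-L1"]
```

Capture (step 1) and optional compression (step 6) are omitted here since they are event-driven rather than turn-counted.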

Performance (measured on continuous long-horizon sessions, 50 consecutive SWE-bench tasks)

Metric                         | Baseline | With TencentDB Memory | Change
Token usage (WideSearch)       | 221.31M  | 85.64M                | −61.38%
Task success rate (WideSearch) | 33%      | 50%                   | +51.52% relative
PersonaMem accuracy            | 48%      | 76%                   | +28 pts

Integrations

  • OpenClaw — plugin-based; automatic memory capture, extraction, recall
  • Hermes — Docker gateway adapter with standalone LLM mode
  • Community support via Discord; issues addressed within 24 hours

Key Takeaways

  • Four-tier pyramid is the most structured memory hierarchy of any entry in this wiki — contrast with Mem0’s flat ADD-only model or Dakera’s four typed buckets (episodic/semantic/procedural/working)
  • Mermaid canvas for symbolic short-term context is genuinely novel: no other reviewed system uses a diagram language as a token-efficient context encoding
  • White-box design (Markdown + Mermaid artifacts at ~/.openclaw/memory-tdai/) is a strong differentiator — memory is inspectable and editable without tooling
  • Lossless traceability via node_id / result_ref solves the core failure mode of summarization-based systems
  • 61% token reduction benchmark is striking but measured on Tencent’s own framework integration (OpenClaw) — independent replication needed
  • MIT + Tencent backing suggests production intent, not just research prototype