TencentDB Agent Memory — Four-Tier Hierarchical Memory for AI Agents
- URL: https://github.com/Tencent/TencentDB-Agent-Memory
- Maintainer: Tencent · License: MIT
- Stack: TypeScript (83.6%), Python, Shell
- Backend: SQLite + sqlite-vec (zero-config default); Tencent Cloud Vector Database (TCVDB) optional
Core Argument
Brute-force history accumulation bloats context and burns tokens. Irreversible summarization loses traceability. TencentDB Agent Memory rejects both: it uses a four-tier hierarchical pipeline that progressively abstracts raw data upward — while preserving a deterministic path back to ground-truth evidence via node_id and result_ref links at every level.
Four-Tier Memory Pipeline
| Tier | Name | Contents | Storage |
|---|---|---|---|
| L0 | Conversation | Raw dialogue and execution traces | Database |
| L1 | Atom | Extracted atomic facts and key points | Database + vector embeddings |
| L2 | Scenario | Scene blocks grouping related patterns across L1 atoms | Markdown |
| L3 | Persona | Synthesized user profile and long-term preferences | Markdown |
Agents consult L3/L2 during normal operation; they can drill down to L0 raw evidence when needed. Every abstraction links back to its source — no information is ever permanently collapsed.
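The tier-to-tier linkage can be sketched as plain record types. Only `node_id` and `result_ref` come from the project's description; the field shapes and names below are illustrative assumptions, not the actual schema.

```typescript
// Hypothetical shapes for the four tiers; every abstract node keeps
// result_ref pointers back to the tier below, so nothing is collapsed.

interface L0Record {
  node_id: string;                      // raw dialogue / tool-trace row
  role: "user" | "assistant" | "tool";
  content: string;
}

interface L1Atom {
  node_id: string;
  fact: string;                         // extracted atomic fact
  embedding?: number[];                 // vector for hybrid retrieval
  result_ref: string[];                 // L0 evidence it was extracted from
}

interface L2Scenario {
  node_id: string;
  markdown: string;                     // human-readable scene block
  result_ref: string[];                 // L1 atoms grouped into this scenario
}

interface L3Persona {
  node_id: string;
  markdown: string;                     // synthesized profile / preferences
  result_ref: string[];                 // L2 scenarios supporting each claim
}

const atom: L1Atom = {
  node_id: "atom-3",
  fact: "user prefers concise replies",
  result_ref: ["conv-41"],              // deterministic path to ground truth
};
```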
Two Pillar Technologies
1. Memory Layering
Heterogeneous storage strategy: raw logs in databases for retrieval robustness; persona and scenario layers in human-readable Markdown for inspectability and token efficiency. Every layer maintains node_id / result_ref pointers for lossless recovery.
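A minimal sketch of what "lossless recovery" means in practice: follow `result_ref` chains from any abstract node until raw L0 records are reached. The in-memory maps stand in for the real SQLite/Markdown backends, and all identifiers are made up for illustration.

```typescript
// refs: node_id -> the result_ref list of that node (L3 -> L2 -> L1 -> L0)
const refs = new Map<string, string[]>([
  ["persona-1", ["scenario-7"]],
  ["scenario-7", ["atom-3", "atom-9"]],
  ["atom-3", ["conv-41"]],
  ["atom-9", ["conv-52"]],
]);

// l0: node_id -> raw ground-truth content
const l0 = new Map<string, string>([
  ["conv-41", "user: please keep replies concise"],
  ["conv-52", "user: I work in UTC+8"],
]);

// Recursively drill down until every path bottoms out at L0 evidence.
function resolveEvidence(nodeId: string): string[] {
  if (l0.has(nodeId)) return [l0.get(nodeId)!];
  return (refs.get(nodeId) ?? []).flatMap(resolveEvidence);
}

const evidence = resolveEvidence("persona-1");
// -> both raw conversation lines behind the persona claim
```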
2. Symbolic Memory — Mermaid Canvas
Task state is encoded as Mermaid syntax diagrams rather than verbose prose:
- Only lightweight task maps stay in the context window
- Verbose intermediate logs are offloaded to external files, referenced by node_id
- LLMs can parse Mermaid precisely; token cost is a fraction of equivalent natural language
This is the distinctive technical contribution: using a structured diagram language as a compact, LLM-parseable short-term context representation.
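To make the idea concrete, here is a sketch of rendering a task map as a Mermaid flowchart string, with bulky logs replaced by `node_id` references. The task shape and status markers are assumptions; only the Mermaid-as-context idea comes from the project.

```typescript
type TaskStatus = "done" | "active" | "pending";

interface TaskNode {
  id: string;
  label: string;        // short description; logs live externally by node_id
  status: TaskStatus;
  deps: string[];       // ids of prerequisite tasks
}

// Serialize task state into a compact Mermaid flowchart the agent can
// keep in its context window instead of verbose prose.
function toMermaid(tasks: TaskNode[]): string {
  const lines = ["graph TD"];
  for (const t of tasks) {
    const mark = t.status === "done" ? "✓ " : t.status === "active" ? "▶ " : "";
    lines.push(`  ${t.id}["${mark}${t.label}"]`);
    for (const d of t.deps) lines.push(`  ${d} --> ${t.id}`);
  }
  return lines.join("\n");
}

const canvas = toMermaid([
  { id: "n1", label: "reproduce bug (log: node_id=trace-17)", status: "done", deps: [] },
  { id: "n2", label: "write failing test", status: "active", deps: ["n1"] },
  { id: "n3", label: "patch + rerun suite", status: "pending", deps: ["n2"] },
]);
```

The resulting few-line diagram carries the full task topology, while each node's verbose trace stays on disk behind its `node_id`.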
Retrieval
Hybrid: BM25 keyword + vector embeddings + RRF (Reciprocal Rank Fusion). Tools exposed: tdai_memory_search, tdai_conversation_search.
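Reciprocal Rank Fusion is a standard technique: each document scores Σ 1/(k + rank) across the input rankings. A minimal sketch of fusing a BM25 list with a vector-search list follows; `k = 60` is the conventional default, not a documented project setting.

```typescript
// Merge multiple ranked lists of memory ids with Reciprocal Rank Fusion.
// rank is 1-based; a doc appearing high in several lists scores highest.
function rrfFuse(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranked of rankings) {
    ranked.forEach((docId, i) => {
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}

// m2 leads both lists, so it wins; m7 appears in both and beats
// the single-list entries m4 and m9.
const fused = rrfFuse([
  ["m2", "m7", "m9"],   // BM25 keyword order
  ["m2", "m4", "m7"],   // vector-similarity order
]);
// -> ["m2", "m7", ...]
```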
Lifecycle
- Capture — conversations and tool outputs logged automatically
- Extraction (L1) — every N turns: atomic facts extracted, embedded, stored
- Aggregation (L2) — periodic: scenario patterns identified across L1 atoms
- Personalization (L3) — every N new memories: user persona synthesized/updated
- Recall — before each turn: relevant memories injected via hybrid search
- Compression (optional) — verbose logs offloaded; task state encoded as Mermaid graphs
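The cadence above can be sketched as a simple dispatcher. The interval values are placeholders: the project states "every N turns" and "every N new memories" without fixing N, and periodic L2 aggregation (timer-driven) is omitted here.

```typescript
interface Cadence {
  extractEveryTurns: number;     // L1 extraction interval (assumed config)
  personaEveryMemories: number;  // L3 re-synthesis threshold (assumed config)
}

// Decide which lifecycle stages are due at a given turn. Recall always
// runs before the turn; extraction and personalization fire on cadence.
function dueStages(turn: number, newMemories: number, c: Cadence): string[] {
  const stages = ["recall"];
  if (turn % c.extractEveryTurns === 0) stages.push("extraction");        // L1
  if (newMemories >= c.personaEveryMemories) stages.push("personalization"); // L3
  return stages;
}

const stages = dueStages(10, 3, { extractEveryTurns: 5, personaEveryMemories: 20 });
// -> ["recall", "extraction"]
```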
Performance (measured on continuous long-horizon sessions, 50 consecutive SWE-bench tasks)
| Metric | Baseline | With TencentDB Memory | Change |
|---|---|---|---|
| Token usage (WideSearch) | 221.31M | 85.64M | −61.38% |
| Task success rate (WideSearch) | 33% | 50% | +51.52% relative |
| PersonaMem accuracy | 48% | 76% | +28 pts |
Integrations
- OpenClaw — plugin-based; automatic memory capture, extraction, recall
- Hermes — Docker gateway adapter with standalone LLM mode
- Community support via Discord; issues addressed within 24 hours
Key Takeaways
- Four-tier pyramid is the most structured memory hierarchy of any entry in this wiki — contrast with Mem0’s flat ADD-only model or Dakera’s four typed buckets (episodic/semantic/procedural/working)
- Mermaid canvas for symbolic short-term context is genuinely novel: no other reviewed system uses a diagram language as a token-efficient context encoding
- White-box design (Markdown + Mermaid artifacts at ~/.openclaw/memory-tdai/) is a strong differentiator — memory is inspectable and editable without tooling
- Lossless traceability via node_id/result_ref solves the core failure mode of summarization-based systems
- 61% token reduction benchmark is striking but measured on Tencent’s own framework integration (OpenClaw) — independent replication needed
- MIT + Tencent backing suggests production intent, not just research prototype