# Agent Memory Systems: Comparative Overview
Six systems reviewed, spanning the design space: local file-based, cloud API (two variants), self-hosted infrastructure, full agent framework, and hierarchical pipeline approaches.
## Quick Reference
| System | Model | Deployment | License | Stars |
|---|---|---|---|---|
| Link | Local file / MCP server | Local only | — | — |
| Mem0 | Cloud API / SDK | Cloud + self-hosted | Open-source | 55.7k |
| Supermemory | Cloud API | Cloud only | Proprietary | — |
| Dakera | Self-hosted binary | Self-hosted (alpha) | Open-core (MIT SDKs) | — |
| Letta | Agent framework | Local + cloud | Apache-2.0 | 22.7k |
| TencentDB | Library / plugin | Local (SQLite default) | MIT | — |
## Architectural Approach
Each system makes a different foundational bet on where memory should live and how it should be structured.
### Link — Local Wiki + MCP
Memory as a human-readable knowledge base. Raw sources → wiki pages → MCP query packets. Every memory is source-backed and inspectable. Agents query via MCP protocol; no cloud dependency. Best fit when auditability and data sovereignty are non-negotiable.
Signature idea: Three-layer architecture (raw → wiki → MCP packets) where provenance is first-class.
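To make the MCP layer concrete, here is a minimal sketch of the JSON-RPC 2.0 request an agent might send. Only the envelope follows the MCP spec; the tool name `query_memory` and its arguments are hypothetical, not Link's actual schema.

```python
import json

# Hypothetical MCP tools/call request an agent might send to a Link server.
# The tool name "query_memory" and its argument names are illustrative;
# only the JSON-RPC 2.0 envelope is prescribed by the MCP spec.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "query_memory",
        "arguments": {"query": "project deadlines", "max_results": 5},
    },
}

wire = json.dumps(request)  # serialized form sent over the transport
print(wire)
```

The server would answer with a matching JSON-RPC response carrying the query packet (wiki excerpts plus their source links), which is what makes every retrieved memory traceable.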
### Mem0 — Cloud Memory API
Plug-in memory for any existing agent. Hybrid retrieval (semantic + BM25 + entity). ADD-only memory model — facts accumulate without deletion. April 2026 algorithm update added temporal re-ranking. Python/JS SDK makes it the lowest-friction option to adopt.
Signature idea: ADD-only accumulation with entity linking; strongest benchmark scores (91.6 LoCoMo, 94.8 LongMemEval).
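The ADD-only model can be illustrated with a toy store: facts only ever accumulate, and retrieval scores decide what surfaces. This is a sketch of the idea, not Mem0's implementation; the naive keyword overlap below stands in for its hybrid semantic + BM25 + entity pipeline.

```python
from collections import Counter

class AddOnlyMemory:
    """Toy ADD-only store: facts accumulate and are never deleted.
    Keyword overlap here is a stand-in for Mem0's hybrid retrieval."""

    def __init__(self):
        self.facts = []  # append-only list; no update or delete path exists

    def add(self, fact: str):
        self.facts.append(fact)

    def search(self, query: str, k: int = 3):
        q = Counter(query.lower().split())
        scored = [
            (sum((q & Counter(f.lower().split())).values()), f)
            for f in self.facts
        ]
        return [f for s, f in sorted(scored, key=lambda t: -t[0]) if s > 0][:k]

m = AddOnlyMemory()
m.add("Alice prefers dark mode")
m.add("Alice moved to Berlin")
m.add("Bob likes espresso")
print(m.search("where does Alice live"))
```

Note what the toy makes visible: nothing about Alice is ever retracted, so contradictory or stale facts coexist and only ranking keeps them apart — the trade-off discussed under Memory Mutability below.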
### Supermemory — Vector Graph API
Cloud-hosted memory with an ontology-aware graph engine — typed edges between entities, not just similarity scores. Three memory modes (Memory API, User Profiles, RAG) all sharing one context pool via containerTag scoping. Sub-300ms retrieval.
Signature idea: Ontology-aware graph edges encode why memories are related, not just that they are similar.
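A rough sketch of how containerTag scoping ties the three modes to one pool: a write and a read that share a tag hit the same memories. The field names and payload shapes below are assumptions for illustration, not Supermemory's documented API.

```python
# Hypothetical request payloads illustrating containerTag scoping: all three
# memory modes read from one shared pool, filtered by the same tag. Field
# names are assumptions, not Supermemory's actual schema.

def add_memory(content: str, container_tag: str) -> dict:
    return {"content": content, "containerTags": [container_tag]}

def search_memories(query: str, container_tag: str) -> dict:
    return {"q": query, "containerTags": [container_tag], "limit": 10}

write = add_memory("User prefers concise answers", container_tag="user_42")
read = search_memories("answer style", container_tag="user_42")
assert write["containerTags"] == read["containerTags"]  # same scope, same pool
```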
### Dakera — Self-Hosted Single Binary
Collapses the typical 3–5-service agent memory stack into one 44MB Rust binary with built-in Candle embeddings (no OpenAI API required). Four agent-specific memory types (episodic, semantic, procedural, working) with automatic importance decay. Raft consensus clustering for horizontal scale.
Signature idea: Built-in inference via Candle — embeddings never leave the server; zero external API dependencies.
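Automatic importance decay can be sketched as follows. Dakera's actual decay curve is not documented here, so the exponential half-life form below is an assumption chosen for illustration.

```python
import math

# Importance decay with a configurable half-life. The exponential form and
# the 30-day default are assumptions for illustration, not Dakera's
# documented parameters.
def decayed_importance(initial: float, age_days: float,
                       half_life_days: float = 30.0) -> float:
    return initial * math.exp(-math.log(2) * age_days / half_life_days)

fresh = decayed_importance(1.0, age_days=0)
month = decayed_importance(1.0, age_days=30)
print(fresh, month)  # 1.0 and ~0.5: importance halves every 30 days
```

The practical effect is that old, never-reinforced memories fall below retrieval thresholds on their own, with no explicit deletion step.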
### Letta — Full Agent Framework (MemGPT)
Not a memory API to bolt onto agents — it is the agent. Agents have a stable UUID, persistent core/archival/recall memory blocks, and improve continuously via sleep-time compute: background dream agents refine prompts and consolidate memories during idle time. Based on Berkeley MemGPT research.
Signature idea: Sleep-time compute — memory improvement happens asynchronously, not just at retrieval time.
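The sleep-time idea can be sketched as a background consolidation pass over the archival store: during idle time, near-duplicate memories are merged so the store stays compact. This is a toy illustration of the concept, not Letta's dream-agent code, which uses LLM calls rather than string matching.

```python
# Toy sleep-time consolidation: merge near-duplicate memories while the
# agent is idle. A stand-in for Letta's LLM-driven dream agents.

def consolidate(memories: list[str]) -> list[str]:
    seen: dict[frozenset, str] = {}
    for mem in memories:
        key = frozenset(mem.lower().split())  # case/order-insensitive key
        # keep the longer (more detailed) phrasing of near-duplicates
        if key not in seen or len(mem) > len(seen[key]):
            seen[key] = mem
    return list(seen.values())

archival = [
    "user likes hiking",
    "User likes hiking",          # duplicate up to casing
    "user prefers morning calls",
]
compacted = consolidate(archival)
print(compacted)
```

The key property is asynchrony: the consolidation runs between conversations, so retrieval-time latency is unaffected.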
### TencentDB Agent Memory — Hierarchical Pipeline
Four-tier progressive abstraction: L0 raw logs → L1 atomic facts → L2 scenario patterns → L3 user persona. Short-term task context encoded as Mermaid diagrams rather than prose — compact, LLM-parseable, token-efficient. Full traceability: every persona links back to raw evidence via node_id.
Signature idea: Mermaid canvas as symbolic short-term context; lossless traceability from abstraction to raw source.
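To show why a Mermaid canvas is token-efficient, here is a sketch that renders task state as a flowchart string instead of prose. The node-naming scheme is an assumption for illustration; the point is the compact, LLM-parseable symbolic form.

```python
# Encode short-term task state as a Mermaid flowchart rather than prose.
# The node naming (n0, n1, ...) is illustrative, not TencentDB's scheme.

def to_mermaid(steps: list[tuple[str, str]]) -> str:
    lines = ["flowchart TD"]
    for i, (step, status) in enumerate(steps):
        lines.append(f'    n{i}["{step} ({status})"]')
        if i > 0:
            lines.append(f"    n{i-1} --> n{i}")
    return "\n".join(lines)

diagram = to_mermaid([
    ("reproduce bug", "done"),
    ("write failing test", "done"),
    ("patch parser", "in progress"),
])
print(diagram)
```

A few dozen tokens capture ordering, status, and dependencies that a prose summary would spend several sentences on, and the structure round-trips cleanly through an LLM.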
## Key Design Dimensions
### Memory Mutability
| System | Model |
|---|---|
| Mem0 | ADD-only — facts accumulate, nothing deleted |
| Link | Editable wiki — human in the loop |
| Supermemory | Evolving (Memory API handles temporal drift) |
| Dakera | Decay — importance weight degrades over time |
| Letta | Active management — agents read/write/search their own blocks |
| TencentDB | Progressive abstraction — raw logs kept; higher layers synthesized |
Decay (Dakera) and active self-management (Letta) are architecturally sounder for long-lived agents than ADD-only (Mem0), because stale facts eventually stop surfacing or get consolidated rather than accumulating indefinitely.
### Privacy & Data Ownership
| System | Data leaves your infra? |
|---|---|
| Link | Never — local only |
| Dakera | Never — self-hosted, built-in embeddings |
| TencentDB | Never — SQLite local default |
| Letta | Optional — self-hosted or Letta Cloud |
| Mem0 | Yes (cloud) or optional (self-hosted server) |
| Supermemory | Yes — cloud only |
### Inspectability (can a human read the stored memory?)
| System | Inspectability |
|---|---|
| Link | High — Markdown wiki pages |
| TencentDB | High — Markdown + Mermaid files at ~/.openclaw/memory-tdai/ |
| Letta | Medium — “Memory Palace” UI; structured blocks |
| Dakera | Low — binary engine, dashboard available |
| Mem0 | Low — cloud opaque |
| Supermemory | Low — cloud opaque |
### Retrieval Strategy
| System | Method |
|---|---|
| Mem0 | Semantic + BM25 + entity linking |
| Supermemory | Vector graph (ontology-aware) + keyword |
| Dakera | HNSW + BM25 + temporal re-ranking |
| TencentDB | BM25 + vector + RRF fusion |
| Letta | Semantic search over archival/recall blocks |
| Link | MCP query packets (agent-facing) |
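The RRF fusion in TencentDB's row is the standard reciprocal rank fusion formula: each document scores the sum over rankings of 1 / (k + rank), with the conventional constant k = 60. A minimal sketch:

```python
# Standard reciprocal rank fusion (RRF): merge multiple ranked lists
# (e.g. BM25 and vector) by summing 1 / (k + rank) per document.

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["doc_a", "doc_b", "doc_c"]
vector = ["doc_c", "doc_a", "doc_d"]
print(rrf([bm25, vector]))
```

Because RRF uses only ranks, not raw scores, it needs no calibration between the BM25 and vector scorers — which is precisely why it is a common choice for hybrid pipelines.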
## When to Use What
Use Link when you want a local, auditable knowledge base that human and agent can both read and edit. Best for personal/team wikis where provenance matters more than latency.
Use Mem0 when you need the fastest path to adding memory to an existing agent codebase. Best benchmark scores; Python/JS SDK; cloud handles ops. Accept the ADD-only model and cloud privacy trade-off.
Use Supermemory when relationship structure between memory items matters — the graph engine with typed edges gives richer retrieval than flat vector stores. Cloud-only, so accept the privacy trade-off.
Use Dakera when you want Mem0-like features but self-hosted and with no external API dependencies at all. Single-binary simplicity is the key win; still alpha, so evaluate benchmarks independently.
Use Letta when memory should be the agent’s own responsibility, not an external service. The right choice for long-lived, personalized agents where sleep-time compute and continuous self-improvement are design goals. More opinionated — you build with Letta, not add it to an existing agent.
Use TencentDB Agent Memory when you need maximum token efficiency and full traceability in long-horizon, multi-session tasks. The Mermaid symbolic context encoding is uniquely suited to complex task pipelines (e.g., SWE-bench-style coding agents). Best choice if white-box debuggability is a requirement.
## Maturity & Risk
| System | Status | Risk |
|---|---|---|
| Mem0 | Production | Low — 55.7k stars, 319 releases, strong benchmarks |
| Letta | Production | Low — Apache-2.0, Berkeley research provenance, 22.7k stars |
| Link | Stable | Low — local-only, no external deps |
| TencentDB | Active dev | Medium — MIT + Tencent backing, but benchmarks self-reported |
| Supermemory | Production | Medium — cloud-only; vendor lock-in risk |
| Dakera | Public alpha | High — strong claims, independent verification needed |