LLM Memory Systems

Definition

Mechanisms that give LLM agents durable knowledge that persists across sessions. Without memory, every interaction starts from zero: the agent has no record of prior conversations, user preferences, or accumulated context.

Problem Space

  • LLMs are stateless: context window resets on each session
  • RAG (retrieval-augmented generation) addresses knowledge, but not personal/project-specific memory
  • Cloud memory solutions (e.g. OpenAI memory) create privacy and data-governance concerns
  • No standardized protocol for agents to query persistent memory across platforms
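The statelessness problem above can be made concrete with a minimal sketch: persist facts to a local file between sessions and prepend them to each prompt, so the model itself stays stateless while the agent does not. The file name and record fields here are illustrative, not drawn from any particular tool.

```python
import json
from pathlib import Path

MEMORY_FILE = Path("agent_memory.json")  # hypothetical local store

def load_memory() -> list:
    """Reload memories persisted by earlier sessions; empty on first run."""
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return []

def remember(fact: str, source: str) -> None:
    """Append a fact and persist it so the next session can see it."""
    memories = load_memory()
    memories.append({"fact": fact, "source": source})
    MEMORY_FILE.write_text(json.dumps(memories, indent=2))

def build_prompt(user_message: str) -> str:
    """Prepend stored memories to the prompt; the LLM call itself is unchanged."""
    context = "\n".join(
        f"- {m['fact']} (source: {m['source']})" for m in load_memory()
    )
    return f"Known context:\n{context}\n\nUser: {user_message}"
```

Everything beyond this pattern (retrieval ranking, summarization, structured query) is what the approaches in the next section add on top.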

Solution Approaches

| Approach | Examples | Trade-offs |
|---|---|---|
| Cloud-hosted memory | OpenAI memory, Mem0, Supermemory | Convenient; raises privacy/data concerns |
| Self-hosted infrastructure | Dakera | Private, no vendor lock-in; single binary, but ops discipline required |
| Full agent framework | Letta (MemGPT) | Memory is first-class; agents self-improve via sleep-time compute |
| Hierarchical pipeline | TencentDB Agent Memory | 4-tier L0–L3 abstraction; Mermaid symbolic context; lossless traceability |
| Local file-based memory | link-local-llm-memory, Claude Code memory | Private, auditable; requires local tooling |
| Vector database | Pinecone, Chroma | High-recall retrieval; opaque, hard to inspect |
| Structured wiki/graph | Link, Obsidian + MCP | Human-readable, source-backed; higher setup cost |

Key Design Dimensions

  • Locality: cloud vs. local
  • Provenance: does each memory record its source?
  • Queryability: free-text search, structured query, or MCP protocol
  • Auditability: can the user inspect and edit stored memories?
  • Agent contract: how do agents write to and read from memory?
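Several of these dimensions (provenance, queryability, auditability) reduce to what a single memory record carries. A sketch of such a record as a Python dataclass; the field names are illustrative assumptions, not a standard schema used by any of the tools above.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class MemoryRecord:
    """One auditable memory entry; field names are illustrative."""
    content: str   # the remembered fact
    source: str    # provenance: where this fact came from
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    tags: list = field(default_factory=list)  # enables structured query
    editable: bool = True                     # user may inspect and revise

rec = MemoryRecord(
    content="Project uses PostgreSQL 16",
    source="chat transcript, 2025-03-02",
    tags=["project", "infra"],
)
```

Because the record is plain data, `asdict(rec)` can serialize it for any backend, which is what makes the locality dimension (cloud vs. local) largely independent of the record design.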

Emerging Standard: MCP

The Model Context Protocol (MCP) is becoming the standard interface for agents to access external tools, including memory systems. Tools like Link expose memory as MCP servers, allowing any MCP-compatible agent (Claude, Cursor, Copilot) to read and write memory using a unified protocol.
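Under MCP, a memory read is just a JSON-RPC 2.0 `tools/call` request from the client to the server. The envelope below follows the protocol's request shape, but the `memory_search` tool name and its arguments are hypothetical: each memory server defines its own tool schema.

```python
import json

# JSON-RPC 2.0 request an MCP client sends to invoke a server tool.
# "memory_search" and its arguments are hypothetical; the envelope
# (jsonrpc / id / method / params) is what the protocol fixes.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "memory_search",
        "arguments": {"query": "user editor preferences", "limit": 5},
    },
}

wire = json.dumps(request)  # serialized over the stdio or HTTP transport
```

Because the envelope is fixed, any MCP-compatible agent can issue this call without knowing which memory backend sits behind the server.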