Letta — Stateful Agent Framework with Persistent Memory

URL: https://letta.com/ (redirects from letta.ai)
GitHub: https://github.com/letta-ai/letta (22.7k stars, Apache-2.0)
Formerly: MemGPT (research from UC Berkeley Sky Computing Lab)
Stack: Python (99.5%), Docker, Alembic migrations; CLI via Node.js 18+

Core Argument

Standard LLM APIs are stateless — each session starts from zero. Letta’s position is that memory shouldn’t be bolted onto agents as an afterthought; agents themselves should be the persistent unit. A Letta agent has a stable identity (UUID), its own memory, and improves continuously over time — even while idle.

Critical Distinction from Memory APIs

Mem0, Supermemory, and Dakera are memory APIs you plug into your own agents. Letta is a full agent framework where statefulness and memory are first-class primitives — you build agents with Letta, not add memory to agents built elsewhere.

Memory Architecture

Agents have structured memory blocks:

Block             Description
Core memory       Always in-context: agent persona and user model
Archival memory   Long-term, searchable storage for facts beyond the context window
Recall memory     Conversation history; queryable via search

Agents manage their own memory actively — they can read, write, and search their memory blocks as tool calls during inference, rather than relying on external retrieval pipelines.
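A minimal sketch of what "memory as tool calls" could look like, assuming the three tiers in the table above. This is not Letta's actual API: the class AgentMemory, the tool names, and the dispatch_tool_call helper are all hypothetical, and a naive substring match stands in for real search.

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Toy model of the three memory tiers (all names are hypothetical)."""
    core: dict[str, str] = field(default_factory=dict)   # always in-context
    archival: list[str] = field(default_factory=list)    # long-term facts
    recall: list[str] = field(default_factory=list)      # conversation history

    def core_memory_replace(self, label: str, value: str) -> str:
        # Overwrite one labeled core block, e.g. "persona" or "human".
        self.core[label] = value
        return f"core[{label}] updated"

    def archival_insert(self, text: str) -> str:
        self.archival.append(text)
        return "archived"

    def archival_search(self, query: str) -> list[str]:
        # Naive case-insensitive substring match stands in for semantic search.
        q = query.lower()
        return [t for t in self.archival if q in t.lower()]

    def recall_search(self, query: str) -> list[str]:
        q = query.lower()
        return [m for m in self.recall if q in m.lower()]


def dispatch_tool_call(memory: AgentMemory, name: str, arguments: dict):
    """Route a model-emitted tool call to the matching memory operation."""
    tools = {
        "core_memory_replace": memory.core_memory_replace,
        "archival_insert": memory.archival_insert,
        "archival_search": memory.archival_search,
        "recall_search": memory.recall_search,
    }
    if name not in tools:
        raise ValueError(f"unknown tool: {name}")
    return tools[name](**arguments)
```

In this sketch the model decides when to call archival_insert or archival_search during inference; no separate retrieval pipeline sits between the agent and its memory.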

Sleep-Time Compute (“Dream Agents”)

The key research contribution: background agents run during idle periods, refining prompts, consolidating memories, and improving skills without blocking live inference. The idea extends the MemGPT line of research: agents can spend off-peak compute on self-improvement, analogous to how humans consolidate memories during sleep.
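The idle-time consolidation idea can be illustrated with a small sketch. Everything here is an assumption made for illustration: AgentState, extract_facts, sleep_time_step, and run_when_idle are hypothetical names, and extract_facts is a trivial placeholder for what would in practice be an LLM call.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    recall: list[str] = field(default_factory=list)    # raw conversation turns
    archival: list[str] = field(default_factory=list)  # consolidated facts
    consolidated_upto: int = 0                          # index of first unprocessed turn


def extract_facts(turn: str) -> list[str]:
    """Placeholder for an LLM call: treat 'remember: X' turns as facts."""
    prefix = "remember:"
    if turn.lower().startswith(prefix):
        return [turn[len(prefix):].strip()]
    return []


def sleep_time_step(state: AgentState) -> int:
    """Process turns that arrived since the last pass; return the number of new facts."""
    added = 0
    for turn in state.recall[state.consolidated_upto:]:
        for fact in extract_facts(turn):
            if fact not in state.archival:  # skip duplicates
                state.archival.append(fact)
                added += 1
    state.consolidated_upto = len(state.recall)
    return added


def run_when_idle(state: AgentState, pending_requests: int) -> int:
    """Consolidate only when no live request is waiting, so inference is not blocked."""
    if pending_requests > 0:
        return 0
    return sleep_time_step(state)
```

The design point is the gate in run_when_idle: consolidation work is deferred whenever a live request is pending, and the consolidated_upto cursor makes repeated passes cheap because already-processed turns are skipped.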

Agent Identity & Portability

  • Each agent gets a stable UUID and persists as a server-side object
  • Memory, state, and learned behavior travel with the agent
  • Agents can be moved between devices and AI model providers without losing context
  • “Memory Palace” UI visualizes what the agent currently knows
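As a rough illustration of the portability claims above, the sketch below serializes an agent (UUID, model handle, memory) to JSON and restores it under a different model. It is not Letta's export format or API: AgentSnapshot, export_agent, import_agent, and switch_model are hypothetical, and the model handles are made up.

```python
import json
import uuid
from dataclasses import dataclass, field, asdict, replace


@dataclass(frozen=True)
class AgentSnapshot:
    agent_id: str                   # stable identity (UUID string)
    model: str                      # made-up handle such as "provider-a/model-x"
    core_memory: dict[str, str] = field(default_factory=dict)
    archival_memory: list[str] = field(default_factory=list)


def new_agent(model: str) -> AgentSnapshot:
    return AgentSnapshot(agent_id=str(uuid.uuid4()), model=model)


def export_agent(agent: AgentSnapshot) -> str:
    # Plain JSON so the snapshot can move between machines.
    return json.dumps(asdict(agent), sort_keys=True)


def import_agent(payload: str) -> AgentSnapshot:
    return AgentSnapshot(**json.loads(payload))


def switch_model(agent: AgentSnapshot, model: str) -> AgentSnapshot:
    # Only the model handle changes; identity and memory are carried over.
    return replace(agent, model=model)
```

Because the UUID and memory live in the snapshot rather than in any one model's session, swapping the model handle leaves the agent's identity and context intact.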

Deployment Modes

Mode                 Details
Letta Code CLI       npm install @letta-ai/letta-code; local agent runner
Desktop app          macOS, Windows, Linux
Self-hosted server   Docker + Python; full REST API
Letta Cloud          Managed hosting; bring-your-own API keys or use Letta’s
SDKs                 Python + TypeScript

Integrations

  • MCP protocol support
  • Built-in tools: web search, webpage fetch
  • Works with any LLM provider via API keys

Key Takeaways

  • Sleep-time compute is the architectural idea that separates Letta from all other entries in this wiki — memory improvement happens asynchronously, not just at retrieval time
  • Full agent framework vs. memory API is the decisive framing choice: Letta owns the agent lifecycle; other tools (Mem0, Supermemory, Dakera) are components you compose
  • The Apache-2.0 license makes it commercially friendly, and the UC Berkeley provenance gives it research credibility
  • MemGPT origins mean the memory architecture is grounded in published research, not just engineering intuition
  • 22.7k GitHub stars suggest significant developer adoption, even though Letta is a more opinionated framework than drop-in memory APIs