Letta — Stateful Agent Framework with Persistent Memory
- URL: https://letta.com/ (redirects from letta.ai)
- GitHub: https://github.com/letta-ai/letta (22.7k stars, Apache-2.0)
- Formerly: MemGPT, research from UC Berkeley Sky Computing Lab
- Stack: Python (99.5%), Docker, Alembic migrations; CLI via Node.js 18+
Core Argument
Standard LLM APIs are stateless: each session starts from zero unless the caller resends the history. Letta’s position is that memory shouldn’t be bolted onto agents as an afterthought; the agent itself should be the persistent unit. A Letta agent has a stable identity (UUID) and its own memory, and it can keep refining that memory over time, even while idle.
Critical Distinction from Memory APIs
Mem0, Supermemory, and Dakera are memory APIs you plug into your own agents. Letta is a full agent framework where statefulness and memory are first-class primitives: you build agents with Letta rather than adding memory to agents built elsewhere.
Memory Architecture
Agent memory is structured into three tiers:
| Tier | Description |
|---|---|
| Core memory | Always in-context: agent persona and user model |
| Archival memory | Long-term, searchable storage for facts beyond context window |
| Recall memory | Conversation history; queryable via search |
Agents manage their own memory actively — they can read, write, and search their memory blocks as tool calls during inference, rather than relying on external retrieval pipelines.
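The tiers and the tool-call access pattern described above can be sketched in a few lines. This is a toy model under stated assumptions, not Letta's actual data model or API: the `AgentMemory` class, the tool names (`core_replace`, `archival_insert`, `archival_search`, `recall_search`), and the `dispatch` helper are all hypothetical, and search is a naive substring match where a real system would likely use embeddings.

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Toy model of the three tiers (hypothetical, not Letta's schema)."""
    core: dict[str, str] = field(default_factory=dict)   # always in the prompt
    archival: list[str] = field(default_factory=list)    # searched on demand
    recall: list[str] = field(default_factory=list)      # conversation history

    def render_context(self) -> str:
        # Only core memory is compiled into every prompt.
        return "\n".join(f"[{label}] {value}" for label, value in self.core.items())

    # The methods below stand in for tools the model could call mid-inference.
    def core_replace(self, label: str, value: str) -> str:
        self.core[label] = value
        return "ok"

    def archival_insert(self, text: str) -> str:
        self.archival.append(text)
        return "ok"

    def archival_search(self, query: str) -> list[str]:
        # Naive case-insensitive substring match (assumption for illustration).
        q = query.lower()
        return [t for t in self.archival if q in t.lower()]

    def recall_search(self, query: str) -> list[str]:
        q = query.lower()
        return [m for m in self.recall if q in m.lower()]


def dispatch(memory: AgentMemory, tool_call: dict):
    """Route a model-emitted tool call (name + arguments) to a memory method."""
    tools = {
        "core_replace": memory.core_replace,
        "archival_insert": memory.archival_insert,
        "archival_search": memory.archival_search,
        "recall_search": memory.recall_search,
    }
    return tools[tool_call["name"]](**tool_call["arguments"])


memory = AgentMemory(core={"persona": "Helpful assistant", "human": "Name unknown"})
memory.recall.append("user: Hi, I'm Ada and I work on compilers.")

# Tool calls the model might emit after reading the message above.
dispatch(memory, {"name": "core_replace",
                  "arguments": {"label": "human", "value": "Name: Ada"}})
dispatch(memory, {"name": "archival_insert",
                  "arguments": {"text": "Ada works on compilers."}})
hits = dispatch(memory, {"name": "archival_search",
                         "arguments": {"query": "compilers"}})
```

The design point is that the agent, not an external retrieval pipeline, decides when to write to core memory and when to push a fact out to archival storage.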
Sleep-Time Compute (“Dream Agents”)
The key research contribution: background agents run during idle periods, refining prompts, consolidating memories, and improving skills without blocking live inference. The idea extends the MemGPT line of work and is developed in Letta’s follow-up research on sleep-time compute: agents can use off-peak compute to self-improve, analogous to how humans consolidate memories during sleep.
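A minimal sketch of the idea, assuming a background pass that folds older messages into a running summary once the agent has been idle for a while. The names `consolidate`, `sleep_time_step`, `naive_summarize`, and the `idle_after` threshold are hypothetical and do not come from Letta; `naive_summarize` stands in for what would be an LLM call in a real system.

```python
from typing import Callable


def consolidate(history: list[str], summary: str,
                summarize: Callable[[str, list[str]], str],
                keep_recent: int = 2) -> tuple[list[str], str]:
    """Fold older messages into the summary; keep the newest ones verbatim."""
    if len(history) <= keep_recent:
        return history, summary
    cut = len(history) - keep_recent
    old, recent = history[:cut], history[cut:]
    return recent, summarize(summary, old)


def naive_summarize(summary: str, messages: list[str]) -> str:
    # Placeholder for an LLM summarization call (assumption for illustration).
    return (summary + " " + " | ".join(messages)).strip()


def sleep_time_step(state: dict, now: float, idle_after: float = 30.0) -> bool:
    """Run one background pass only if the agent has been idle long enough."""
    if now - state["last_active"] < idle_after:
        return False  # agent is busy; do not compete with live inference
    state["history"], state["summary"] = consolidate(
        state["history"], state["summary"], naive_summarize
    )
    return True


state = {"last_active": 0.0, "summary": "", "history": ["a", "b", "c", "d"]}
ran_early = sleep_time_step(state, now=10.0)   # still within the idle threshold
ran_late = sleep_time_step(state, now=60.0)    # idle long enough, pass runs
```

Because the pass only runs when the idle check succeeds, the consolidation cost is paid off the critical path of user-facing requests.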
Agent Identity & Portability
- Each agent gets a stable UUID and persists as a server-side object
- Memory, state, and learned behavior travel with the agent
- Agents can be moved between devices and AI model providers without losing context
- “Memory Palace” UI visualizes what the agent currently knows
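The portability claims above amount to treating the agent as serializable state keyed by its UUID, with the model provider as a swappable field. The sketch below illustrates that idea only; `AgentSnapshot`, `export_agent`, `import_agent`, and the provider strings are hypothetical and are not Letta's export format or API.

```python
import json
import uuid
from dataclasses import asdict, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class AgentSnapshot:
    """Hypothetical portable agent state (not Letta's actual schema)."""
    agent_id: str       # stable identity that survives every move
    model: str          # provider/model handle, swappable on import
    core_memory: dict
    messages: list


def export_agent(agent: AgentSnapshot) -> str:
    # Serialize the full state so it can be moved to another machine.
    return json.dumps(asdict(agent), sort_keys=True)


def import_agent(payload: str, model: Optional[str] = None) -> AgentSnapshot:
    # Restore the agent, optionally pointing it at a different model provider.
    agent = AgentSnapshot(**json.loads(payload))
    return replace(agent, model=model) if model else agent


original = AgentSnapshot(
    agent_id=str(uuid.uuid4()),
    model="provider-a/model-x",          # placeholder handle
    core_memory={"human": "Name: Ada"},
    messages=["user: hi"],
)
moved = import_agent(export_agent(original), model="provider-b/model-y")
```

After the round trip, `moved` keeps the same `agent_id`, memory, and message history as `original`; only the `model` field differs.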
Deployment Modes
| Mode | Details |
|---|---|
| Letta Code CLI | npm install @letta-ai/letta-code; local agent runner |
| Desktop app | macOS, Windows, Linux |
| Self-hosted server | Docker + Python; full REST API |
| Letta Cloud | Managed hosting; bring-your-own API keys or use Letta’s |
| SDKs | Python + TypeScript |
Integrations
- MCP protocol support
- Built-in tools: web search, webpage fetch
- Works with any LLM provider via API keys
Key Takeaways
- Sleep-time compute is the architectural idea that separates Letta from all other entries in this wiki — memory improvement happens asynchronously, not just at retrieval time
- Full agent framework vs. memory API is the decisive framing choice: Letta owns the agent lifecycle; other tools (Mem0, Supermemory, Dakera) are components you compose
- The Apache-2.0 license and UC Berkeley provenance give it strong research credibility and commercial friendliness
- MemGPT origins mean the memory architecture is grounded in published research, not just engineering intuition
- 22.7k stars suggest significant developer adoption despite Letta being a more opinionated framework than drop-in memory APIs