Supermemory — Memory Layer for AI Agents
URL: https://supermemory.ai/
Type: Managed cloud platform + API / developer SDK
Positioning: “Long-term and short-term memory and context infrastructure for AI agents”
Core Argument
AI agents lack persistent context across sessions and users. Supermemory provides a hosted memory API that builds a semantic graph on top of any entity (user, document, project, org) — enabling agents to understand and recall user intent, preferences, and history without the developer building storage infrastructure from scratch.
Architecture
Storage engine: Custom vector graph engine with ontology-aware edges — goes beyond standard cosine similarity by encoding semantic relationships between entities as typed graph edges.
Retrieval: Hybrid search combining vector embeddings + keyword indexing; sub-300ms response time for real-time agent use.
Shared context pool: All three memory types (below) pull from the same pool when scoped to the same containerTag (user/entity ID).
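The architecture above can be sketched in miniature: nodes scoped by a containerTag, typed edges between them, and hybrid retrieval (vector similarity plus keyword match) followed by one hop of edge expansion. All names here are hypothetical illustrations of the described design, not the actual Supermemory engine.

```typescript
// Minimal in-memory model of an ontology-aware memory graph (hypothetical;
// not Supermemory's real implementation). Edges carry a semantic type that
// encodes *why* two memories relate, not just that they are similar.
type EdgeType = "prefers" | "works_on" | "mentions";

interface MemoryNode {
  id: string;
  text: string;
  embedding: number[];
  containerTag: string; // scopes the node to a user/document/org
}

interface TypedEdge {
  from: string;
  to: string;
  type: EdgeType;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

class MemoryGraph {
  nodes = new Map<string, MemoryNode>();
  edges: TypedEdge[] = [];

  add(node: MemoryNode): void {
    this.nodes.set(node.id, node);
  }

  link(from: string, to: string, type: EdgeType): void {
    this.edges.push({ from, to, type });
  }

  // Hybrid retrieval: vector similarity + keyword bonus, scoped to one
  // containerTag, then a single hop of typed-edge expansion so related
  // but embedding-dissimilar memories surface too.
  search(containerTag: string, queryVec: number[], keyword: string): MemoryNode[] {
    const scoped = [...this.nodes.values()].filter(n => n.containerTag === containerTag);
    const scored = scoped
      .map(n => ({
        n,
        score:
          cosine(n.embedding, queryVec) +
          (n.text.toLowerCase().includes(keyword.toLowerCase()) ? 0.5 : 0),
      }))
      .sort((a, b) => b.score - a.score);
    const top = scored.slice(0, 1).map(s => s.n);
    const expanded = this.edges
      .filter(e => top.some(t => t.id === e.from))
      .map(e => this.nodes.get(e.to)!)
      .filter(n => n && n.containerTag === containerTag);
    return [...top, ...expanded];
  }
}
```

The edge-expansion step is the point of contrast with a plain vector store: a "theme preference" memory can be returned for a "dark mode" query because a typed edge links them, even if their embeddings are far apart.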
Three Memory Types
| Type | Description |
|---|---|
| Memory API | Extracts and evolves user-specific facts in real-time; handles knowledge updates, temporal drift, and forgetting |
| User Profiles | Combines static (always-known) and dynamic (episodic, recent conversation) data into a user model |
| RAG System | Document-grounded retrieval with advanced metadata filtering and contextual chunking |
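The Memory API row describes behavior worth making concrete: facts evolve rather than accumulate, newer observations supersede stale ones, and unrefreshed facts are eventually forgotten. The sketch below is a hypothetical model of that behavior, not Supermemory's actual fact engine.

```typescript
// Hypothetical model of evolving user facts: an upsert keyed on
// (subject, attribute) handles temporal drift, and a TTL sweep models
// forgetting. Illustrative only — not the real Memory API internals.
interface Fact {
  subject: string;
  attribute: string;
  value: string;
  observedAt: number; // unix ms
}

class FactStore {
  private facts: Fact[] = [];

  // A newer observation about the same (subject, attribute) replaces the
  // stale value instead of adding a contradiction.
  observe(fact: Fact): void {
    const i = this.facts.findIndex(
      f => f.subject === fact.subject && f.attribute === fact.attribute
    );
    if (i === -1) this.facts.push(fact);
    else if (fact.observedAt >= this.facts[i].observedAt) this.facts[i] = fact;
  }

  // "Forgetting": drop facts not re-observed since the cutoff.
  forgetOlderThan(cutoff: number): void {
    this.facts = this.facts.filter(f => f.observedAt >= cutoff);
  }

  get(subject: string, attribute: string): string | undefined {
    return this.facts.find(
      f => f.subject === subject && f.attribute === attribute
    )?.value;
  }
}
```

A User Profile in this model is then just the static facts plus the most recent episodic ones for a given subject, all read from the same store.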
Data Ingestion
Multi-format processing: text, conversations, PDFs, images, documents, videos. Pre-built connectors for:
- Notion, Slack, Gmail, Google Drive, S3
Integration Surface
- TypeScript + Python SDKs
- REST API (OpenAPI spec available)
- MCP server integration
- Developer console for API key management
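For the REST surface, a request might be assembled roughly as below. The endpoint path, header names, and body shape are assumptions for illustration only — the published OpenAPI spec is the authority on the actual contract.

```typescript
// Hedged sketch of building a scoped search request against the REST API.
// The /v3/search path and the { q, containerTags } body are assumed, not
// documented here — verify against the OpenAPI spec before use.
interface SearchRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildSearchRequest(
  apiKey: string,
  containerTag: string,
  query: string
): SearchRequest {
  return {
    url: "https://api.supermemory.ai/v3/search", // assumed endpoint
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ q: query, containerTags: [containerTag] }),
  };
}
```

The resulting object can be passed straight to `fetch(req.url, req)`; swapping the containerTag is all it takes to retarget the same call at a different user, document, or org scope.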
Use Cases
- Enterprise API backends needing persistent user context
- Developer plugins requiring memory across sessions
- Personal productivity apps (persistent digital memory)
- Any agent that must understand a specific user over time
Key Takeaways
- Ontology-aware graph edges are the core differentiator vs. plain vector stores — they encode why two pieces of memory are related, not just that they are similar
- Three-mode memory API (facts / profiles / RAG) covers the main retrieval patterns in one platform
- Cloud-only; no self-hosted option documented — same privacy trade-off as mem0-memory-layer, contrasted with local-first link-local-llm-memory
- containerTag scoping model is a clean abstraction: one API handles per-user, per-document, or per-org memory with the same calls
- Sub-300ms retrieval makes it viable for synchronous, low-latency agent pipelines