Context engineering platform providing long-term memory, temporal knowledge graphs, Graph RAG, and automated context assembly for AI agents.
Key features
Persistent Long-Term Memory: Stores full chat histories and conversation artifacts so assistants can recall them across long time spans, improving continuity in conversational experiences.
Temporal Knowledge Graph (Graphiti): Builds a temporal knowledge graph with valid_at and invalid_at timestamps to track changing user state, preferences, and relationships over time for accurate contextual reasoning.
Asynchronous Summaries & Artifacts: Automatically generates summaries, classifications, and structured artifacts from messages asynchronously to avoid adding latency to the user chat experience.
Embeddings & Vector Search: Embeds messages and summaries to enable fast semantic search and retrieval of relevant past conversation snippets and business data.
Document Collections: Provides a simple document-collection abstraction for vector search that complements the memory features; it is not intended as a general-purpose vector database.
SDKs & Integrations: Official SDKs for Python, TypeScript/JavaScript, and Go with integrations for LangChain and LlamaIndex to simplify adoption in existing agent stacks.
Managed Cloud Service (Zep Cloud): Offers a managed deployment with low latency, high availability, and additional capabilities like dialog classification and structured data extraction.
Graph RAG & Automated Context Assembly: Combines graph-aware retrieval-augmented generation with automated assembly of context from chat history and business data to reduce hallucinations and improve relevance.
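To make the valid_at / invalid_at idea concrete, here is a minimal sketch in plain Python. It does not use the Zep or Graphiti APIs: the Fact class, the facts_valid_at helper, and the sample data are hypothetical, and treating the validity window as half-open (valid from valid_at up to, but not including, invalid_at) is an assumption made for illustration. Only the two timestamp names come from the feature description above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Fact:
    """Hypothetical graph edge: subject, predicate, object, plus a validity window."""
    subject: str
    predicate: str
    obj: str
    valid_at: datetime                     # when the fact became true
    invalid_at: Optional[datetime] = None  # None means the fact still holds


def facts_valid_at(facts: List[Fact], when: datetime) -> List[Fact]:
    """Return the facts whose validity window contains `when` (half-open window, assumed)."""
    return [
        f for f in facts
        if f.valid_at <= when and (f.invalid_at is None or when < f.invalid_at)
    ]


# Made-up example: a user's contact preference changes on 2024-06-01.
facts = [
    Fact("alice", "prefers", "email", datetime(2024, 1, 1), datetime(2024, 6, 1)),
    Fact("alice", "prefers", "sms", datetime(2024, 6, 1)),
]

march = facts_valid_at(facts, datetime(2024, 3, 15))
august = facts_valid_at(facts, datetime(2024, 8, 1))
```

Because the superseded fact is kept with an invalid_at timestamp rather than deleted, a query "as of March" and a query "as of August" can return different answers from the same store.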
Persistent chat history storage and retrieval for AI assistants
Automated generation of summaries and other conversation artifacts
Message and summary embeddings to enable semantic search
Document Collections abstraction for vector/document search
Temporal knowledge graph (Graphiti) with valid_at/invalid_at to track state changes
Automated context assembly for prompt construction (agent memory)
Managed cloud offering (Zep Cloud) with low latency, high availability (HA), scalability, dialog classification, and structured data extraction
Official SDKs: Python (zep-cloud / zep-python), TypeScript/JavaScript (@getzep/zep-cloud / zep-js), Go (zep-go)
Asynchronous processing pipeline to avoid blocking user chat experience
Client libraries with features like automatic retries and exponential backoff
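As a rough illustration of what "automatic retries and exponential backoff" usually means in a client library, the sketch below retries a failing call and doubles the wait after each failure. It is not taken from the Zep SDKs: retry_with_backoff, its parameters, and the choice of ConnectionError as the retryable error are all assumptions made for this example.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on ConnectionError with delays of base_delay * 2**attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")


# Hypothetical flaky request that fails twice, then succeeds.
attempts = {"count": 0}
delays: List[float] = []


def flaky_request() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


# Record the delays instead of actually sleeping, to keep the example fast.
result = retry_with_backoff(flaky_request, sleep=delays.append)
```

Production clients typically also add random jitter and a maximum delay so that many clients do not retry in lockstep; both are left out here for brevity.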
Best for
Personalized Conversational Assistants: Maintain long-term user memory so assistants remember user preferences, prior conversations, and context across sessions to deliver personalized responses.
Customer Support with Historical Context: Give support agents or bots immediate access to past conversation threads, summaries, and structured artifacts so they can resolve recurring or complex issues faster.
Reducing Hallucinations in LLMs: Use embeddings, graph-aware retrieval, and structured context assembly to ground model responses in verifiable past interactions and business data.
Temporal User Profiling: Track changing user attributes and preferences over time using the temporal knowledge graph to drive targeted recommendations and dynamic personalization.
Agent State Tracking and Change History: Record state transitions with valid/invalid timestamps so agents can reason about when facts were true and how user situations evolved.
Augmenting RAG Workflows: Improve retrieval-augmented generation by assembling relevant chat-derived context and document collections to include only what matters in prompts.
Scaling Memory for Production: Persist conversation data to databases and use Zep Cloud for low-latency, scalable memory services in production AI applications.
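The idea of including "only what matters" in a prompt can be sketched as a selection under a size budget. The example below is a simplification and not how Zep assembles context: assemble_context, the relevance scores, the character budget, and the sample snippets are all hypothetical.

```python
from typing import List, Tuple


def assemble_context(snippets: List[Tuple[float, str]], max_chars: int) -> str:
    """Pick the highest-scoring snippets that fit within max_chars, joined by newlines."""
    chosen: List[str] = []
    used = 0
    for _score, text in sorted(snippets, key=lambda s: s[0], reverse=True):
        cost = len(text) + (1 if chosen else 0)  # +1 for the joining newline
        if used + cost > max_chars:
            continue  # skip what does not fit; a shorter snippet may still fit
        chosen.append(text)
        used += cost
    return "\n".join(chosen)


# Made-up (relevance score, text) pairs standing in for retrieved memory.
snippets = [
    (0.9, "User prefers SMS over email."),
    (0.4, "User asked about pricing in March."),
    (0.7, "Open ticket: refund for order 1042."),
]

context = assemble_context(snippets, max_chars=70)
```

A real system would more likely budget in tokens than in characters and would score snippets with retrieval rather than fixed numbers; the greedy loop only shows the shape of the trade-off.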
Personalized conversational agents that recall historical user interactions
Reducing hallucinations by providing relevant past context to LLM prompts
RAG workflows combining chat memory and document vectors
Customer support assistants that persist and search prior tickets/conversations
Stateful agents that need to reason about temporal changes in user data or preferences
Analytics and insights from long-term conversation archives
Embedding-based semantic search over conversation content and summaries
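Embedding-based semantic search comes down to ranking stored vectors by their similarity to a query vector. The sketch below uses cosine similarity over hand-written three-dimensional vectors; the vectors, the search helper, and the sample texts are hypothetical, and a real deployment would obtain vectors from an embedding model and use an indexed vector store rather than a linear scan.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    query_vec: List[float],
    index: Dict[str, List[float]],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the top_k (text, score) pairs, most similar first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Toy "embeddings", written by hand for illustration only.
index = {
    "refund request": [1.0, 0.0, 0.0],
    "shipping delay": [0.0, 1.0, 0.0],
    "billing question": [0.8, 0.6, 0.0],
}

results = search([1.0, 0.1, 0.0], index, top_k=2)
```

With these toy vectors the query points mostly along the first axis, so "refund request" ranks first and "billing question" second.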