How to use the memory store
AgentFlow's memory store provides long-term vector-based memory that persists across threads and sessions. There are two store implementations: QdrantStore (Qdrant vector database) and Mem0Store (Mem0 managed memory). Both implement BaseStore.
A memory store is separate from the checkpointer. The checkpointer preserves conversation history per thread; the store holds facts, user preferences, and knowledge that should be accessible across all threads.
QdrantStore
Install
pip install "10xscale-agentflow[qdrant]"
You also need an embedding service. Both OpenAI and Google embeddings are built in.
Local Qdrant (file-backed)
from agentflow.storage.store import (
QdrantStore,
OpenAIEmbedding,
create_local_qdrant_store,
)
store = create_local_qdrant_store(
path="./qdrant_data",
embedding=OpenAIEmbedding(), # uses OPENAI_API_KEY env var
collection="my_agent_memory",
)
Remote Qdrant server
from agentflow.storage.store import create_remote_qdrant_store, OpenAIEmbedding
store = create_remote_qdrant_store(
host="localhost",
port=6333,
embedding=OpenAIEmbedding(),
collection="my_agent_memory",
)
Qdrant Cloud
from agentflow.storage.store import create_cloud_qdrant_store, GoogleEmbedding
store = create_cloud_qdrant_store(
url="https://xyz.qdrant.io",
api_key="your-qdrant-api-key",
embedding=GoogleEmbedding(), # uses GOOGLE_API_KEY env var
collection="my_agent_memory",
)
Factory reference
All three factories call QdrantStore(...) internally. Use them for clarity.
| Factory | When to use |
|---|---|
create_local_qdrant_store(path, embedding, ...) | Local development, single machine |
create_remote_qdrant_store(host, port, embedding, ...) | Self-hosted Qdrant server |
create_cloud_qdrant_store(url, api_key, embedding, ...) | Qdrant Cloud |
Direct QdrantStore construction
from agentflow.storage.store import QdrantStore, OpenAIEmbedding, DistanceMetric
store = QdrantStore(
embedding=OpenAIEmbedding(),
path="./local_qdrant", # local: provide path
# host="localhost", # remote: provide host + port
# port=6333,
# url="https://...", # cloud: provide url + api_key
# api_key="...",
collection="custom_collection",
distance_metric=DistanceMetric.COSINE, # COSINE | EUCLIDEAN | DOT | MANHATTAN
)
Mem0Store
pip install "10xscale-agentflow[mem0]"
from agentflow.storage.store import Mem0Store, create_mem0_store, create_mem0_store_with_qdrant
# Default Mem0 cloud backend
store = create_mem0_store(api_key="your-mem0-api-key")
# Mem0 with your own Qdrant backend
store = create_mem0_store_with_qdrant(
api_key="your-mem0-api-key",
qdrant_url="https://xyz.qdrant.io",
qdrant_api_key="your-qdrant-api-key",
collection="mem0_collection",
)
Use a store with the graph
Pass the store to compile(). From within node functions it is accessible via the BaseStore dependency injection binding.
app = graph.compile(
checkpointer=checkpointer,
store=store,
)
Embedding options
| Class | Provider | Env var required |
|---|---|---|
OpenAIEmbedding | OpenAI text-embedding-3-small | OPENAI_API_KEY |
GoogleEmbedding | Google text-embedding-004 | GOOGLE_API_KEY |
from agentflow.storage.store import OpenAIEmbedding, GoogleEmbedding
openai_embed = OpenAIEmbedding()
google_embed = GoogleEmbedding()
Agent-level memory with MemoryConfig
MemoryConfig wires memory directly into an Agent node. The agent automatically retrieves relevant memories before each LLM call and can write new memories using injected memory tools.
Minimal MemoryConfig
from agentflow.storage.store import MemoryConfig
from agentflow.storage.store import create_local_qdrant_store, OpenAIEmbedding
store = create_local_qdrant_store("./qdrant_data", OpenAIEmbedding())
agent = Agent(
model="gpt-4o",
memory=MemoryConfig(store=store),
)
Full MemoryConfig reference
from agentflow.storage.store import MemoryConfig, UserMemoryConfig, AgentMemoryConfig, ReadMode
memory = MemoryConfig(
store=store, # required: BaseStore instance
retrieval_mode=ReadMode.POSTLOAD, # POSTLOAD (default) | PRELOAD
limit=5, # max memories to retrieve
score_threshold=0.0, # minimum similarity score (0.0–1.0)
max_tokens=None, # cap total tokens across all memories
inject_system_prompt=True, # prepend memories to system prompt
user_memory=UserMemoryConfig(
enabled=True,
memory_type="episodic",
category="conversations",
user_id=None, # override user_id; falls back to config["user_id"]
limit=5,
score_threshold=0.6,
),
agent_memory=AgentMemoryConfig(
enabled=False, # disabled by default
memory_type="semantic",
category="knowledge",
agent_id="my-agent",
app_id="my-app",
),
)
agent = Agent(model="gpt-4o", memory=memory)
ReadMode
| Mode | Behaviour |
|---|---|
ReadMode.POSTLOAD (default) | Memories are fetched on-demand via injected tools that the LLM can call. The agent decides when to retrieve and write. |
ReadMode.PRELOAD | Memories are retrieved before the LLM call and injected into the system prompt automatically. |
MemoryType and DistanceMetric
These enums are importable from agentflow.storage.store:
from agentflow.storage.store import MemoryType, DistanceMetric
# MemoryType values
MemoryType.EPISODIC # conversation events, session notes
MemoryType.SEMANTIC # facts, user preferences
MemoryType.PROCEDURAL # how-to workflows, step sequences
MemoryType.ENTITY # information about people, places
MemoryType.RELATIONSHIP # how entities relate
MemoryType.DECLARATIVE # explicit stated facts
MemoryType.CUSTOM # domain-specific
# DistanceMetric values
DistanceMetric.COSINE # default; best for text embeddings
DistanceMetric.EUCLIDEAN # absolute vector distances
DistanceMetric.DOT # normalised vectors, high-dimensional spaces
DistanceMetric.MANHATTAN # L1 distance
Complete example: memory-aware agent
import asyncio
from agentflow.core.graph import StateGraph, Agent
from agentflow.core.state import AgentState, Message
from agentflow.storage.checkpointer import InMemoryCheckpointer
from agentflow.storage.store import (
MemoryConfig,
UserMemoryConfig,
create_local_qdrant_store,
OpenAIEmbedding,
)
from agentflow.utils import END
# Set up store
store = create_local_qdrant_store("./qdrant_data", OpenAIEmbedding())
# Configure agent-level memory
memory = MemoryConfig(
store=store,
user_memory=UserMemoryConfig(
enabled=True,
memory_type="semantic",
category="user_prefs",
limit=5,
score_threshold=0.6,
),
)
agent = Agent(
model="gpt-4o",
system_prompt=[{"role": "system", "content": "You are a personalized assistant."}],
memory=memory,
)
graph = StateGraph()
graph.add_node("MAIN", agent)
graph.set_entry_point("MAIN")
graph.add_edge("MAIN", END)
app = graph.compile(checkpointer=InMemoryCheckpointer(), store=store)
async def main():
# First session: share a preference
await app.ainvoke(
{"messages": [Message.text_message("I prefer concise answers, no more than 3 sentences.")]},
config={"thread_id": "user-1-session-a", "user_id": "user-1"},
)
# New session, new thread_id: memory persists across threads
result = await app.ainvoke(
{"messages": [Message.text_message("How long should your replies be?")]},
config={"thread_id": "user-1-session-b", "user_id": "user-1"},
)
print(result["messages"][-1].content)
asyncio.run(main())
What you learned
QdrantStoreandMem0Storeboth implementBaseStore.- Use
create_local_qdrant_store,create_remote_qdrant_store, orcreate_cloud_qdrant_storeto create aQdrantStore. - Pass
store=storetograph.compile()to make the store available to the graph. MemoryConfigonAgent(..., memory=...)enables automatic retrieval and writing of per-user memories.ReadMode.POSTLOAD(default) lets the LLM decide when to use memory tools;ReadMode.PRELOADinjects memories before every LLM call.
Next steps
- Configure Agent for the full
Agentparameter reference. - Use prebuilt agents for ready-made agents that support
memoryout of the box.