Skip to main content

Unit Testing

Unit tests for AgentFlow agents verify that the graph logic, routing, and tool selection work correctly — without making any real LLM API calls. This keeps tests fast, deterministic, and free of external dependencies.

The agentflow.qa.testing module provides three building blocks:

ClassPurpose
TestAgentDrops into any graph node; returns predefined responses
QuickTestOne-liner factory methods for common test scenarios
MockToolRegistryRegisters mock tools and tracks every invocation
TestResultFluent assertions on the graph output

The agentflow test CLI wraps pytest so you can run these tests with a single command.


Installation

pip install pytest pytest-asyncio

agentflow.qa is included with 10xscale-agentflow — no extra install needed.


TestAgent

TestAgent is a drop-in replacement for Agent. It accepts the same constructor arguments but never calls an LLM. Instead it cycles through a list of predefined responses.

Basic usage

from agentflow.qa.testing import TestAgent
from agentflow.core.graph import StateGraph
from agentflow.utils.constants import END

test_agent = TestAgent(
model="test-model",
responses=["The weather in London is sunny."],
)

graph = StateGraph()
graph.add_node("MAIN", test_agent)
graph.set_entry_point("MAIN")
graph.add_edge("MAIN", END)
app = graph.compile()

Override a node in an existing graph

from agentflow.qa.testing import TestAgent

test_agent = TestAgent(responses=["Mocked response"])
graph.override_node("MAIN", test_agent)
app = graph.compile()

This is the pattern to use when you have a production graph in a separate module and want to swap out only the agent node.

Multiple responses

When an agent is called more than once (for example in a ReAct loop), TestAgent cycles through the responses list:

agent = TestAgent(responses=["Calling tool...", "Final answer here"])
# First invoke → "Calling tool..."
# Second invoke → "Final answer here"
# Third invoke → "Calling tool..." (wraps around)

Simulating tool calls

Pass simulate_tool_calls=True together with a tools list. The first call returns a tool-call message; subsequent calls return the predefined responses.

agent = TestAgent(
responses=["It is 22°C in London."],
tools=["get_weather"],
simulate_tool_calls=True,
)

Assertion helpers

# Assert the agent was called at least once
agent.assert_called()

# Assert exact call count
agent.assert_called_times(2)

# Assert never called
agent.assert_not_called()

# Inspect the last prompt sent to the agent
messages = agent.get_last_messages()

# Reset between test cases
agent.reset()

QuickTest

QuickTest removes the boilerplate of building a graph, compiling it, and invoking it. Every method returns a TestResult.

All QuickTest methods are async — use pytest-asyncio or asyncio.run().

Single-turn test

import pytest
from agentflow.qa.testing import QuickTest


@pytest.mark.asyncio
async def test_greeting():
result = await QuickTest.single_turn(
agent_response="Hello! How can I help?",
user_message="Hi",
)
result.assert_contains("Hello")

Multi-turn conversation

@pytest.mark.asyncio
async def test_conversation():
result = await QuickTest.multi_turn(
conversation=[
("Hello", "Hi there!"),
("What can you do?", "I can answer questions."),
]
)
result.assert_contains("answer questions")

Test with tool calls

@pytest.mark.asyncio
async def test_weather_tool():
result = await QuickTest.with_tools(
query="What is the weather in London?",
response="It is 22°C in London.",
tools=["get_weather"],
tool_responses={"get_weather": "22°C"},
)
result.assert_tool_called("get_weather")
result.assert_contains("22°C")

Custom graph

@pytest.mark.asyncio
async def test_custom_graph():
from agentflow.qa.testing import TestAgent

agent = TestAgent(responses=["Done"])
result = await QuickTest.custom(
agent=agent,
user_message="Run the task",
)
result.assert_contains("Done")
agent.assert_called_times(1)

MockToolRegistry

Use MockToolRegistry when you want to verify tool calls with full control over the mock implementations.

from agentflow.qa.testing import MockToolRegistry
from agentflow.core.graph import ToolNode

tools = MockToolRegistry()

tools.register("get_weather", lambda city: f"22°C in {city}")
tools.register("send_email", lambda to, body: "Sent")

tool_node = ToolNode(tools.get_tool_list())

After running the graph:

# Boolean check
assert tools.was_called("get_weather")

# Call count
assert tools.call_count("send_email") == 1

# Full call history
calls = tools.get_calls("get_weather")
assert calls[0]["kwargs"]["city"] == "London"

# Last call only
last = tools.get_last_call("get_weather")

# Fluent assertions
tools.assert_called("get_weather")
tools.assert_called_with("get_weather", city="London")
tools.assert_call_count("send_email", 1)

Async tools

tools.register_async("search_web", async_search_func)

Reset between tests

# Clear call history, keep registered functions
tools.reset()

# Full reset: clear functions and history
tools.clear()

TestResult

Every QuickTest method returns a TestResult. Its methods all return self for chaining.

result = await QuickTest.single_turn(
agent_response="The capital of France is Paris.",
user_message="What is the capital of France?",
)

(
result
.assert_contains("Paris")
.assert_not_contains("London")
.assert_no_errors()
)
MethodWhat it checks
assert_contains(text)Final response contains text
assert_not_contains(text)Final response does not contain text
assert_equals(expected)Final response equals expected exactly
assert_tool_called(name, **kwargs)Tool name was called; optionally with specific kwargs
assert_tool_not_called(name)Tool name was not called
assert_message_count(n)Total messages in the conversation equals n
assert_no_errors()No error messages in the conversation

The final_response, messages, tool_calls, and state attributes are also available for custom assertions.


agentflow test CLI

agentflow test is a thin pytest wrapper. It reads optional defaults from agentflow.json and forwards any extra arguments straight to pytest.

# Run all tests (pytest auto-discovery)
agentflow test

# Target a specific path
agentflow test tests/unit

# Run with coverage
agentflow test --coverage

# Open the HTML coverage report automatically
agentflow test --coverage --html

# Filter by keyword
agentflow test -k "weather"

# Pass raw pytest flags
agentflow test -- -m "not integration" --tb=short

agentflow.json configuration

{
"agent": "graph.react:app",
"test": {
"path": "tests",
"coverage": true,
"coverage_threshold": 80
}
}
FieldDescription
pathDefault path when no PATH argument is given
coverageEnable coverage on every run
coverage_thresholdMinimum coverage %; run fails if coverage drops below this

A bare agentflow test with the above config is equivalent to:

agentflow test tests --coverage -- --cov-fail-under=80

CI example

# .github/workflows/ci.yml
- name: Run tests
run: agentflow test --coverage

Set coverage_threshold in agentflow.json — no extra flags needed in the workflow.


Complete pytest example

# tests/unit/test_weather_agent.py
import pytest
from agentflow.qa.testing import MockToolRegistry, QuickTest, TestAgent
from agentflow.core.graph import StateGraph, ToolNode
from agentflow.utils.constants import END


@pytest.mark.asyncio
async def test_weather_query_routes_to_tool():
tools = MockToolRegistry()
tools.register("get_weather", lambda city: "22°C")

agent = TestAgent(
responses=["The weather is 22°C."],
tools=tools.get_tool_list(),
simulate_tool_calls=True,
)

graph = StateGraph()
graph.add_node("MAIN", agent)
graph.add_node("TOOL", ToolNode(tools.get_tool_list()))
graph.set_entry_point("MAIN")

def route(state):
last = state.context[-1] if state.context else None
if last and getattr(last, "tools_calls", None):
return "TOOL"
return END

graph.add_conditional_edges("MAIN", route, {"TOOL": "TOOL", END: END})
graph.add_edge("TOOL", "MAIN")

from agentflow.core.state import Message
app = graph.compile()
result = await app.ainvoke({"messages": [Message.text_message("Weather in London?")]})

tools.assert_called("get_weather")
agent.assert_called_times(2) # once for tool call, once for final response


@pytest.mark.asyncio
async def test_single_turn_quick():
result = await QuickTest.single_turn(
agent_response="Paris is the capital of France.",
user_message="Capital of France?",
)
result.assert_contains("Paris")

Further reading