How to configure Agent
Agent is the LLM node in a StateGraph. This guide covers every constructor parameter with working examples.
Minimal example
from agentflow.core.graph import Agent
agent = Agent(model="gpt-4o")
AgentFlow auto-detects the provider from the model name. For OpenAI models it uses the openai SDK; for Gemini models it uses google-generativeai.
Model and provider
Auto-detect (recommended)
agent = Agent(model="gpt-4o") # → openai
agent = Agent(model="gemini-2.5-flash") # → google
agent = Agent(model="claude-3-5-sonnet-20241022") # → openai-compatible
Explicit provider with / prefix
agent = Agent(model="openai/gpt-4o")
agent = Agent(model="google/gemini-2.5-flash")
Explicit provider kwarg
agent = Agent(model="gpt-4o", provider="openai")
agent = Agent(model="gemini-2.5-flash", provider="google")
Third-party OpenAI-compatible APIs
# Ollama (local)
agent = Agent(
model="llama3.2",
provider="openai",
base_url="http://localhost:11434/v1",
)
# DeepSeek
agent = Agent(
model="deepseek-chat",
provider="openai",
base_url="https://api.deepseek.com/v1",
)
# OpenRouter
agent = Agent(
model="anthropic/claude-3-5-sonnet",
provider="openai",
base_url="https://openrouter.ai/api/v1",
)
System prompt
Pass a list of message dicts. The most common pattern is a single system role entry.
agent = Agent(
model="gpt-4o",
system_prompt=[{
"role": "system",
"content": "You are a concise assistant. Reply in at most 3 sentences.",
}],
)
State interpolation
Placeholders like {field_name} are replaced at runtime with values from the current AgentState:
from agentflow.core.state import AgentState
class MyState(AgentState):
user_name: str = "Guest"
language: str = "English"
agent = Agent(
model="gpt-4o",
system_prompt=[{
"role": "system",
"content": "You are helping {user_name}. Always reply in {language}.",
}],
)
Tools and ToolNode
Pass a ToolNode instance directly, or reference a graph node by name.
from agentflow.core.graph import ToolNode
def search(query: str) -> str:
"""Search the web."""
return f"Results for: {query}"
tool_node = ToolNode([search])
# Inline: agent owns the tools
agent = Agent(model="gpt-4o", tool_node=tool_node)
# Named reference: ToolNode is a separate graph node
agent = Agent(model="gpt-4o", tool_node="TOOL")
# When using a named reference, add both nodes to the graph
# graph.add_node("MAIN", agent)
# graph.add_node("TOOL", tool_node)
Filter tools by tag
tools_tags limits which tools from the ToolNode are exposed to the LLM. Tools without matching tags are hidden.
from agentflow.utils.decorators import tool
@tool(tags=["safe"])
def safe_search(query: str) -> str:
"""Safe search."""
return "..."
@tool(tags=["admin"])
def admin_action(cmd: str) -> str:
"""Admin-only action."""
return "..."
tool_node = ToolNode([safe_search, admin_action])
# Only expose "safe" tools
agent = Agent(model="gpt-4o", tool_node=tool_node, tools_tags={"safe"})
Reasoning configuration
All providers share a unified reasoning_config dict. Reasoning is on by default at medium effort.
# Default — medium effort (ON for both OpenAI and Google)
agent = Agent(model="gpt-4o")
# High effort
agent = Agent(model="gpt-4o", reasoning_config={"effort": "high"})
# Disable reasoning entirely
agent = Agent(model="gpt-4o", reasoning_config=None)
# OpenAI: low effort + auto summary
agent = Agent(model="o4-mini", reasoning_config={"effort": "low", "summary": "auto"})
# Google: exact thinking_budget (tokens)
agent = Agent(model="gemini-2.5-flash", reasoning_config={"thinking_budget": 5000})
Google effort → thinking_budget mapping:
effort | thinking_budget |
|---|---|
"low" | 512 |
"medium" (default) | 8192 |
"high" | 24576 |
Retry configuration
Agent retries on HTTP 429, 500, 502, 503, and 529 with exponential back-off.
from agentflow.core.graph.agent_internal.constants import RetryConfig
# Default (3 retries, 1s initial, 2x backoff, 30s cap)
agent = Agent(model="gpt-4o")
# Custom retry
agent = Agent(
model="gpt-4o",
retry_config=RetryConfig(
max_retries=5,
initial_delay=2.0,
max_delay=60.0,
backoff_factor=2.0,
),
)
# Disable retries
agent = Agent(model="gpt-4o", retry_config=False)
RetryConfig fields
| Field | Default | Notes |
|---|---|---|
max_retries | 3 | Total attempts = max_retries + 1. |
initial_delay | 1.0 | Seconds to wait before the first retry. |
max_delay | 30.0 | Cap on the delay between retries. |
backoff_factor | 2.0 | Multiplier applied after each retry. |
circuit_breaker_enabled | False | Enable circuit breaker (opt-in). |
circuit_breaker_threshold | 5 | Consecutive failures that open a circuit. |
circuit_breaker_reset_timeout | 30.0 | Seconds the circuit stays open before a half-open trial. |
Circuit breaker
The circuit breaker is an opt-in complement to retries and fallback_models. Once a (provider, model) pair fails circuit_breaker_threshold times in a row, its circuit opens and subsequent calls to that target are skipped immediately (moving straight to the next fallback) for circuit_breaker_reset_timeout seconds. After the cooldown a single trial is allowed; a successful trial closes the circuit, a failed trial re-opens it.
This prevents a dead provider from being retried on every call while other fallbacks are available.
from agentflow.core.graph.agent_internal.constants import RetryConfig
agent = Agent(
model="gpt-4o",
fallback_models=["gpt-4o-mini", ("gemini-2.0-flash", "google")],
retry_config=RetryConfig(
max_retries=3,
circuit_breaker_enabled=True,
circuit_breaker_threshold=5,
circuit_breaker_reset_timeout=30.0,
),
)
Circuit state is per Agent instance and scoped to (provider, model) pairs. Restarting the process resets all circuit state.
LLM call timeout
All LLM clients apply a default request timeout of 600 seconds so a stalled provider cannot hang a graph run indefinitely.
Override globally via environment variable
AGENTFLOW_LLM_TIMEOUT=120 # seconds
Override programmatically
from agentflow.core.llm import set_default_llm_timeout, get_default_llm_timeout
set_default_llm_timeout(120.0) # apply globally from this point on
set_default_llm_timeout(None) # reset to env var / built-in default
Resolution order (first match wins):
- A programmatic override set via
set_default_llm_timeout. - The
AGENTFLOW_LLM_TIMEOUTenvironment variable. - The built-in default of 600 seconds (
DEFAULT_LLM_TIMEOUT_SECONDS).
An explicit timeout= kwarg passed directly to the underlying SDK client still takes precedence over this default.
Fallback models
When the primary model exhausts all retries, AgentFlow tries each fallback in order.
# Same-provider fallback
agent = Agent(
model="gpt-4o",
fallback_models=["gpt-4o-mini"],
)
# Cross-provider fallback
agent = Agent(
model="gemini-2.5-flash",
provider="google",
fallback_models=[
"gemini-2.0-flash", # inherit provider (google)
("gpt-4o-mini", "openai"), # explicit (model, provider) tuple
],
)
Output type
output_type | Use case |
|---|---|
"text" (default) | Text generation |
"image" | Image generation (e.g. DALL-E 3) |
"video" | Video generation |
"audio" | Text-to-speech |
image_agent = Agent(model="dall-e-3", output_type="image")
tts_agent = Agent(model="tts-1", output_type="audio")
Structured output
Use output_schema with a Pydantic model to force JSON output matching a schema. Requires output_type="text".
from pydantic import BaseModel
class ReviewAnalysis(BaseModel):
sentiment: str # "positive" | "negative" | "neutral"
score: float # 0.0 – 1.0
summary: str
agent = Agent(
model="gpt-4o",
output_schema=ReviewAnalysis,
)
The agent's response message will contain a JSON string conforming to ReviewAnalysis.
Extra messages
extra_messages are injected into every LLM call after the system prompt and before the context. Use them for few-shot examples or static instructions that should always appear.
from agentflow.core.state import Message
examples = [
Message.text_message("Q: What is 2+2?", role="user"),
Message.text_message("A: 4", role="assistant"),
]
agent = Agent(
model="gpt-4o",
extra_messages=examples,
)
API style (OpenAI)
api_style selects which OpenAI API surface to use.
# Chat Completions (default)
agent = Agent(model="gpt-4o", api_style="chat")
# Responses API
agent = Agent(model="o4-mini", api_style="responses")
Additional LLM kwargs
Any extra keyword argument is forwarded to the provider SDK.
agent = Agent(
model="gpt-4o",
temperature=0.3,
max_tokens=2048,
top_p=0.9,
)
Complete constructor reference
Agent(
model: str,
output_type: str = "text", # "text" | "image" | "video" | "audio"
system_prompt: list[dict] | None = None,
tool_node: str | ToolNode | None = None,
extra_messages: list[Message] | None = None,
trim_context: bool = False,
tools_tags: set[str] | None = None,
reasoning_config: dict | bool | None = {"effort": "medium"},
skills: SkillConfig | None = None,
memory: MemoryConfig | None = None,
retry_config: RetryConfig | bool | None = True,
fallback_models: list[str | tuple[str, str]] | None = None,
multimodal_config: MultimodalConfig | None = None,
output_schema: type[BaseModel] | None = None,
# kwargs only:
provider: str | None = None, # "openai" | "google"
base_url: str | None = None,
api_style: str = "chat", # "chat" | "responses"
use_vertex_ai: bool = False,
temperature: float | None = None,
max_tokens: int | None = None,
# ...any other provider kwargs
)
What you learned
modelis the only required argument;provideris auto-detected.tool_nodecan be aToolNodeinstance or the name of a graph node.reasoning_config=Nonedisables reasoning; the default enables it at medium effort.retry_configandfallback_modelsmake agents resilient to transient API failures.output_schemaenforces structured JSON output via a Pydantic model.