
Rate limiting reference

The rate_limit block in agentflow.json activates AgentFlow's built-in sliding-window rate limiter. The limiter is disabled by default and runs only when the block is present. To turn it off, remove the block or set it to null.
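To make the term "sliding window" concrete, here is a minimal sketch of a sliding-window log limiter. It illustrates the general technique only and is not AgentFlow's implementation; the class name SlidingWindowLimiter and the injectable clock parameter are hypothetical and exist just for this example.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Illustrative sliding-window log limiter (not AgentFlow's code)."""

    def __init__(self, requests: int = 100, window: float = 60.0, clock=time.monotonic):
        self.requests = requests  # maximum accepted requests per window
        self.window = window      # window size in seconds
        self.clock = clock        # injectable so the behaviour can be tested
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.requests:
            return False
        hits.append(now)
        return True
```

Unlike a fixed window, the quota here frees up gradually: each accepted request stops counting exactly `window` seconds after it was made.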

Configuration fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enables the middleware when the rate_limit block exists. Set to false to temporarily disable without removing the block. |
| backend | string | "memory" | Counter storage. "memory", "redis", or "custom". |
| requests | integer | 100 | Maximum requests allowed within each window. |
| window | integer | 60 | Window size in seconds. |
| by | string | "ip" | Scope of the limit. "ip" for per-client limits; "global" for one shared quota. |
| exclude_paths | string array | [] | Request paths that bypass rate limiting entirely. |
| trusted_proxy_headers | boolean | false | Use X-Forwarded-For as the client IP. Only enable behind a proxy that strips this header from untrusted clients. |
| redis.url | string | null | Redis connection URL. Required for the "redis" backend. Supports ${ENV_VAR} expansion. |
| redis.prefix | string | "agentflow:rate-limit" | Key prefix used for all Redis entries. |
| fail_open | boolean | true | When true, requests are allowed if the Redis backend is unreachable. When false, they are denied. Only applies to the "redis" backend. |
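The interaction between by, exclude_paths, and trusted_proxy_headers can be sketched as a single key-resolution function. This is a hedged illustration, not AgentFlow's code: the function name rate_limit_key, the key format, exact-match path comparison, lowercase header names, and the choice of the leftmost X-Forwarded-For entry are all assumptions made for the example.

```python
from typing import Mapping, Optional, Sequence


def rate_limit_key(
    path: str,
    peer_ip: str,
    headers: Mapping[str, str],
    *,
    by: str = "ip",
    exclude_paths: Sequence[str] = (),
    trusted_proxy_headers: bool = False,
) -> Optional[str]:
    """Return the counter key for a request, or None if the path is excluded."""
    if path in exclude_paths:  # assumption: exact string match
        return None
    if by == "global":
        return "global"  # one shared quota for every client
    client_ip = peer_ip
    if trusted_proxy_headers:
        # Assumption: the leftmost entry is the original client address.
        forwarded = headers.get("x-forwarded-for", "")
        first = forwarded.split(",")[0].strip()
        if first:
            client_ip = first
    return f"ip:{client_ip}"
```

The sketch also shows why trusted_proxy_headers is off by default: when the header is honored without a proxy sanitizing it, any client can choose its own key and sidestep its quota.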

Minimal example

{
  "agent": "graph.react:app",
  "rate_limit": {
    "enabled": true,
    "backend": "memory",
    "requests": 100,
    "window": 60,
    "by": "ip",
    "exclude_paths": ["/health", "/docs", "/redoc", "/openapi.json"]
  }
}

Full Redis example

{
  "agent": "graph.react:app",
  "rate_limit": {
    "enabled": true,
    "backend": "redis",
    "requests": 1000,
    "window": 60,
    "by": "ip",
    "trusted_proxy_headers": true,
    "exclude_paths": ["/health", "/metrics", "/docs", "/redoc", "/openapi.json"],
    "redis": {
      "url": "${RATE_LIMIT_REDIS_URL}",
      "prefix": "agentflow:rate-limit"
    },
    "fail_open": true
  }
}

# .env
RATE_LIMIT_REDIS_URL=redis://localhost:6379/0
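The ${ENV_VAR} placeholder in redis.url is resolved from the environment. A rough sketch of that kind of substitution follows; the helper name expand_env and the choice to raise KeyError for an unset variable are assumptions for illustration, and AgentFlow's actual behavior for missing variables may differ.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env=None) -> str:
    """Replace each ${NAME} in value with env[NAME] (hypothetical helper)."""
    env = os.environ if env is None else env

    def replace(match):
        name = match.group(1)
        if name not in env:
            # Assumption: fail loudly instead of leaving the placeholder in place.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _ENV_REF.sub(replace, value)
```

Keeping the URL in the environment means credentials embedded in it stay out of agentflow.json and out of version control.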

Install the Redis extra before using "backend": "redis":

pip install "10xscale-agentflow-cli[redis]"

Backend comparison

| Backend | When to use |
| --- | --- |
| memory | Local development, tests, demos, single-process services |
| redis | Production: Gunicorn/Uvicorn with multiple workers, Docker/Kubernetes |
| custom | Custom storage, external quota services, non-standard enforcement |
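The reason the memory backend is limited to single-process services can be shown with a small simulation: each worker process keeps its own counters, so the effective limit grows with the number of workers. The names WorkerCounter and admitted are hypothetical, and the sketch assumes all requests fall within one window and are spread round-robin across workers.

```python
from collections import Counter


class WorkerCounter:
    """Stand-in for one worker process with a private in-memory counter."""

    def __init__(self, limit: int):
        self.limit = limit
        self.counts = Counter()

    def allow(self, key: str) -> bool:
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True


def admitted(workers: int, limit: int, total_requests: int, key: str = "ip:203.0.113.7") -> int:
    """Count how many requests from one client get through within one window."""
    pool = [WorkerCounter(limit) for _ in range(workers)]
    # Round-robin dispatch, roughly what a load balancer would do.
    return sum(pool[i % workers].allow(key) for i in range(total_requests))
```

With 4 workers and a limit of 100, a single client gets 400 requests through per window instead of 100. A shared store such as Redis gives every worker the same counter.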

Response headers

Every response includes rate-limit headers:

| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Configured request limit |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp for the window reset estimate |
| X-RateLimit-Reset-After | Seconds until the window resets |
| Retry-After | Present on 429 responses only |
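A client can use these headers to pace itself. The sketch below shows one possible policy; the function name seconds_to_wait is hypothetical, and it assumes Retry-After carries a number of seconds (as the 429 body suggests) and that header names are looked up exactly as written above.

```python
from typing import Mapping


def seconds_to_wait(status: int, headers: Mapping[str, str]) -> float:
    """How long a client should pause before sending its next request."""
    if status == 429:
        retry_after = headers.get("Retry-After")
        if retry_after is not None:
            # Assumption: delta-seconds form, not an HTTP date.
            return float(retry_after)
    if headers.get("X-RateLimit-Remaining") == "0":
        # Quota exhausted: wait for the window to reset before retrying.
        return float(headers.get("X-RateLimit-Reset-After", 0))
    return 0.0
```

Checking X-RateLimit-Remaining on successful responses lets a client slow down before it ever receives a 429.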

429 response body

{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Too many requests. Limit: 100 per 60s. Retry after 12s.",
    "limit": 100,
    "window_seconds": 60,
    "retry_after_seconds": 12
  },
  "metadata": {
    "request_id": "request-id",
    "status": "error"
  }
}
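On the client side, the body above can be parsed into a typed value. This is a minimal sketch: the RateLimitError dataclass and the parse_rate_limit_error function are hypothetical names, and only the fields shown in the example body are read.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RateLimitError:
    message: str
    limit: int
    window_seconds: int
    retry_after_seconds: int
    request_id: str


def parse_rate_limit_error(body: str) -> Optional[RateLimitError]:
    """Return a RateLimitError for a rate-limit body, or None for any other error."""
    payload = json.loads(body)
    error = payload.get("error", {})
    if error.get("code") != "RATE_LIMIT_EXCEEDED":
        return None
    return RateLimitError(
        message=error["message"],
        limit=error["limit"],
        window_seconds=error["window_seconds"],
        retry_after_seconds=error["retry_after_seconds"],
        request_id=payload.get("metadata", {}).get("request_id", ""),
    )
```

Matching on the code field instead of the message text keeps the client stable if the wording of the message changes.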

Custom backend interface

from agentflow_cli.src.app.core.middleware.rate_limit import (
    BaseRateLimitBackend,
    RateLimitDecision,
)


class MyRateLimitBackend(BaseRateLimitBackend):
    async def check(self, key: str, *, limit: int, window: int) -> RateLimitDecision:
        # Placeholder logic that always allows the request.
        # Replace it with a lookup against your own storage or quota service.
        allowed = True
        remaining = limit - 1
        reset_after = window
        return RateLimitDecision(
            allowed=allowed,
            remaining=remaining,
            reset_after=reset_after,
        )

    async def close(self) -> None:
        # Release connections or other resources held by the backend.
        return None

Set "backend": "custom" in agentflow.json and bind the instance through InjectQ.
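For a sense of what a non-trivial check could look like, here is a self-contained sketch of a fixed-window counter. Note that it is a fixed window, a simpler technique than the built-in sliding window. So that it runs without AgentFlow installed, RateLimitDecision is a local stand-in that mirrors only the three fields used above, and FixedWindowBackend does not subclass BaseRateLimitBackend; in a real project you would import both from AgentFlow. The clock parameter is a hypothetical addition for testing.

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass
class RateLimitDecision:
    """Local stand-in with the fields used in the interface example."""
    allowed: bool
    remaining: int
    reset_after: float


class FixedWindowBackend:
    """Illustrative fixed-window counter kept in process memory."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._windows = {}  # key -> (window_start, accepted_count)

    async def check(self, key: str, *, limit: int, window: int) -> RateLimitDecision:
        # No await between read and write, so this is atomic within one event loop.
        now = self._clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= window:
            start, count = now, 0  # the previous window expired; start a new one
        allowed = count < limit
        if allowed:
            count += 1
        self._windows[key] = (start, count)
        return RateLimitDecision(
            allowed=allowed,
            remaining=max(limit - count, 0),
            reset_after=window - (now - start),
        )

    async def close(self) -> None:
        self._windows.clear()
```

A quick way to exercise it is `asyncio.run(FixedWindowBackend().check("ip:203.0.113.7", limit=2, window=60))`, which returns a decision with allowed=True and remaining=1 on the first call.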

See also