Rate Limiting & Abuse Prevention

Protect systems from excessive requests, infinite loops, and denial-of-wallet attacks by capping LLM requests, tool calls, and tool execution time per session.

Overview

Rate limiting guardrails enforce quotas on agent activity within a session. Policies are defined in YAML and applied through hooks that run before LLM requests and before tool execution. This prevents runaway agent loops, API throttling violations, and uncontrolled token spend.

Use this guardrail alongside Tool Authorization for comprehensive agent safety in production deployments.

How It Works

Create a session with create_session() to track per-session counters.
Call before_request() before each LLM invocation to enforce request limits.
Call check_tool_call_limit() in a pre-tool hook to check the tool call quota without incrementing the counter.
Call record_tool_call() inside each tool when it actually executes.
Wrap tool execution with start_execution_timer() and end_execution_timer() to enforce time limits.
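The steps above separate checking a quota from consuming it. The sketch below models that split with a hypothetical in-memory class, `SessionCounters`. It is not the elsai_guardrails implementation: the class, its fields, and the rule "allowed while the count is below the limit" are assumptions, chosen to agree with the Example Scenarios table (5th request allowed, 6th blocked).

```python
from dataclasses import dataclass


@dataclass
class SessionCounters:
    """Hypothetical stand-in for per-session state; not the library's class."""

    max_requests: int
    max_tool_calls: int
    request_count: int = 0
    tool_call_count: int = 0

    def before_request(self) -> bool:
        # Check and consume in one step: every allowed LLM request is counted.
        if self.request_count >= self.max_requests:
            return False
        self.request_count += 1
        return True

    def check_tool_call_limit(self) -> bool:
        # Peek only: report whether one more call would fit, change nothing.
        return self.tool_call_count < self.max_tool_calls

    def record_tool_call(self) -> None:
        # Consume: called by the tool itself once it actually runs.
        self.tool_call_count += 1


counters = SessionCounters(max_requests=5, max_tool_calls=2)
request_results = [counters.before_request() for _ in range(6)]

peeks = [counters.check_tool_call_limit() for _ in range(3)]  # no increments
counters.record_tool_call()
counters.record_tool_call()
after_quota = counters.check_tool_call_limit()
```

Because peeking never increments, a call that is blocked in the pre-tool hook does not consume quota; only tools that actually run do.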

Configuration

Enable Rate Limiting

yaml

guardrails:
  rate_limit:
    enabled: true
    max_requests_per_session: 5
    max_tool_calls_per_session: 50
    max_tool_execution_seconds: 60

Parameters

| Option | Type | Description |
| --- | --- | --- |
| `enabled` | bool | Enable rate limiting |
| `max_requests_per_session` | int | Maximum LLM requests allowed per session |
| `max_tool_calls_per_session` | int | Maximum tool invocations allowed per session |
| `max_tool_execution_seconds` | int | Maximum cumulative tool execution time per session (seconds) |
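One way to choose `max_requests_per_session` is to work backwards from the spend you are willing to risk per session. The helper below is a rough sketch of that arithmetic. The function name, the per-request token ceiling, and the price are hypothetical illustration values and are not part of the rate limit configuration.

```python
def worst_case_session_cost(
    max_requests: int,
    max_tokens_per_request: int,
    price_per_1k_tokens: float,
) -> float:
    """Upper bound on LLM spend for one session, in the price's currency."""
    return max_requests * max_tokens_per_request / 1000 * price_per_1k_tokens


# Hypothetical figures: 5 requests, 8,000 tokens each, 0.01 per 1K tokens.
bound = worst_case_session_cost(5, 8_000, 0.01)
print(f"worst-case LLM spend per session: {bound:.2f}")
```

The bound scales linearly with the request limit, so doubling `max_requests_per_session` doubles the worst-case spend a single runaway session can cause.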

Combined with Tool Authorization

Rate limiting and tool authorization can share the same policy file:

yaml

guardrails:
  rate_limit:
    enabled: true
    max_requests_per_session: 5
    max_tool_calls_per_session: 50
    max_tool_execution_seconds: 60

  tool_authorization:
    enabled: true
    denied_tools:
      - execute_shell
    sensitive_tools:
      - delete_record
    roles:
      analyst:
        allowed_tools:
          - search_web
          - calculator

Usage with Agent Hooks

Rate limiting is enforced through GuardrailSystem session hooks. Like tool authorization, it requires integration into your agent graph.

Initialize Guardrails

python

from elsai_guardrails.guardrails import GuardrailPolicy, GuardrailSystem

guardrails = GuardrailSystem(
    guardrail_policy=GuardrailPolicy.from_file("config.yaml"),
)
rate_limit_config = guardrails.guardrail_policy.to_rate_limit_config()

Session Management

python

session = guardrails.create_session()
session_id = session.session_id

# After processing, inspect session metrics
session = guardrails.get_session(session_id)
print(f"requests={session.request_count}  tool_calls={session.tool_call_count}")

Before LLM Request

python

result = guardrails.before_request(session_id, raise_on_block=False)

if not result.passed:
    print(f"Request blocked: {result.error}")
    print(f"Count: {result.current_count}/{result.limit}")
else:
    # Proceed with LLM call
    response = llm.invoke(messages)
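If several nodes call the LLM, the check can be factored into a small wrapper so that no call site skips it. This is a minimal sketch: `guarded_invoke` and `FakeGuardrails` are hypothetical names. The fake mimics only the result fields used in the snippet above (`passed`, `error`, `current_count`, `limit`), and its counting rule is an assumption.

```python
from types import SimpleNamespace
from typing import Any, Callable, Optional


def guarded_invoke(guardrails: Any, session_id: str,
                   invoke: Callable[[], Any]) -> Optional[Any]:
    """Run `invoke` only if the request limit check passes; else return None."""
    result = guardrails.before_request(session_id, raise_on_block=False)
    if not result.passed:
        print(f"Request blocked: {result.error} "
              f"({result.current_count}/{result.limit})")
        return None
    return invoke()


class FakeGuardrails:
    """Test double that mimics only the result fields used above."""

    def __init__(self, limit: int) -> None:
        self.limit, self.count = limit, 0

    def before_request(self, session_id: str, raise_on_block: bool = True):
        if self.count >= self.limit:
            return SimpleNamespace(passed=False, error="request limit reached",
                                   current_count=self.count, limit=self.limit)
        self.count += 1
        return SimpleNamespace(passed=True, error=None,
                               current_count=self.count, limit=self.limit)


fake = FakeGuardrails(limit=2)
replies = [guarded_invoke(fake, "s1", lambda: "ok") for _ in range(3)]
```

With a real `GuardrailSystem`, the call site would look like `guarded_invoke(guardrails, session_id, lambda: llm.invoke(messages))`.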

Before Tool Execution (Peek)

Check the limit without incrementing the counter; the tool itself records the call when it runs:

python

result = guardrails.check_tool_call_limit(session_id, raise_on_block=False)

if not result.passed:
    print(f"Tool blocked: {result.error}")
else:
    would_be = result.current_count + 1
    print(f"Tool allowed ({would_be}/{rate_limit_config.max_tool_calls_per_session})")
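Because a peek does not increment the counter, it returns the same answer for every pending call when the model requests several tools in one turn. One way to handle this, sketched below, is to compare the batch against the remaining quota. `split_by_quota` is a hypothetical helper, not a library function.

```python
from typing import List, Sequence, Tuple


def split_by_quota(
    pending: Sequence[str], current_count: int, limit: int
) -> Tuple[List[str], List[str]]:
    """Return (allowed, blocked) tool calls given the calls already recorded."""
    remaining = max(limit - current_count, 0)
    return list(pending[:remaining]), list(pending[remaining:])


# 48 calls already recorded against a limit of 50; the model asks for 3 more.
allowed, blocked = split_by_quota(
    ["search_web", "calculator", "search_web"], current_count=48, limit=50
)
```

In practice `current_count` would come from `result.current_count` and `limit` from `rate_limit_config.max_tool_calls_per_session`.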

Inside Tool Implementation

Record the call and track execution time when the tool actually runs:

python

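from langchain_core.tools import tool  # assumed source of @tool; adjust to your framework

@tool
def search_web(query: str, session_id: str) -> str:
    """Search the web while counting the call and timing its execution."""
    guardrails.record_tool_call(session_id)  # count the call now that it really runs
    t = guardrails.start_execution_timer()
    try:
        result = f"Results for: {query}"  # placeholder for the real tool logic
    finally:
        guardrails.end_execution_timer(t)  # stop the timer even if the tool raises
    return result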
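When many tools need the same bookkeeping, the record-and-time pattern can be moved into a decorator. The sketch below assumes the hook signatures shown in the snippet above. `rate_limited` and `FakeGuardrails` are hypothetical names, and the fake only logs which hooks ran.

```python
import functools
from typing import Any, Callable


def rate_limited(guardrails: Any) -> Callable:
    """Decorator factory: record the call, then time the tool body."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, session_id: str, **kwargs: Any) -> Any:
            guardrails.record_tool_call(session_id)
            timer = guardrails.start_execution_timer()
            try:
                return func(*args, session_id=session_id, **kwargs)
            finally:
                # Runs even if the tool raises, so the timer is always closed.
                guardrails.end_execution_timer(timer)

        return wrapper

    return decorator


class FakeGuardrails:
    """Test double that only logs which hooks were called."""

    def __init__(self) -> None:
        self.events = []

    def record_tool_call(self, session_id: str) -> None:
        self.events.append(("record", session_id))

    def start_execution_timer(self) -> str:
        self.events.append(("start",))
        return "timer-1"

    def end_execution_timer(self, timer: str) -> None:
        self.events.append(("end", timer))


fake = FakeGuardrails()


@rate_limited(fake)
def search_web(query: str, session_id: str) -> str:
    return f"Results for: {query}"


output = search_web("rate limits", session_id="s1")
```

The wrapper takes `session_id` as a keyword-only argument, so each decorated tool must be called with `session_id=...`.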

LangGraph Integration Pattern

Recommended graph flow:

agent → rate_limit → tools → agent

agent node — Call before_request() before invoking the LLM.
rate_limit node — Call check_tool_call_limit() for each pending tool call before ToolNode runs.
tools node — Each tool calls record_tool_call() and wraps execution in a timer.

When a limit is exceeded, inject a ToolMessage or AIMessage whose content starts with RATE LIMIT BLOCKED: and route back to the agent, so the model sees why the call was refused.
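The routing decision made by the rate_limit node can be sketched in plain Python without LangGraph. Everything here is an assumption made for illustration: plain dicts stand in for graph state and for ToolMessage, and the state keys (`messages`, `pending_tool_calls`, `tool_call_count`, `next`) are hypothetical.

```python
from typing import Any, Dict

BLOCK_PREFIX = "RATE LIMIT BLOCKED:"


def rate_limit_node(state: Dict[str, Any], limit: int) -> Dict[str, Any]:
    """Keep the pending calls that fit the quota and flag the rest as blocked."""
    remaining = max(limit - state["tool_call_count"], 0)
    pending = state["pending_tool_calls"]
    allowed, blocked = pending[:remaining], pending[remaining:]
    notices = [
        {"role": "tool", "name": name,
         "content": f"{BLOCK_PREFIX} tool call quota of {limit} reached"}
        for name in blocked
    ]
    return {
        **state,
        "messages": state["messages"] + notices,
        "pending_tool_calls": allowed,
        # Run the tools node only if at least one call is still allowed.
        "next": "tools" if allowed else "agent",
    }


exhausted = rate_limit_node(
    {"messages": [], "pending_tool_calls": ["search_web"], "tool_call_count": 50},
    limit=50,
)
fresh = rate_limit_node(
    {"messages": [], "pending_tool_calls": ["search_web"], "tool_call_count": 0},
    limit=50,
)
```

In a real graph the value stored under `next` would drive a conditional edge, and the notices would be ToolMessage objects tied to the corresponding tool call IDs.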

Example Scenarios

| Scenario | Limit | Result |
| --- | --- | --- |
| 5th request in session | `max_requests_per_session: 5` | ✅ Allowed |
| 6th request in session | `max_requests_per_session: 5` | ❌ Blocked |
| Tool call within quota | `max_tool_calls_per_session: 50` | ✅ Allowed |
| Tool call exceeds quota | `max_tool_calls_per_session: 50` | ❌ Blocked |
| Slow tool exceeds time budget | `max_tool_execution_seconds: 60` | ❌ Blocked on timer |

Best Practices

Create one session per conversation — Use create_session() at the start of each user session and pass session_id through agent state.
Peek before execute — Use check_tool_call_limit() in the pre-tool hook and record_tool_call() inside the tool to avoid counting blocked calls.
Wrap tools with timers — Always pair start_execution_timer() and end_execution_timer() around tool logic for execution time limits.
Combine with tool authorization — Apply both rate limits and permission checks for defense in depth.
Monitor session metrics — Use get_session() to track request and tool call counts for observability and tuning.
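For the monitoring practice above, a small helper can turn raw counters into a utilization figure and an early warning. This is a sketch: `quota_utilization`, `near_limit`, and the 80% default threshold are hypothetical choices, not library features.

```python
def quota_utilization(count: int, limit: int) -> float:
    """Fraction of a quota consumed, clamped to the range [0.0, 1.0]."""
    if limit <= 0:
        return 1.0  # treat a non-positive limit as fully consumed
    return min(count / limit, 1.0)


def near_limit(count: int, limit: int, threshold: float = 0.8) -> bool:
    """True once usage reaches the warning threshold (80% by default)."""
    return quota_utilization(count, limit) >= threshold
```

A typical call would be `near_limit(session.tool_call_count, rate_limit_config.max_tool_calls_per_session)`, logged or exported alongside the session ID.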

Next Steps

Tool Authorization — Control which tools each role can access
Token Budget Enforcement — Limit token usage per request and run
GuardrailSystem — Core API reference
Guardrails Configuration — Full configuration reference
