GuardrailSystem

The GuardrailSystem class provides core guardrail functionality for validating text content.

Overview

GuardrailSystem performs safety checks on text content, including:

Toxicity detection
Sensitive data detection
Content classification (jailbreak, malicious content, etc.)

Initialization

Basic Initialization

python

from elsai_guardrails.guardrails import GuardrailSystem, GuardrailConfig

config = GuardrailConfig()
guardrail = GuardrailSystem(config=config)

With Custom Configuration

python

config = GuardrailConfig(
    check_toxicity=True,
    check_sensitive_data=True,
    check_semantic=True,
    toxicity_threshold=0.7,
    block_toxic=True,
    block_sensitive_data=True
)
guardrail = GuardrailSystem(config=config)

From Config File

python

guardrail = GuardrailSystem.from_config("config.yml")

From GuardrailPolicy File

Agent hook features (tool authorization, rate limiting) require a GuardrailPolicy, which can be loaded from a file:

python

from elsai_guardrails.guardrails import GuardrailSystem
from elsai_guardrails.guardrails.guardrail_policy import GuardrailPolicy

guardrails = GuardrailSystem(
    guardrail_policy=GuardrailPolicy.from_file("config.yaml"),
)

With Input/Output Checks

python

guardrail = GuardrailSystem(
    config=config,
    input_checks=True,
    output_checks=True
)

Methods

check_input()

Check user input through guardrails. This method is designed for validating user input before it's sent to an LLM.

python

result = guardrail.check_input("user input text")

Parameters:

user_input (str): User input to validate

Returns:

GuardrailResult: Result object

Note: If the check fails, "Please rephrase your input." is added to the result's message.

Example - Separate Input Check:

python

from elsai_guardrails.guardrails import GuardrailSystem, GuardrailConfig

# Create guardrail system (no LLM needed for separate checks)
guardrail = GuardrailSystem.from_config("config.yml")

# Step 1: Check user input
user_input = "What is machine learning?"
input_result = guardrail.check_input(user_input)

if not input_result.passed:
    # Do not proceed to the LLM
    print(f"Input blocked: {input_result.message}")
else:
    # Step 2: Input passed, now call your LLM manually
    llm_response = "Generated response from LLM..."  # Replace with your actual LLM call

    # Step 3: Check LLM output
    output_result = guardrail.check_output(llm_response)

check_output()

Check LLM output through guardrails. This method is designed for validating LLM-generated responses before returning them to users.

python

result = guardrail.check_output("LLM generated response")

Parameters:

llm_output (str): LLM output to validate

Returns:

GuardrailResult: Result object

Note: If the check fails, "LLM response has been blocked." is added to the result's message.

before_tool_call()

Authorize a tool call before execution. Requires GuardrailPolicy with tool_authorization enabled.

python

result = guardrails.before_tool_call(
    tool_name="delete_record",
    user_role="engineer",
    metadata={"approved": True},
    raise_on_block=False,
)

Returns: ToolCallCheckResult with passed, tool_name, user_role, and error fields.

See Tool Authorization for details.
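
One way to wire before_tool_call() into a tool dispatcher is sketched below. The run_tool helper, the TOOLS registry, and the _StubGuardrails stand-in are hypothetical and exist only so the snippet runs without a policy file; in real code you would pass your GuardrailSystem instance instead. The sketch relies only on the passed and error fields listed above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class _StubResult:
    """Stand-in carrying the fields documented for ToolCallCheckResult."""
    passed: bool
    tool_name: str
    user_role: str
    error: Optional[str] = None


class _StubGuardrails:
    """Hypothetical stand-in: only 'admin' may call 'delete_record'."""

    def before_tool_call(self, tool_name, user_role, metadata=None, raise_on_block=False):
        allowed = tool_name != "delete_record" or user_role == "admin"
        error = None if allowed else f"{user_role} may not call {tool_name}"
        return _StubResult(allowed, tool_name, user_role, error)


def run_tool(guardrails, tools: Dict[str, Callable[..., Any]], tool_name: str,
             user_role: str, **kwargs: Any) -> Dict[str, Any]:
    """Authorize first, and execute the tool only when the check passes."""
    check = guardrails.before_tool_call(
        tool_name=tool_name,
        user_role=user_role,
        metadata={},
        raise_on_block=False,
    )
    if not check.passed:
        return {"success": False, "error": check.error}
    return {"success": True, "result": tools[tool_name](**kwargs)}


TOOLS = {"delete_record": lambda record_id: f"deleted {record_id}"}
blocked = run_tool(_StubGuardrails(), TOOLS, "delete_record", "engineer", record_id=7)
allowed = run_tool(_StubGuardrails(), TOOLS, "delete_record", "admin", record_id=7)
```

Returning a structured result instead of raising keeps the blocked case on the normal control path, which matches the raise_on_block=False call shown above.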

before_request()

Check per-session request limits before an LLM call. Requires GuardrailPolicy with rate_limit enabled.

python

result = guardrails.before_request(session_id, raise_on_block=False)

See Rate Limiting for details.
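
A request gate built on before_request() might look like the sketch below. The return type of before_request() is not specified on this page, so the passed attribute is an assumption, as is the stub's choice to count a request when the check succeeds. The answer helper and _StubGuardrails are hypothetical stand-ins so the snippet runs on its own.

```python
from types import SimpleNamespace


class _StubGuardrails:
    """Hypothetical stand-in allowing a fixed number of requests per session."""

    def __init__(self, max_requests=2):
        self.max_requests = max_requests
        self.requests = {}

    def before_request(self, session_id, raise_on_block=False):
        used = self.requests.get(session_id, 0)
        if used >= self.max_requests:
            return SimpleNamespace(passed=False)
        self.requests[session_id] = used + 1
        return SimpleNamespace(passed=True)


def answer(guardrails, session_id, prompt, call_llm):
    """Call the LLM only when the session is still within its request limit."""
    check = guardrails.before_request(session_id, raise_on_block=False)
    if not check.passed:  # assumption: the result exposes `passed`
        return "Request limit reached for this session."
    return call_llm(prompt)


stub = _StubGuardrails(max_requests=2)
replies = [answer(stub, "sess-1", "hi", lambda p: p.upper()) for _ in range(3)]
```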

check_tool_call_limit()

Peek at tool call quota without incrementing the counter. Call record_tool_call() inside the tool when it actually runs.

python

result = guardrails.check_tool_call_limit(session_id, raise_on_block=False)

record_tool_call()

Record a tool invocation against the session quota. Call from inside the tool implementation.

python

guardrails.record_tool_call(session_id)
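
The split between peeking at the quota and recording the call can be sketched as follows. The _StubGuardrails class is a hypothetical stand-in with a fixed quota of two calls, and the passed attribute on the result of check_tool_call_limit() is an assumption, since this page does not describe that return value. The search_docs tool and call_tool_if_allowed helper are illustrative names.

```python
from types import SimpleNamespace


class _StubGuardrails:
    """Hypothetical stand-in with a fixed quota of tool calls per session."""

    def __init__(self, limit=2):
        self.limit = limit
        self.counts = {}

    def check_tool_call_limit(self, session_id, raise_on_block=False):
        # Peek only: the counter is not modified here.
        used = self.counts.get(session_id, 0)
        return SimpleNamespace(passed=used < self.limit)

    def record_tool_call(self, session_id):
        self.counts[session_id] = self.counts.get(session_id, 0) + 1


def search_docs(guardrails, session_id, query):
    """A tool that records its own invocation once it actually runs."""
    guardrails.record_tool_call(session_id)
    return f"results for {query!r}"


def call_tool_if_allowed(guardrails, session_id, query):
    """Peek at the quota first; skip the tool entirely when it is exhausted."""
    check = guardrails.check_tool_call_limit(session_id, raise_on_block=False)
    if not check.passed:  # assumption: the result exposes `passed`
        return None
    return search_docs(guardrails, session_id, query)


stub = _StubGuardrails(limit=2)
outcomes = [call_tool_if_allowed(stub, "sess-1", "q") for _ in range(3)]
```

Because the counter only moves inside the tool, a call that is planned but never executed does not consume quota.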

start_execution_timer() / end_execution_timer()

Track cumulative tool execution time for rate limiting:

python

t = guardrails.start_execution_timer()
# ... tool logic ...
guardrails.end_execution_timer(t)
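
If the tool logic can raise, the timer pair is easier to keep balanced with a small context manager. The sketch below is not part of the library: timed_tool is a hypothetical helper, and _StubGuardrails stands in for a real GuardrailSystem so the snippet runs by itself. The value returned by start_execution_timer() is treated as an opaque token.

```python
from contextlib import contextmanager


class _StubGuardrails:
    """Hypothetical stand-in that only records which timers were closed."""

    def __init__(self):
        self.closed = []
        self._next = 0

    def start_execution_timer(self):
        self._next += 1
        return self._next

    def end_execution_timer(self, token):
        self.closed.append(token)


@contextmanager
def timed_tool(guardrails):
    """Start the timer on entry and always end it on exit, even on errors."""
    token = guardrails.start_execution_timer()
    try:
        yield token
    finally:
        guardrails.end_execution_timer(token)


stub = _StubGuardrails()

with timed_tool(stub):
    pass  # ... tool logic ...

try:
    with timed_tool(stub):
        raise ValueError("tool failed")
except ValueError:
    pass  # the timer was still ended by the finally clause
```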

create_session() / get_session()

Manage rate-limit session state:

python

session = guardrails.create_session()
session_id = session.session_id

# Later
session = guardrails.get_session(session_id)
print(session.request_count, session.tool_call_count)

link_arms() / link_run_context()

Align guardrail storage with an ARMS run. Required when storage.enabled: true.

python

# With elsai_arms
guardrails.link_arms(arms)

# Or explicit ids
guardrails.link_run_context(
    run_id="run-1",
    project_id="project-1",
    project="my-app",
)

See ARMS Storage for details.

begin_run() / end_run() / storage_run_context()

Manage the storage run lifecycle. Events are buffered in memory until end_run() flushes them to the backend.

python

guardrails.begin_run(session_id="sess-1")
guardrails.check_input("Hello")
guardrails.end_run()

# Or use a context manager
with guardrails.storage_run_context(session_id="sess-1"):
    guardrails.check_output("Response text")

Example - Separate Output Check (check_output()):

python

# After receiving LLM response
llm_response = "Generated response from LLM..."

# Check output
output_result = guardrail.check_output(llm_response)

if not output_result.passed:
    print(f"Output blocked: {output_result.message}")
    # Do not return this response to user
    # Consider sanitizing or asking LLM to regenerate
else:
    # Check exfiltration details when enabled
    if output_result.exfiltration:
        print(f"Exfiltration action: {output_result.exfiltration['action']}")

    # Output passed, safe to return to user
    print(llm_response)

Example Usage

Basic Text Check

python

from elsai_guardrails.guardrails import GuardrailSystem, GuardrailConfig

config = GuardrailConfig(
    check_toxicity=True,
    check_sensitive_data=True,
    check_semantic=True
)
guardrail = GuardrailSystem(config=config)

# Check text
result = guardrail.check_input("Hello, how are you?")
if result.passed:
    print("Text passed all checks")
else:
    print(f"Text blocked: {result.message}")

Separate Input and Output Checks

This pattern gives you full control over the LLM call while still having guardrails:

python

from elsai_guardrails.guardrails import GuardrailSystem, GuardrailConfig

# Create guardrail system (no LLM needed)
guardrail = GuardrailSystem.from_config("config.yml")

def process_user_query(user_input: str):
    """Complete workflow: input check -> LLM call -> output check"""
    
    # Step 1: Check user input
    input_result = guardrail.check_input(user_input)
    if not input_result.passed:
        return {
            'success': False,
            'error': f"Input blocked: {input_result.message}",
            'blocked_at': 'input'
        }
    
    # Step 2: Call your LLM (input passed)
    try:
        # Replace with your actual LLM call
        from openai import OpenAI
        client = OpenAI(api_key="your-key")
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": user_input}]
        )
        llm_response = response.choices[0].message.content
    except Exception as e:
        return {
            'success': False,
            'error': f"LLM error: {str(e)}",
            'blocked_at': 'llm_call'
        }
    
    # Step 3: Check LLM output
    output_result = guardrail.check_output(llm_response)
    if not output_result.passed:
        return {
            'success': False,
            'error': f"Output blocked: {output_result.message}",
            'blocked_at': 'output'
        }
    
    # Step 4: Success
    return {
        'success': True,
        'response': llm_response
    }

# Usage
result = process_user_query("What is AI?")
if result['success']:
    print(f"Response: {result['response']}")
else:
    print(f"Blocked at {result['blocked_at']}: {result['error']}")

Input Validation Only

python

user_input = "My email is user@example.com"
result = guardrail.check_input(user_input)

if not result.passed:
    print(f"Input blocked: {result.message}")
    print(f"Reason: {result.sensitive_data.get('predicted_labels', [])}")
    # Ask user to remove sensitive information

Output Validation Only

python

llm_output = "Here is the response..."
result = guardrail.check_output(llm_output)

if not result.passed:
    print(f"Output blocked: {result.message}")
    # Do not return to user, consider sanitizing or regenerating
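
The "regenerate" option mentioned in the comment above can be sketched as a bounded retry loop. The generate_checked helper is hypothetical, and fake_generate and fake_check are stand-ins so the snippet runs without an LLM or a configured guardrail; in real code you would pass your own LLM call and guardrail.check_output. The only library behavior assumed is the passed attribute on the result.

```python
from types import SimpleNamespace
from typing import Callable, Optional


def generate_checked(generate: Callable[[], str],
                     check_output: Callable[[str], object],
                     max_attempts: int = 3) -> Optional[str]:
    """Return the first response that passes check_output, or None."""
    for _ in range(max_attempts):
        candidate = generate()
        if check_output(candidate).passed:
            return candidate
    return None  # every attempt was blocked


# Stand-ins so the sketch runs on its own (hypothetical, not library code).
responses = iter(["bad draft", "clean answer"])


def fake_generate() -> str:
    return next(responses)


def fake_check(text: str) -> SimpleNamespace:
    return SimpleNamespace(passed="bad" not in text, message="")


answer = generate_checked(fake_generate, fake_check)
always_blocked = generate_checked(lambda: "bad", fake_check, max_attempts=2)
```

Capping the number of attempts keeps a persistently blocked prompt from looping indefinitely; the caller decides what to show the user when None comes back.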

Configuration Options

See GuardrailConfig for all configuration options.

Result Object

The check_text(), check_input(), and check_output() methods return a GuardrailResult object. See GuardrailResult for details.

Next Steps

LLMRails - High-level API with LLM integration
GuardrailResult - Understanding results
Configuration Guide - Configuration options
