Guardrails Configuration

Configure safety checks and validation rules for your application.

Reference Configuration

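The following guardrail policy mirrors the project's config.yml and covers the most commonly used options. Options it omits, such as check_off_topic, check_sql_syntax, and data_exfiltration.mask_token, are described in the sections below: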

yaml

# Guardrail policy configuration

guardrails:
  input_checks: true
  output_checks: true

  check_toxicity: true
  check_sensitive_data: true
  check_semantic: true
  toxicity_threshold: 0.7
  block_toxic: true
  block_sensitive_data: true

  # PII/PHI detection policy
  pii:
    enabled: true
    input_checks: true
    output_checks: true
    language: en
    default_confidence_threshold: 0.5
    below_threshold_action: flag
    default_action: flag
    default_mask: true
    enable_phi_detection: true
    entity_types:
      - PERSON
      - LOCATION
      - EMAIL_ADDRESS
      - PHONE_NUMBER
      - CREDIT_CARD
      - NRP
      - MEDICAL_LICENSE
      - US_SSN
      - IBAN_CODE
      - IP_ADDRESS
    entity_thresholds:
      PERSON: 0.7
    entity_policies:
      CREDIT_CARD:
        action: block
        mask: true
      US_SSN:
        action: block
        mask: true
      EMAIL_ADDRESS:
        action: flag
        mask: true
      PHONE_NUMBER:
        action: flag
        mask: true
      PHI_MRN:
        action: review
        mask: true
      PHI_PATIENT_ID:
        action: review
        mask: true

  # Token budget enforcement policy
  token_budget:
    enabled: true
    input_checks: true
    output_checks: true
    max_request_tokens: 50
    max_run_tokens: 80
    reserved_output_tokens: 10
    block_on_exceeded: true

  # Tool authorization policy
  tool_authorization:
    enabled: true
    denied_tools:
      - execute_shell
    sensitive_tools:
      - delete_record
    roles:
      analyst:
        allowed_tools:
          - search_web
          - calculator
      engineer:
        allowed_tools:
          - search_web
          - calculator
          - delete_record

  # Rate limiting policy
  rate_limit:
    enabled: true
    max_requests_per_session: 5
    max_tool_calls_per_session: 50
    max_tool_execution_seconds: 60

  # Data exfiltration detection (output only)
  data_exfiltration:
    enabled: true
    output_checks: true
    action_thresholds:
      warn: 20
      block: 80
    detectors:
      secrets: true
      bulk_sensitive: true
      abnormal_patterns: true

  # ARMS Backend storage (MongoDB / DynamoDB / ClickHouse via Backend)
  storage:
    enabled: true
    project: my-app
    store_raw_text: true
    fail_soft: true
    arms_correlation: true

Configuration Options

Basic Settings

yaml

guardrails:
  input_checks: true    # Enable input validation
  output_checks: true   # Enable output validation

Check Types

yaml

guardrails:
  check_toxicity: true        # Enable toxicity detection
  check_sensitive_data: true  # Enable sensitive data detection
  check_semantic: true        # Enable content classification
  check_off_topic: false      # Enable off-topic detection
  check_sql_syntax: false     # Enable SQL syntax validation

Toxicity Settings

yaml

guardrails:
  check_toxicity: true
  toxicity_threshold: 0.7  # Threshold for blocking (0.0-1.0)
  block_toxic: true        # Block toxic content when detected

Toxicity Threshold: Content with toxicity confidence above this threshold will be blocked if block_toxic is enabled.
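
The threshold rule can be sketched in a few lines of Python. This only illustrates the documented comparison and is not the library's implementation; the function name should_block_toxic is hypothetical.

```python
def should_block_toxic(score: float, threshold: float = 0.7, block_toxic: bool = True) -> bool:
    """Return True when content should be blocked for toxicity.

    Hypothetical helper: mirrors the documented rule that content scoring
    above `toxicity_threshold` is blocked only if `block_toxic` is enabled.
    """
    return block_toxic and score > threshold


# A score of 0.6 passes the default threshold but not a stricter one.
default_result = should_block_toxic(0.6)
strict_result = should_block_toxic(0.6, threshold=0.5)
print(default_result, strict_result)
```

With block_toxic set to false, the same score is never blocked by this rule, whatever the threshold.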

Sensitive Data Settings

yaml

guardrails:
  check_sensitive_data: true
  block_sensitive_data: true  # Block content containing sensitive data

Detected sensitive data types include:

Email addresses
Phone numbers
Credit card numbers
Social security numbers
IP addresses
And more...

Content Classification

yaml

guardrails:
  check_semantic: true  # Enable content classification

Content classification detects:

Jailbreak attempts: Attempts to bypass safety restrictions
Malicious content: Requests for harmful activities
Prompt injection: Attempts to inject malicious instructions
Malicious code injection: Code injection attempts

Off-Topic Detection

yaml

guardrails:
  check_off_topic: true
  block_off_topic: true
  allowed_topics:
    - name: "Product Information"
      description: "Questions about product features, specifications, and pricing"
    - name: "Technical Support"
      description: "Help with installation, troubleshooting, and technical issues"

Off-topic detection helps keep conversations focused on allowed subjects. See Off-Topic Detection for details.

SQL Syntax Validation

yaml

guardrails:
  check_sql_syntax: true
  sql_dialect: "mysql"  # postgresql, mysql, sqlserver, sqlite, mongodb, oracle, redshift

SQL syntax validation checks SQL queries for syntax errors. Supported dialects:

postgresql - PostgreSQL
mysql - MySQL/MariaDB
sqlserver - Microsoft SQL Server
sqlite - SQLite
mongodb - MongoDB
oracle - Oracle Database
redshift - Amazon Redshift

See SQL Syntax Validation for details.

PII/PHI Detection and Data Masking

PII/PHI detection requires the spaCy en_core_web_lg model. Download it after installing the package:

bash

python -m spacy download en_core_web_lg

yaml

guardrails:
  pii:
    enabled: true
    input_checks: true
    output_checks: true
    language: en
    default_confidence_threshold: 0.5
    below_threshold_action: flag
    default_action: flag
    default_mask: true
    enable_phi_detection: true
    entity_types:
      - PERSON
      - LOCATION
      - EMAIL_ADDRESS
      - PHONE_NUMBER
      - CREDIT_CARD
      - NRP
      - MEDICAL_LICENSE
      - US_SSN
      - IBAN_CODE
      - IP_ADDRESS
    entity_thresholds:
      PERSON: 0.7
    entity_policies:
      CREDIT_CARD:
        action: block
        mask: true
      US_SSN:
        action: block
        mask: true
      EMAIL_ADDRESS:
        action: flag
        mask: true
      PHONE_NUMBER:
        action: flag
        mask: true
      PHI_MRN:
        action: review
        mask: true
      PHI_PATIENT_ID:
        action: review
        mask: true

Supported entity types:

Entity Type	Description
`PERSON`	Personal names
`LOCATION`	Geographic locations
`EMAIL_ADDRESS`	Email addresses
`PHONE_NUMBER`	Phone numbers
`CREDIT_CARD`	Credit card numbers
`NRP`	Nationalities, religious, or political groups
`MEDICAL_LICENSE`	Medical license numbers
`US_SSN`	U.S. Social Security numbers
`IBAN_CODE`	International bank account numbers
`IP_ADDRESS`	IP addresses
`PHI_MRN`	Medical record numbers (regex-based PHI detection)
`PHI_PATIENT_ID`	Patient identifiers (regex-based PHI detection)

PII/PHI detection identifies sensitive entities using Microsoft Presidio Analyzer, applies configurable policy actions (flag, block, review, pass), supports data masking, and logs detection events. See PII/PHI Detection for details.
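
One way to read how the thresholds and policies combine is sketched below. The resolution order is an assumption made for illustration (per-entity threshold first, then per-entity policy, then defaults), and the names Decision and resolve_policy are hypothetical, not part of the library.

```python
from typing import NamedTuple


class Decision(NamedTuple):
    action: str  # one of: flag, block, review, pass
    mask: bool


# Subset of the pii section shown above, as a plain dict.
PII_CONFIG = {
    "default_confidence_threshold": 0.5,
    "below_threshold_action": "flag",
    "default_action": "flag",
    "default_mask": True,
    "entity_thresholds": {"PERSON": 0.7},
    "entity_policies": {
        "CREDIT_CARD": {"action": "block", "mask": True},
        "PHI_MRN": {"action": "review", "mask": True},
    },
}


def resolve_policy(entity_type: str, confidence: float, cfg: dict = PII_CONFIG) -> Decision:
    """Hypothetical resolution order; the real library may differ."""
    # A per-entity threshold overrides the global default.
    threshold = cfg["entity_thresholds"].get(entity_type, cfg["default_confidence_threshold"])
    if confidence < threshold:
        # Assumption: low-confidence hits take below_threshold_action and the default mask.
        return Decision(cfg["below_threshold_action"], cfg["default_mask"])
    # A per-entity policy overrides the default action and mask.
    policy = cfg["entity_policies"].get(entity_type, {})
    return Decision(
        policy.get("action", cfg["default_action"]),
        policy.get("mask", cfg["default_mask"]),
    )


print(resolve_policy("CREDIT_CARD", 0.9))
print(resolve_policy("PERSON", 0.6))
```

Under these assumptions, a PERSON match at confidence 0.6 falls below its 0.7 override and takes the below-threshold action, while a CREDIT_CARD match at 0.9 takes its own block policy.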

Token Budget Enforcement

yaml

guardrails:
  token_budget:
    enabled: true
    input_checks: true
    output_checks: true
    max_request_tokens: 50
    max_run_tokens: 80
    reserved_output_tokens: 10
    block_on_exceeded: true   # true = block; false = warn only

Token budget enforcement computes token usage across the full request context and rejects or warns on oversized requests. See Token Budget Enforcement for details.
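
The sketch below shows one plausible way the three limits could interact. It assumes that reserved_output_tokens is subtracted from max_request_tokens to get the input budget, and that the run limit counts tokens already used plus the current request. Both are assumptions for illustration, and check_token_budget is a hypothetical name.

```python
def check_token_budget(
    request_tokens: int,
    run_tokens_used: int,
    max_request_tokens: int = 50,
    max_run_tokens: int = 80,
    reserved_output_tokens: int = 10,
    block_on_exceeded: bool = True,
) -> str:
    """Return "allow", "warn", or "block" for an incoming request.

    Assumption: the reserved output tokens are subtracted from the request
    limit, and the run limit counts tokens already consumed plus this request.
    """
    input_budget = max_request_tokens - reserved_output_tokens
    over_request = request_tokens > input_budget
    over_run = run_tokens_used + request_tokens > max_run_tokens
    if not (over_request or over_run):
        return "allow"
    # block_on_exceeded: true = block; false = warn only
    return "block" if block_on_exceeded else "warn"


print(check_token_budget(request_tokens=30, run_tokens_used=0))
print(check_token_budget(request_tokens=45, run_tokens_used=0))
```

With the values from the example configuration, this reading leaves 40 tokens for the request context once 10 are reserved for the response.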

Tool Authorization

yaml

guardrails:
  tool_authorization:
    enabled: true
    denied_tools:
      - execute_shell
    sensitive_tools:
      - delete_record
    roles:
      analyst:
        allowed_tools:
          - search_web
          - calculator
      engineer:
        allowed_tools:
          - search_web
          - calculator
          - delete_record

Tool authorization restricts agent tool access through role-based allowlists and a global denylist. It is enforced via before_tool_call() hooks in agent frameworks. See Tool Authorization for details.
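
The decision such a hook has to make can be sketched as follows. The check order (denylist, then role allowlist, then approval for sensitive tools) is an assumption, and authorize_tool is a hypothetical helper rather than the library's API. The approval flag follows the documented metadata={"approved": true} convention.

```python
from typing import Optional, Tuple

# The tool_authorization section shown above, as a plain dict.
TOOL_AUTH = {
    "denied_tools": ["execute_shell"],
    "sensitive_tools": ["delete_record"],
    "roles": {
        "analyst": {"allowed_tools": ["search_web", "calculator"]},
        "engineer": {"allowed_tools": ["search_web", "calculator", "delete_record"]},
    },
}


def authorize_tool(
    tool: str,
    role: str,
    metadata: Optional[dict] = None,
    cfg: dict = TOOL_AUTH,
) -> Tuple[bool, str]:
    """Return (allowed, reason) for a tool call. Hypothetical check order."""
    if tool in cfg["denied_tools"]:
        return False, "denied for all roles"
    allowed = cfg["roles"].get(role, {}).get("allowed_tools", [])
    if tool not in allowed:
        return False, "not in role allowlist"
    if tool in cfg["sensitive_tools"] and not (metadata or {}).get("approved", False):
        return False, "sensitive tool requires approval"
    return True, "ok"


print(authorize_tool("delete_record", "engineer"))
print(authorize_tool("delete_record", "engineer", metadata={"approved": True}))
```

In this sketch an unknown role has an empty allowlist, so every tool is rejected for it.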

Rate Limiting

yaml

guardrails:
  rate_limit:
    enabled: true
    max_requests_per_session: 5
    max_tool_calls_per_session: 50
    max_tool_execution_seconds: 60

Rate limiting protects against excessive requests, tool call loops, and runaway execution time. It is enforced via session hooks in agent frameworks. See Rate Limiting for details.
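
A minimal sketch of per-session counters for the three limits follows. The SessionLimiter class and its method names are hypothetical and only illustrate the bookkeeping; the library's session hooks may track usage differently.

```python
from dataclasses import dataclass


@dataclass
class SessionLimiter:
    """Hypothetical per-session counters for the three documented limits."""

    max_requests_per_session: int = 5
    max_tool_calls_per_session: int = 50
    max_tool_execution_seconds: float = 60
    requests: int = 0
    tool_calls: int = 0
    tool_seconds: float = 0.0

    def record_request(self) -> bool:
        """Count one LLM request; return False once the limit is exceeded."""
        self.requests += 1
        return self.requests <= self.max_requests_per_session

    def record_tool_call(self, elapsed_seconds: float) -> bool:
        """Count one tool call and its duration; return False when either limit is exceeded."""
        self.tool_calls += 1
        self.tool_seconds += elapsed_seconds
        return (
            self.tool_calls <= self.max_tool_calls_per_session
            and self.tool_seconds <= self.max_tool_execution_seconds
        )


limiter = SessionLimiter()
outcomes = [limiter.record_request() for _ in range(6)]
print(outcomes)
```

With max_requests_per_session set to 5, the sixth request in a session is the first one rejected.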

Data Exfiltration Detection

This output-only guardrail scores LLM responses for credential leaks, bulk sensitive data, and export-style payloads.

yaml

guardrails:
  output_checks: true

  data_exfiltration:
    enabled: true
    output_checks: true
    action_thresholds:
      warn: 20
      block: 80
    mask_token: "[REDACTED]"
    detectors:
      secrets: true
      bulk_sensitive: true
      abnormal_patterns: true
    use_detect_secrets_plugin: true
    bulk_sensitive:
      threshold: 20
      score_per_hit: 2
      max_score: 40

This guardrail runs on model output only. When the score reaches the warn threshold, sensitive spans are masked; when it reaches the block threshold, the response is rejected. See Data Exfiltration Detection for details.
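
The mapping from score to action can be sketched as below. Both helpers are hypothetical; the sketch assumes thresholds are inclusive minimums, as the option descriptions suggest, and that detected spans are available as (start, end) character offsets.

```python
from typing import List, Tuple


def exfiltration_action(score: int, warn: int = 20, block: int = 80) -> str:
    """Map a risk score to "allow", "mask", or "block"."""
    if score >= block:
        return "block"
    if score >= warn:
        return "mask"
    return "allow"


def mask_spans(text: str, spans: List[Tuple[int, int]], mask_token: str = "[REDACTED]") -> str:
    """Replace each (start, end) span with mask_token.

    Spans are processed right to left so earlier offsets stay valid.
    """
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + mask_token + text[end:]
    return text


response = "key=abc123 ok"
if exfiltration_action(35) == "mask":
    response = mask_spans(response, [(4, 10)])
print(response)
```

A score between the two thresholds leaves the response deliverable with its sensitive spans replaced by mask_token.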

ARMS Storage

Persist guardrail run data through the ARMS Backend to MongoDB, DynamoDB, or ClickHouse (auto-selected by your deployment).

yaml

guardrails:
  storage:
    enabled: true
    project: my-app
    store_raw_text: true
    fail_soft: true
    unique_run_per_project: false
    arms_correlation: true

Backend credentials are read from API_BASE_URL and ELSAI_ARMS_API_KEY (same as ARMS). Link runs with link_arms() or GUARDRAILS_ARMS_RUN_ID / GUARDRAILS_ARMS_PROJECT_ID. See ARMS Storage for details.
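
The fallback from configuration overrides to environment variables might look like the following sketch. The helper resolve_storage_credentials is hypothetical. It assumes that the storage.api_base_url and storage.api_key overrides listed under ARMS Storage Options take precedence over API_BASE_URL and ELSAI_ARMS_API_KEY when set.

```python
import os
from typing import Mapping, Optional


def resolve_storage_credentials(
    storage_cfg: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Prefer explicit config values; otherwise fall back to the documented environment variables."""
    env = os.environ if env is None else env
    return {
        "api_base_url": storage_cfg.get("api_base_url") or env.get("API_BASE_URL"),
        "api_key": storage_cfg.get("api_key") or env.get("ELSAI_ARMS_API_KEY"),
    }


# Placeholder values for illustration only.
example_env = {"API_BASE_URL": "https://arms.example.invalid", "ELSAI_ARMS_API_KEY": "env-key"}
print(resolve_storage_credentials({"api_key": "config-key"}, example_env))
```

Passing the environment as an argument keeps the helper easy to test without touching real credentials.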

Configuration Reference

Option	Type	Default	Description
`input_checks`	bool	`true`	Enable input validation
`output_checks`	bool	`true`	Enable output validation
`check_toxicity`	bool	`true`	Enable toxicity detection
`check_sensitive_data`	bool	`true`	Enable sensitive data detection
`check_semantic`	bool	`true`	Enable content classification
`check_off_topic`	bool	`false`	Enable off-topic detection
`check_sql_syntax`	bool	`false`	Enable SQL syntax validation
`toxicity_threshold`	float	`0.7`	Threshold for blocking toxic content (0.0-1.0)
`block_toxic`	bool	`true`	Block toxic content
`block_sensitive_data`	bool	`true`	Block sensitive data
`block_off_topic`	bool	`true`	Block off-topic inputs
`allowed_topics`	list	`None`	List of allowed topics (required for off-topic detection)
`sql_dialect`	str	`"mysql"`	SQL dialect for syntax validation
`pii`	dict	—	PII/PHI detection and data masking policy (see below)
`token_budget`	dict	—	Token budget enforcement policy (see below)
`tool_authorization`	dict	—	Tool access control policy (see below)
`rate_limit`	dict	—	Rate limiting and abuse prevention policy (see below)
`data_exfiltration`	dict	—	Output data exfiltration detection policy (see below)
`storage`	dict	—	ARMS Backend persistence policy (see below)

PII/PHI Detection Options

Option	Type	Default	Description
`pii.enabled`	bool	`false`	Enable PII/PHI detection
`pii.input_checks`	bool	`true`	Run detection on user input
`pii.output_checks`	bool	`true`	Run detection on model output
`pii.language`	str	`"en"`	Language code for entity analysis
`pii.default_confidence_threshold`	float	`0.5`	Global minimum confidence for entity recognition
`pii.below_threshold_action`	str	`"flag"`	Action for entities below their threshold
`pii.default_action`	str	`"flag"`	Default action when no entity policy is defined
`pii.default_mask`	bool	`true`	Mask detected values by default
`pii.enable_phi_detection`	bool	`true`	Enable regex-based PHI pattern detection
`pii.entity_types`	list	—	Entity types to detect
`pii.entity_thresholds`	dict	—	Per-entity confidence overrides
`pii.entity_policies`	dict	—	Per-entity action and masking rules

Entity Policy Options

Each key under entity_policies is an entity type name. Each policy supports:

Field	Type	Values	Description
`action`	str	`flag`, `block`, `review`, `pass`	Policy action applied when the entity is detected
`mask`	bool	`true`, `false`	Whether to mask the detected value before downstream processing

Example entity policies from config.yml:

Entity	Action	Mask	Behavior
`CREDIT_CARD`	`block`	`true`	Block request and mask value
`US_SSN`	`block`	`true`	Block request and mask value
`EMAIL_ADDRESS`	`flag`	`true`	Flag detection and mask value
`PHONE_NUMBER`	`flag`	`true`	Flag detection and mask value
`PHI_MRN`	`review`	`true`	Mark for review and mask value
`PHI_PATIENT_ID`	`review`	`true`	Mark for review and mask value

Token Budget Options

Option	Type	Default	Description
`token_budget.enabled`	bool	`false`	Enable token budget enforcement
`token_budget.input_checks`	bool	`true`	Enforce limits on incoming requests
`token_budget.output_checks`	bool	`true`	Enforce limits on model output
`token_budget.max_request_tokens`	int	—	Maximum tokens for a single request context
`token_budget.max_run_tokens`	int	—	Maximum total tokens for an entire run
`token_budget.reserved_output_tokens`	int	—	Tokens reserved for the model response
`token_budget.block_on_exceeded`	bool	`true`	Block when budget exceeded; `false` emits warning only

Tool Authorization Options

Option	Type	Default	Description
`tool_authorization.enabled`	bool	`false`	Enable tool authorization
`tool_authorization.denied_tools`	list	—	Tools blocked for all roles
`tool_authorization.sensitive_tools`	list	—	Tools requiring `metadata={"approved": true}`
`tool_authorization.roles`	dict	—	Role definitions with `allowed_tools` lists

Rate Limiting Options

Option	Type	Default	Description
`rate_limit.enabled`	bool	`false`	Enable rate limiting
`rate_limit.max_requests_per_session`	int	—	Maximum LLM requests per session
`rate_limit.max_tool_calls_per_session`	int	—	Maximum tool invocations per session
`rate_limit.max_tool_execution_seconds`	int	—	Maximum cumulative tool execution time (seconds)

Data Exfiltration Options

Option	Type	Default	Description
`data_exfiltration.enabled`	bool	`false`	Enable output exfiltration detection
`data_exfiltration.output_checks`	bool	inherits `output_checks`	Run on model output
`data_exfiltration.action_thresholds.warn`	int	`20`	Minimum score to mask sensitive spans
`data_exfiltration.action_thresholds.block`	int	`80`	Minimum score to block the response
`data_exfiltration.mask_token`	str	`"[REDACTED]"`	Replacement for masked spans
`data_exfiltration.detectors.secrets`	bool	`true`	Enable secret/credential detection
`data_exfiltration.detectors.bulk_sensitive`	bool	`true`	Enable bulk identifier detection
`data_exfiltration.detectors.abnormal_patterns`	bool	`true`	Enable export-style pattern detection
`data_exfiltration.use_detect_secrets_plugin`	bool	`true`	Use optional `detect-secrets` package
`data_exfiltration.bulk_sensitive.threshold`	int	`20`	Minimum matches to trigger bulk detector

ARMS Storage Options

Option	Type	Default	Description
`storage.enabled`	bool	`false`	Enable persistence via ARMS Backend
`storage.project`	str	`"default"`	Logical project name on run documents
`storage.store_raw_text`	bool	`true`	Store full text; `false` uses SHA-256 digests
`storage.fail_soft`	bool	`true`	Log on write failure instead of raising
`storage.unique_run_per_project`	bool	`false`	Backend upsert behavior for run ids
`storage.arms_correlation`	bool	`true`	Auto-link ARMS `run_id` / `project_id`
`storage.api_base_url`	str	env `API_BASE_URL`	ARMS Backend URL (optional override)
`storage.api_key`	str	env `ELSAI_ARMS_API_KEY`	Backend API key (optional override)
`storage.master_key`	str	env `ARMS_MASTER_KEY`	Optional master key header
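
The effect of store_raw_text can be illustrated with Python's standard hashlib module. This is a sketch, not the library's storage code: text_for_storage is a hypothetical helper, and UTF-8 encoding with a hexadecimal digest is an assumption about how the SHA-256 value is represented.

```python
import hashlib


def text_for_storage(text: str, store_raw_text: bool = True) -> str:
    """Return the raw text, or its SHA-256 hex digest when raw storage is disabled."""
    if store_raw_text:
        return text
    # Assumption: the text is UTF-8 encoded and the digest stored as hex.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


raw = text_for_storage("my card number is ...")
digest = text_for_storage("my card number is ...", store_raw_text=False)
print(len(raw), len(digest))
```

A digest lets identical inputs be correlated across runs without keeping the original text in the backend.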

Use Cases

Strict Mode

Block all potentially problematic content:

yaml

guardrails:
  input_checks: true
  output_checks: true
  check_toxicity: true
  check_sensitive_data: true
  check_semantic: true
  toxicity_threshold: 0.5  # Lower threshold = more strict
  block_toxic: true
  block_sensitive_data: true
  pii:
    enabled: true
    default_action: block
    default_mask: true

Permissive Mode

Only block clearly problematic content:

yaml

guardrails:
  input_checks: true
  output_checks: true
  check_toxicity: true
  check_sensitive_data: false  # Allow sensitive data
  check_semantic: true
  toxicity_threshold: 0.9  # Higher threshold = more permissive
  block_toxic: true
  block_sensitive_data: false
  pii:
    enabled: true
    default_action: flag
    default_mask: false

Input-Only Mode

Only validate input, not output:

yaml

guardrails:
  input_checks: true
  output_checks: false
  check_toxicity: true
  check_sensitive_data: true
  check_semantic: true
  pii:
    enabled: true
    input_checks: true
    output_checks: false
  token_budget:
    enabled: true
    input_checks: true
    output_checks: false

Next Steps

LLM Configuration - Configure your LLM provider
YAML Configuration - Complete configuration examples
