Guardrails Configuration
Configure safety checks and validation rules for your application.
Reference Configuration
The following guardrail policy matches the project config.yml and serves as the canonical reference for all available options:
yaml
# Guardrail policy configuration
guardrails:
  input_checks: true
  output_checks: true
  check_toxicity: true
  check_sensitive_data: true
  check_semantic: true
  toxicity_threshold: 0.7
  block_toxic: true
  block_sensitive_data: true
  # PII/PHI detection policy
  pii:
    enabled: true
    input_checks: true
    output_checks: true
    language: en
    default_confidence_threshold: 0.5
    below_threshold_action: flag
    default_action: flag
    default_mask: true
    enable_phi_detection: true
    entity_types:
      - PERSON
      - LOCATION
      - EMAIL_ADDRESS
      - PHONE_NUMBER
      - CREDIT_CARD
      - NRP
      - MEDICAL_LICENSE
      - US_SSN
      - IBAN_CODE
      - IP_ADDRESS
    entity_thresholds:
      PERSON: 0.7
    entity_policies:
      CREDIT_CARD:
        action: block
        mask: true
      US_SSN:
        action: block
        mask: true
      EMAIL_ADDRESS:
        action: flag
        mask: true
      PHONE_NUMBER:
        action: flag
        mask: true
      PHI_MRN:
        action: review
        mask: true
      PHI_PATIENT_ID:
        action: review
        mask: true
  # Token budget enforcement policy
  token_budget:
    enabled: true
    input_checks: true
    output_checks: true
    max_request_tokens: 50
    max_run_tokens: 80
    reserved_output_tokens: 10
    block_on_exceeded: true
  # Tool authorization policy
  tool_authorization:
    enabled: true
    denied_tools:
      - execute_shell
    sensitive_tools:
      - delete_record
    roles:
      analyst:
        allowed_tools:
          - search_web
          - calculator
      engineer:
        allowed_tools:
          - search_web
          - calculator
          - delete_record
  # Rate limiting policy
  rate_limit:
    enabled: true
    max_requests_per_session: 5
    max_tool_calls_per_session: 50
    max_tool_execution_seconds: 60
  # Data exfiltration detection (output only)
  data_exfiltration:
    enabled: true
    output_checks: true
    action_thresholds:
      warn: 20
      block: 80
    detectors:
      secrets: true
      bulk_sensitive: true
      abnormal_patterns: true
  # ARMS Backend storage (MongoDB / DynamoDB / ClickHouse via Backend)
  storage:
    enabled: true
    project: my-app
    store_raw_text: true
    fail_soft: true
    arms_correlation: true
Configuration Options
Basic Settings
yaml
guardrails:
  input_checks: true # Enable input validation
  output_checks: true # Enable output validation
Check Types
yaml
guardrails:
  check_toxicity: true # Enable toxicity detection
  check_sensitive_data: true # Enable sensitive data detection
  check_semantic: true # Enable content classification
  check_off_topic: false # Enable off-topic detection
  check_sql_syntax: false # Enable SQL syntax validation
Toxicity Settings
yaml
guardrails:
  check_toxicity: true
  toxicity_threshold: 0.7 # Threshold for blocking (0.0-1.0)
  block_toxic: true # Block toxic content when detected
Toxicity Threshold: Content with toxicity confidence above this threshold will be blocked if block_toxic is enabled.
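The threshold rule can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the package's implementation: the function name toxicity_decision is hypothetical, and returning "flag" when block_toxic is disabled is an assumption, since the page does not say what happens in that case.

```python
def toxicity_decision(score: float, threshold: float = 0.7, block_toxic: bool = True) -> str:
    """Map a toxicity confidence score (0.0-1.0) to an action.

    Hypothetical helper: content strictly above the threshold is blocked
    when block_toxic is enabled. The "flag" outcome is an assumption.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be within 0.0-1.0")
    if score > threshold:
        return "block" if block_toxic else "flag"
    return "pass"


# The same score is blocked under a strict threshold and passes a permissive one.
strict = toxicity_decision(0.6, threshold=0.5)
permissive = toxicity_decision(0.6, threshold=0.9)
```

Lowering the threshold widens the range of scores that are blocked, which is why the strict configuration uses 0.5 and the permissive one uses 0.9.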
Sensitive Data Settings
yaml
guardrails:
  check_sensitive_data: true
  block_sensitive_data: true # Block content containing sensitive data
Detected sensitive data types include:
- Email addresses
- Phone numbers
- Credit card numbers
- Social security numbers
- IP addresses
- And more...
Content Classification
yaml
guardrails:
  check_semantic: true # Enable content classification
Content classification detects:
- Jailbreak attempts: Attempts to bypass safety restrictions
- Malicious content: Requests for harmful activities
- Prompt injection: Attempts to inject malicious instructions
- Malicious code injection: Code injection attempts
Off-Topic Detection
yaml
guardrails:
  check_off_topic: true
  block_off_topic: true
  allowed_topics:
    - name: "Product Information"
      description: "Questions about product features, specifications, and pricing"
    - name: "Technical Support"
      description: "Help with installation, troubleshooting, and technical issues"
Off-topic detection helps keep conversations focused on allowed subjects. See Off-Topic Detection for details.
SQL Syntax Validation
yaml
guardrails:
  check_sql_syntax: true
  sql_dialect: "mysql" # postgresql, mysql, sqlserver, sqlite, mongodb, oracle, redshift
SQL syntax validation checks SQL queries for syntax errors. Supported dialects:
- postgresql: PostgreSQL
- mysql: MySQL/MariaDB
- sqlserver: Microsoft SQL Server
- sqlite: SQLite
- mongodb: MongoDB
- oracle: Oracle Database
- redshift: Amazon Redshift
See SQL Syntax Validation for details.
PII/PHI Detection and Data Masking
Requires the spaCy model after package installation:
bash
python -m spacy download en_core_web_lg
yaml
guardrails:
  pii:
    enabled: true
    input_checks: true
    output_checks: true
    language: en
    default_confidence_threshold: 0.5
    below_threshold_action: flag
    default_action: flag
    default_mask: true
    enable_phi_detection: true
    entity_types:
      - PERSON
      - LOCATION
      - EMAIL_ADDRESS
      - PHONE_NUMBER
      - CREDIT_CARD
      - NRP
      - MEDICAL_LICENSE
      - US_SSN
      - IBAN_CODE
      - IP_ADDRESS
    entity_thresholds:
      PERSON: 0.7
    entity_policies:
      CREDIT_CARD:
        action: block
        mask: true
      US_SSN:
        action: block
        mask: true
      EMAIL_ADDRESS:
        action: flag
        mask: true
      PHONE_NUMBER:
        action: flag
        mask: true
      PHI_MRN:
        action: review
        mask: true
      PHI_PATIENT_ID:
        action: review
        mask: true
Supported entity types:
| Entity Type | Description |
|---|---|
| PERSON | Personal names |
| LOCATION | Geographic locations |
| EMAIL_ADDRESS | Email addresses |
| PHONE_NUMBER | Phone numbers |
| CREDIT_CARD | Credit card numbers |
| NRP | Nationalities, religious, or political groups |
| MEDICAL_LICENSE | Medical license numbers |
| US_SSN | U.S. Social Security numbers |
| IBAN_CODE | International bank account numbers |
| IP_ADDRESS | IP addresses |
| PHI_MRN | Medical record numbers (regex-based PHI detection) |
| PHI_PATIENT_ID | Patient identifiers (regex-based PHI detection) |
PII/PHI detection identifies sensitive entities using Microsoft Presidio Analyzer, applies configurable policy actions (flag, block, review, pass), supports data masking, and logs detection events. See PII/PHI Detection for details.
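How the thresholds and per-entity policies might combine for a single detection can be sketched as follows. The PiiPolicy class and resolve_policy function are hypothetical names used only for illustration; the precedence shown (per-entity threshold first, then the global threshold, then the entity policy or default action) is an assumption inferred from the option descriptions, and applying the mask setting to below-threshold detections is also assumed.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class PiiPolicy:
    """Hypothetical container mirroring the pii options on this page."""
    default_confidence_threshold: float = 0.5
    below_threshold_action: str = "flag"
    default_action: str = "flag"
    default_mask: bool = True
    entity_thresholds: Dict[str, float] = field(default_factory=dict)
    entity_policies: Dict[str, dict] = field(default_factory=dict)


def resolve_policy(policy: PiiPolicy, entity_type: str, confidence: float) -> Tuple[str, bool]:
    """Return (action, mask) for one detected entity (assumed precedence)."""
    # A per-entity threshold overrides the global default.
    threshold = policy.entity_thresholds.get(entity_type, policy.default_confidence_threshold)
    rule = policy.entity_policies.get(entity_type, {})
    mask = rule.get("mask", policy.default_mask)
    if confidence < threshold:
        # Low-confidence detections get the below-threshold action.
        return policy.below_threshold_action, mask
    return rule.get("action", policy.default_action), mask


POLICY = PiiPolicy(
    entity_thresholds={"PERSON": 0.7},
    entity_policies={
        "CREDIT_CARD": {"action": "block", "mask": True},
        "PHI_MRN": {"action": "review", "mask": True},
    },
)

card = resolve_policy(POLICY, "CREDIT_CARD", 0.9)   # has an explicit policy
person = resolve_policy(POLICY, "PERSON", 0.6)      # below the PERSON override of 0.7
location = resolve_policy(POLICY, "LOCATION", 0.8)  # no policy, falls back to defaults
```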
Token Budget Enforcement
yaml
guardrails:
  token_budget:
    enabled: true
    input_checks: true
    output_checks: true
    max_request_tokens: 50
    max_run_tokens: 80
    reserved_output_tokens: 10
    block_on_exceeded: true # true = block; false = warn only
Token budget enforcement computes token usage across the full request context and rejects or warns on oversized requests. See Token Budget Enforcement for details.
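A rough sketch of the budget arithmetic, under stated assumptions: token counts are supplied by the caller (the tokenizer is not specified on this page), reserved_output_tokens is assumed to be subtracted from max_request_tokens to get the usable input budget, and the function name check_budget is hypothetical.

```python
def check_budget(
    request_tokens: int,
    run_tokens_used: int,
    *,
    max_request_tokens: int = 50,
    max_run_tokens: int = 80,
    reserved_output_tokens: int = 10,
    block_on_exceeded: bool = True,
) -> str:
    """Return "pass", "warn", or "block" for one request (illustrative only)."""
    # Assumption: output reservation reduces the room left for the request.
    request_limit = max_request_tokens - reserved_output_tokens
    over_request = request_tokens > request_limit
    # The run budget accumulates across every request in the run.
    over_run = run_tokens_used + request_tokens > max_run_tokens
    if over_request or over_run:
        return "block" if block_on_exceeded else "warn"
    return "pass"


within = check_budget(40, 0)        # exactly at the assumed 40-token input limit
too_large = check_budget(41, 0)     # one token over the request limit
run_spent = check_budget(30, 60)    # request fits, but the run total would reach 90
```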
Tool Authorization
yaml
guardrails:
  tool_authorization:
    enabled: true
    denied_tools:
      - execute_shell
    sensitive_tools:
      - delete_record
    roles:
      analyst:
        allowed_tools:
          - search_web
          - calculator
      engineer:
        allowed_tools:
          - search_web
          - calculator
          - delete_record
Tool authorization restricts agent tool access through role-based allowlists and global denylists. Enforced via before_tool_call() hooks in agent frameworks. See Tool Authorization for details.
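The decision a before_tool_call() hook has to make can be sketched like this. The authorize function and its return shape are hypothetical, and the evaluation order (global denylist, then role allowlist, then the approval check for sensitive tools) is an assumption rather than documented behavior.

```python
from typing import Optional, Tuple

# Mirrors the tool_authorization example above as a plain dict.
POLICY = {
    "denied_tools": ["execute_shell"],
    "sensitive_tools": ["delete_record"],
    "roles": {
        "analyst": {"allowed_tools": ["search_web", "calculator"]},
        "engineer": {"allowed_tools": ["search_web", "calculator", "delete_record"]},
    },
}


def authorize(policy: dict, role: str, tool: str, metadata: Optional[dict] = None) -> Tuple[bool, str]:
    """Return (allowed, reason) for a tool call (illustrative only)."""
    # 1. Globally denied tools are rejected for every role.
    if tool in policy["denied_tools"]:
        return False, "tool is globally denied"
    # 2. The role must list the tool; unknown roles get an empty allowlist.
    allowed = policy["roles"].get(role, {}).get("allowed_tools", [])
    if tool not in allowed:
        return False, "tool is not in the role allowlist"
    # 3. Sensitive tools additionally need explicit approval in the metadata.
    if tool in policy["sensitive_tools"] and (metadata or {}).get("approved") is not True:
        return False, "sensitive tool requires approval"
    return True, "ok"


unapproved = authorize(POLICY, "engineer", "delete_record")
approved = authorize(POLICY, "engineer", "delete_record", {"approved": True})
```

Checking the denylist first means a tool listed in denied_tools stays blocked even if a role's allowlist includes it by mistake.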
Rate Limiting
yaml
guardrails:
  rate_limit:
    enabled: true
    max_requests_per_session: 5
    max_tool_calls_per_session: 50
    max_tool_execution_seconds: 60
Rate limiting protects against excessive requests, tool call loops, and runaway execution time. Enforced via session hooks in agent frameworks. See Rate Limiting for details.
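The three limits can be pictured as per-session counters. The SessionLimiter class below is a hypothetical, in-memory sketch of that bookkeeping, not the package's session hook API; it assumes tool execution time is tracked cumulatively and reported by the caller.

```python
from dataclasses import dataclass


@dataclass
class SessionLimiter:
    """Hypothetical per-session counters for the rate_limit options."""
    max_requests_per_session: int = 5
    max_tool_calls_per_session: int = 50
    max_tool_execution_seconds: float = 60
    requests: int = 0
    tool_calls: int = 0
    tool_seconds: float = 0.0

    def record_request(self) -> bool:
        """Count one LLM request; return False once the limit is reached."""
        if self.requests >= self.max_requests_per_session:
            return False
        self.requests += 1
        return True

    def record_tool_call(self, elapsed_seconds: float) -> bool:
        """Count one tool call and its duration; return False if a limit would be exceeded."""
        if self.tool_calls >= self.max_tool_calls_per_session:
            return False
        if self.tool_seconds + elapsed_seconds > self.max_tool_execution_seconds:
            return False
        self.tool_calls += 1
        self.tool_seconds += elapsed_seconds
        return True


limiter = SessionLimiter()
request_results = [limiter.record_request() for _ in range(6)]  # sixth request is refused
```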
Data Exfiltration Detection
Output-only guardrail that scores LLM responses for credential leaks, bulk sensitive data, and export-style payloads.
yaml
guardrails:
  output_checks: true
  data_exfiltration:
    enabled: true
    output_checks: true
    action_thresholds:
      warn: 20
      block: 80
    mask_token: "[REDACTED]"
    detectors:
      secrets: true
      bulk_sensitive: true
      abnormal_patterns: true
    use_detect_secrets_plugin: true
    bulk_sensitive:
      threshold: 20
      score_per_hit: 2
      max_score: 40
Runs on model output only. At the warn threshold, sensitive spans are masked; at the block threshold, the response is rejected. See Data Exfiltration Detection for details.
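The mapping from score to action, and the masking step, can be sketched as follows. Both function names are hypothetical, the score and the sensitive spans are assumed to come from the detectors, and treating each threshold as an inclusive minimum follows the option descriptions on this page.

```python
from typing import List, Tuple


def exfiltration_action(score: int, warn: int = 20, block: int = 80) -> str:
    """Map an exfiltration score to "pass", "mask", or "block" (illustrative only)."""
    if score >= block:
        return "block"
    if score >= warn:
        return "mask"
    return "pass"


def mask_spans(text: str, spans: List[Tuple[int, int]], mask_token: str = "[REDACTED]") -> str:
    """Replace each (start, end) span with mask_token; spans must not overlap."""
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + mask_token + text[end:]
    return text


masked = mask_spans("key=abc123 ok", [(4, 10)])
```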
ARMS Storage
Persist guardrail run data through the ARMS Backend to MongoDB, DynamoDB, or ClickHouse (auto-selected by your deployment).
yaml
guardrails:
  storage:
    enabled: true
    project: my-app
    store_raw_text: true
    fail_soft: true
    unique_run_per_project: false
    arms_correlation: true
Backend credentials are read from API_BASE_URL and ELSAI_ARMS_API_KEY (same as ARMS). Link runs with link_arms() or GUARDRAILS_ARMS_RUN_ID / GUARDRAILS_ARMS_PROJECT_ID. See ARMS Storage for details.
Configuration Reference
| Option | Type | Default | Description |
|---|---|---|---|
| input_checks | bool | true | Enable input validation |
| output_checks | bool | true | Enable output validation |
| check_toxicity | bool | true | Enable toxicity detection |
| check_sensitive_data | bool | true | Enable sensitive data detection |
| check_semantic | bool | true | Enable content classification |
| check_off_topic | bool | false | Enable off-topic detection |
| check_sql_syntax | bool | false | Enable SQL syntax validation |
| toxicity_threshold | float | 0.7 | Threshold for blocking toxic content (0.0-1.0) |
| block_toxic | bool | true | Block toxic content |
| block_sensitive_data | bool | true | Block sensitive data |
| block_off_topic | bool | true | Block off-topic inputs |
| allowed_topics | list | None | List of allowed topics (required for off-topic detection) |
| sql_dialect | str | "mysql" | SQL dialect for syntax validation |
| pii | dict | — | PII/PHI detection and data masking policy (see below) |
| token_budget | dict | — | Token budget enforcement policy (see below) |
| tool_authorization | dict | — | Tool access control policy (see below) |
| rate_limit | dict | — | Rate limiting and abuse prevention policy (see below) |
| data_exfiltration | dict | — | Output data exfiltration detection policy (see below) |
| storage | dict | — | ARMS Backend persistence policy (see below) |
PII/PHI Detection Options
| Option | Type | Default | Description |
|---|---|---|---|
| pii.enabled | bool | false | Enable PII/PHI detection |
| pii.input_checks | bool | true | Run detection on user input |
| pii.output_checks | bool | true | Run detection on model output |
| pii.language | str | "en" | Language code for entity analysis |
| pii.default_confidence_threshold | float | 0.5 | Global minimum confidence for entity recognition |
| pii.below_threshold_action | str | "flag" | Action for entities below their threshold |
| pii.default_action | str | "flag" | Default action when no entity policy is defined |
| pii.default_mask | bool | true | Mask detected values by default |
| pii.enable_phi_detection | bool | true | Enable regex-based PHI pattern detection |
| pii.entity_types | list | — | Entity types to detect |
| pii.entity_thresholds | dict | — | Per-entity confidence overrides |
| pii.entity_policies | dict | — | Per-entity action and masking rules |
Entity Policy Options
Each key under entity_policies is an entity type name. Each policy supports:
| Field | Type | Values | Description |
|---|---|---|---|
| action | str | flag, block, review, pass | Policy action applied when the entity is detected |
| mask | bool | true, false | Whether to mask the detected value before downstream processing |
Example entity policies from config.yml:
| Entity | Action | Mask | Behavior |
|---|---|---|---|
| CREDIT_CARD | block | true | Block request and mask value |
| US_SSN | block | true | Block request and mask value |
| EMAIL_ADDRESS | flag | true | Flag detection and mask value |
| PHONE_NUMBER | flag | true | Flag detection and mask value |
| PHI_MRN | review | true | Mark for review and mask value |
| PHI_PATIENT_ID | review | true | Mark for review and mask value |
Token Budget Options
| Option | Type | Default | Description |
|---|---|---|---|
| token_budget.enabled | bool | false | Enable token budget enforcement |
| token_budget.input_checks | bool | true | Enforce limits on incoming requests |
| token_budget.output_checks | bool | true | Enforce limits on model output |
| token_budget.max_request_tokens | int | — | Maximum tokens for a single request context |
| token_budget.max_run_tokens | int | — | Maximum total tokens for an entire run |
| token_budget.reserved_output_tokens | int | — | Tokens reserved for the model response |
| token_budget.block_on_exceeded | bool | true | Block when budget exceeded; false emits warning only |
Tool Authorization Options
| Option | Type | Default | Description |
|---|---|---|---|
| tool_authorization.enabled | bool | false | Enable tool authorization |
| tool_authorization.denied_tools | list | — | Tools blocked for all roles |
| tool_authorization.sensitive_tools | list | — | Tools requiring metadata={"approved": true} |
| tool_authorization.roles | dict | — | Role definitions with allowed_tools lists |
Rate Limiting Options
| Option | Type | Default | Description |
|---|---|---|---|
| rate_limit.enabled | bool | false | Enable rate limiting |
| rate_limit.max_requests_per_session | int | — | Maximum LLM requests per session |
| rate_limit.max_tool_calls_per_session | int | — | Maximum tool invocations per session |
| rate_limit.max_tool_execution_seconds | int | — | Maximum cumulative tool execution time (seconds) |
Data Exfiltration Options
| Option | Type | Default | Description |
|---|---|---|---|
| data_exfiltration.enabled | bool | false | Enable output exfiltration detection |
| data_exfiltration.output_checks | bool | inherits output_checks | Run on model output |
| data_exfiltration.action_thresholds.warn | int | 20 | Minimum score to mask sensitive spans |
| data_exfiltration.action_thresholds.block | int | 80 | Minimum score to block the response |
| data_exfiltration.mask_token | str | "[REDACTED]" | Replacement for masked spans |
| data_exfiltration.detectors.secrets | bool | true | Enable secret/credential detection |
| data_exfiltration.detectors.bulk_sensitive | bool | true | Enable bulk identifier detection |
| data_exfiltration.detectors.abnormal_patterns | bool | true | Enable export-style pattern detection |
| data_exfiltration.use_detect_secrets_plugin | bool | true | Use optional detect-secrets package |
| data_exfiltration.bulk_sensitive.threshold | int | 20 | Minimum matches to trigger bulk detector |
ARMS Storage Options
| Option | Type | Default | Description |
|---|---|---|---|
| storage.enabled | bool | false | Enable persistence via ARMS Backend |
| storage.project | str | "default" | Logical project name on run documents |
| storage.store_raw_text | bool | true | Store full text; false uses SHA-256 digests |
| storage.fail_soft | bool | true | Log on write failure instead of raising |
| storage.unique_run_per_project | bool | false | Backend upsert behavior for run ids |
| storage.arms_correlation | bool | true | Auto-link ARMS run_id / project_id |
| storage.api_base_url | str | env API_BASE_URL | ARMS Backend URL (optional override) |
| storage.api_key | str | env ELSAI_ARMS_API_KEY | Backend API key (optional override) |
| storage.master_key | str | env ARMS_MASTER_KEY | Optional master key header |
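The effect of store_raw_text can be illustrated with Python's standard hashlib. This is a sketch only: the function name storable_text and the field names text and text_sha256 are hypothetical, and UTF-8 encoding before hashing is an assumption; the actual document layout written to the ARMS Backend is not described on this page.

```python
import hashlib


def storable_text(text: str, store_raw_text: bool = True) -> dict:
    """Return the payload fragment to persist (hypothetical field names)."""
    if store_raw_text:
        return {"text": text}
    # With raw storage disabled, keep only a SHA-256 digest of the text.
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return {"text_sha256": digest}


raw = storable_text("my card is 4111 1111 1111 1111")
hashed = storable_text("my card is 4111 1111 1111 1111", store_raw_text=False)
```

A digest still lets identical inputs be correlated across runs without keeping the sensitive text itself.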
Use Cases
Strict Mode
Block all potentially problematic content:
yaml
guardrails:
  input_checks: true
  output_checks: true
  check_toxicity: true
  check_sensitive_data: true
  check_semantic: true
  toxicity_threshold: 0.5 # Lower threshold = more strict
  block_toxic: true
  block_sensitive_data: true
  pii:
    enabled: true
    default_action: block
    default_mask: true
Permissive Mode
Only block clearly problematic content:
yaml
guardrails:
  input_checks: true
  output_checks: true
  check_toxicity: true
  check_sensitive_data: false # Allow sensitive data
  check_semantic: true
  toxicity_threshold: 0.9 # Higher threshold = more permissive
  block_toxic: true
  block_sensitive_data: false
  pii:
    enabled: true
    default_action: flag
    default_mask: false
Input-Only Mode
Only validate input, not output:
yaml
guardrails:
  input_checks: true
  output_checks: false
  check_toxicity: true
  check_sensitive_data: true
  check_semantic: true
  pii:
    enabled: true
    input_checks: true
    output_checks: false
  token_budget:
    enabled: true
    input_checks: true
    output_checks: false
Next Steps
- LLM Configuration - Configure your LLM provider
- YAML Configuration - Complete configuration examples