Appearance
Architecture
Overview of elsai Guardrails architecture and design.
System Architecture
elsai Guardrails follows a layered architecture:
User Input
↓
Input Rails (Validation)
↓
LLM Processing
↓
Output Rails (Validation)
↓
User Response
↓ (optional)
ARMS Backend StorageComponents
GuardrailSystem
Core component that performs safety checks:
- Toxicity detection
- Sensitive data detection
- Content classification
- Data exfiltration detection (output)
- Agent hooks (tool authorization, rate limiting)
- ARMS storage lifecycle (
link_arms,begin_run,end_run)
LLMRails
High-level component that integrates:
- LLM configuration and invocation
- Input validation
- Output validation
- Result aggregation
- Automatic storage hook wiring when enabled
Configuration System
YAML-based configuration for:
- LLM settings
- Guardrail behavior
- Thresholds and rules
- Storage and exfiltration policies
Data Flow
Input Processing
- User sends input
- Input rails validate content
- If validation fails → Block and return error
- If validation passes → Proceed to LLM
LLM Processing
- Format messages for LLM
- Invoke LLM API
- Receive response
Output Processing
- LLM generates response
- Output rails validate content (toxicity, PII, exfiltration, etc.)
- If validation fails → Block or mask and return error
- If validation passes → Return to user
- When storage is enabled → Buffered events flush to ARMS Backend on
end_run()
Persistence Layer (ARMS Storage)
When guardrails.storage.enabled: true:
GuardrailSystem / LLMRails
↓
GuardrailsStorageHook (in-memory buffer)
↓
BackendGuardrailSink → GET /api/v1/db_type
↓
POST /api/v1/guardrails/{mongodb|dynamodb|clickhouse}
↓
ARMS Backend → configured databaseGuardrail Checks
Toxicity Detection
- Uses remote API service
- Classifies content as toxic/offensive/non-toxic
- Configurable threshold
Sensitive Data Detection
- Uses BERT-based model
- Detects various PII types
- Configurable blocking
- Supports large text processing (v0.1.2+)
PII/PHI Detection and Data Masking
- Uses Microsoft Presidio Analyzer for entity-based detection
- Configurable entity types, confidence thresholds, and policy actions
- Data masking and regex-based PHI pattern detection
- Audit logging with entity type, confidence score, action, session ID, and timestamp
Token Budget Enforcement
- Computes token usage across full request context
- Configurable per-request and per-run limits
- Optional
block_on_exceededfor block vs warn behavior - Rejects or warns on oversized requests before LLM processing
Tool Authorization
- Role-based tool allowlists and global denylists
- Sensitive tool gating with approval metadata
- Enforced via
before_tool_call()agent hooks
Rate Limiting
- Per-session request and tool call quotas
- Cumulative tool execution time limits
- Enforced via
before_request(),check_tool_call_limit(), and session tracking hooks
Data Exfiltration Detection
- Output-only guardrail for credential leaks and bulk data exports
- Three detectors: secrets, bulk sensitive data, abnormal output patterns
- Risk scoring with warn (mask) and block actions
- Integrated with
check_output()andLLMRails.generate()
ARMS Storage
- Persists guardrail run data via the ARMS Backend API
- Automatic routing to MongoDB, DynamoDB, or ClickHouse
- Buffers checks, generate results, tool auth, and rate-limit events per run
- ARMS correlation via
link_arms(),link_run_context(), or environment variables
Content Classification
- Uses semantic routing
- Detects jailbreak, malicious, injection attempts
- Requires an embedding encoder (configurable via ENCODER_TYPE)
LLM Integration
Supports multiple providers through unified interface:
- OpenAI
- Azure OpenAI
- Anthropic
- Gemini
- AWS Bedrock
Configuration
Flexible configuration through:
- YAML files
- YAML strings
- Programmatic configuration
Next Steps
- FAQ - Common questions
- ARMS Storage - Persistence guide
- Data Exfiltration Detection - Output leak prevention