User Guide

Complete guide to implementing and using elsai ARMS SaaS in your AI projects

Getting Started

Access arms.elsaifoundry.ai and create your account.

Create or join an organization

Set up your team workspace and generate an API key.

Implement ARMS

Add monitoring decorators and callbacks to your application code (see sections below).

View the dashboard

Monitor runs and traces in the cloud dashboard.

Sample code

Tutorial: end-to-end examples
Agent Monitoring: LangChain and LangGraph callback guide

Core Functionalities

Import and Initialization

Import the Library

python

from elsai_arms.elsai_arms import ElsaiARMS

Initialize ARMS

python

arms = ElsaiARMS('project_name')

This sets up the monitoring environment for a new or existing project. It will:

Check if the project exists
Create a new project if needed
Load the latest project session data

LLM Monitoring

LLM Interaction Monitoring

Capture comprehensive metrics for each LLM call including latency, token usage, cost, and governance metrics.

Basic Metrics

Model
LLM Provider
Input Tokens
Output Tokens
Total Tokens

Performance Metrics

Latency (ms)
Tokens per Second
Output Throughput
Total Throughput
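
The performance metrics listed here are related by simple arithmetic. The following is a minimal sketch, assuming that output throughput means output tokens per second and total throughput means input plus output tokens per second. The exact definitions ARMS uses are not stated in this guide, and the function name `throughput_metrics` is hypothetical.

```python
def throughput_metrics(input_tokens: int, output_tokens: int, latency_ms: float) -> dict:
    """Derive per-second rates from token counts and latency (assumed definitions)."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    seconds = latency_ms / 1000.0
    total_tokens = input_tokens + output_tokens
    return {
        "total_tokens": total_tokens,
        # Assumption: output throughput = output tokens generated per second.
        "output_throughput": output_tokens / seconds,
        # Assumption: total throughput = all tokens (input + output) per second.
        "total_throughput": total_tokens / seconds,
    }


metrics = throughput_metrics(input_tokens=120, output_tokens=480, latency_ms=2000)
```

With these inputs the call took two seconds, so the sketch reports 240 output tokens per second and 300 total tokens per second.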

Content and Cost

Prompt
Response
Relevance Score
Cost

Governance and Safety Metrics

Content Safety

Hate Speech: Risk assessment (Low/Medium/High)
Violence: Risk assessment (Low/Medium/High)
Overall Safety: Comprehensive rating

Prompt Security

Injection Detection: Prompt manipulation risk
Integrity Check: Prompt validation status

Quality Assessment

Response Quality: Overall assessment score
Relevance: Query-response alignment

Implementation Example

python

@arms.monitor_llm_call
def get_response(prompt: str):
    return llm.invoke(prompt)
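
To make the decorator pattern concrete, here is a rough sketch of how a monitoring decorator can record latency and status around a function call. It is not the ARMS implementation: `monitor_call` and the in-memory `records` list are hypothetical stand-ins, and the decorated function fakes the LLM call so the example runs without any SDK.

```python
import functools
import time

records = []  # hypothetical in-memory sink, used here only for illustration


def monitor_call(func):
    """Record latency and status for each call to func (illustrative only)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "error"  # overwritten only if the call returns normally
        try:
            result = func(*args, **kwargs)
            status = "success"
            return result
        finally:
            records.append({
                "function": func.__name__,
                "status": status,
                "latency_ms": (time.perf_counter() - start) * 1000,
            })
    return wrapper


@monitor_call
def get_response(prompt: str) -> str:
    return prompt.upper()  # stand-in for llm.invoke(prompt)


reply = get_response("hello")
```

Because the record is written in a `finally` block, a call that raises is still logged, with its status left as "error".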

Streaming LLM Monitoring

For async streaming LLM calls, use the monitor_llm_astream decorator:

python

import asyncio

from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

# Assumes `arms` was initialized as shown in "Initialize ARMS".
async def main():
    @arms.monitor_llm_astream
    async def run_astream(prompt, llm):
        return llm.astream_events([HumanMessage(content=prompt)])

    llm = ChatOpenAI(model="gpt-4o-mini", streaming=True)

    # Print each chunk of the response as it arrives.
    async for event in run_astream("Explain quantum computing in simple terms.", llm=llm):
        if event["event"] == "on_chat_model_stream":
            print(event["data"]["chunk"].content, end="", flush=True)

asyncio.run(main())
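
In the example above, the decorated function is consumed directly with `async for`, which suggests the decorator turns it into an async generator. The sketch below shows one way such a wrapper could work, under that assumption. `monitor_astream`, `stream_records`, and `fake_stream` are hypothetical names, and nothing here reflects ARMS internals.

```python
import asyncio
import functools
import time

stream_records = []  # hypothetical in-memory sink, used here only for illustration


def monitor_astream(func):
    """Wrap an async function that returns an async iterator; count chunks and time the stream."""
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        start = time.perf_counter()
        chunks = 0
        stream = await func(*args, **kwargs)
        try:
            async for item in stream:
                chunks += 1
                yield item  # pass each chunk through to the caller unchanged
        finally:
            stream_records.append({
                "function": func.__name__,
                "chunks": chunks,
                "duration_ms": (time.perf_counter() - start) * 1000,
            })
    return wrapper


async def fake_stream(prompt: str):
    # Stand-in for llm.astream_events(...): yields one word at a time.
    for word in prompt.split():
        await asyncio.sleep(0)
        yield word


@monitor_astream
async def run_astream(prompt: str):
    return fake_stream(prompt)


async def collect(prompt: str) -> list:
    return [chunk async for chunk in run_astream(prompt)]


words = asyncio.run(collect("streamed one word at a time"))
```

The caller still receives every chunk as it is produced. The wrapper writes its record once the stream is exhausted or closed.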

OCR Monitoring

OCR Operation Monitoring

Track text extraction performance, confidence scores, and processing efficiency across multiple OCR models.

Operation Metrics

Model Name
Text Length
Confidence Score

Performance Metrics

Cost
Timestamp
Duration

Supported OCR Models

EasyOCR: Open Source
AWS Textract: Cloud Service
Azure Document Intelligence: Cloud Service
Azure Computer Vision: Cloud Service
Google Vision AI: Cloud Service
PaddleOCR: Open Source
Tesseract: Open Source
Vision LLM: AI-Powered

Implementation Example

python

@arms.monitor_ocr_call("OCR_name")
def extract_text(image_path: str):
    return ocr_model.extract(image_path)

RAG Monitoring

RAG Operation Monitoring

Track document retrieval, query processing, and relevance scoring for your RAG systems.

Query Metrics

Function Name
Query
Query Length

Retrieval Metrics

Documents Count
Result Count
Relevance Score

Performance Metrics

Timestamp
Status
Latency
Cost

Error Handling

Error Details
Operation Type

Implementation Example

python

@arms.monitor_rag_call
def retrieve_documents(query: str):
    return rag_system.search(query)

Embedding Monitoring

Embedding Operation Monitoring

Monitor vector generation, processing performance, and cost efficiency.

Input Metrics

Function Name
Input Length
Dimensions

Performance Metrics

Timestamp
Latency
Cost

Error Handling

Error Details
Operation Type

Implementation Example

python

@arms.monitor_embedding_call
def get_embedding(text: str):
    return embedding_model.encode(text)

Agent Monitoring

elsai ARMS SaaS supports LangChain and LangGraph agent tracing through arms.langchain_callback. See the Agent Monitoring guide for a full walkthrough.

LangChain Agent Monitoring

Monitor LangChain agents and graphs by adding arms.langchain_callback to track agent execution, tool usage, and overall performance.

Agent Information

Agent Name
Tool Calls
Execution Steps

Performance Metrics

Timestamp
Status
Latency
Total Tokens

LLM Interactions

LLM Calls
Token Usage
Cost

python

from langchain_core.messages import HumanMessage
from langgraph.graph import StateGraph

# Build the graph, then compile it; only the compiled graph exposes invoke().
builder = StateGraph(...)
graph = builder.compile()

messages = [HumanMessage(content="Your query here")]
result = graph.invoke(
    {"messages": messages},
    config={"callbacks": [arms.langchain_callback]}
)

Additional Features

Model Resolution

elsai ARMS resolves the model name in two stages. If no explicit model name is provided, it falls back to auto-extraction so that telemetry and cost tracking stay accurate.

Step 1: Explicit Override

Manually specify the model name to ensure consistent naming. This takes the highest priority and bypasses automatic extraction.

For Direct LLM Calls

python

@arms.monitor_llm_call(model="gpt-4o")
def get_response(prompt: str):
    return llm.invoke(prompt)

For Agents and LangChain

python

config = {
    "callbacks": [arms.langchain_callback],
    "metadata": {"arms_override_model": "gpt-4o"}
}
result = agent.invoke({"input": query}, config=config)

Step 2: Auto-Extraction

If no override is provided in Step 1, ARMS automatically switches to extracting model details from raw API responses or execution contexts. This ensures monitoring remains active even without manual configuration.
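
The two-stage resolution order can be sketched as a small fallback function. This is an illustration of the priority logic only, not the ARMS implementation. The function name `resolve_model`, the response fields it inspects, and the "unknown" default are all assumptions.

```python
from typing import Any, Mapping, Optional


def resolve_model(override: Optional[str], response: Mapping[str, Any]) -> str:
    """Return the explicit override if given, else try to extract the model from the response."""
    # Step 1: an explicit override takes the highest priority.
    if override:
        return override
    # Step 2: fall back to fields that may carry the model name (assumed locations).
    for candidate in (
        response.get("model"),
        response.get("response_metadata", {}).get("model_name"),
    ):
        if candidate:
            return candidate
    return "unknown"  # assumed default when nothing can be extracted


explicit = resolve_model("gpt-4o", {"model": "gpt-4o-mini"})
extracted = resolve_model(None, {"model": "gpt-4o-mini"})
nested = resolve_model(None, {"response_metadata": {"model_name": "gpt-4o-mini"}})
missing = resolve_model(None, {})
```

The first call shows the override winning even when the response names a different model, which is why an explicit override gives consistent naming.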

Custom Metrics

Track custom business metrics and internal KPIs as key-value pairs.

python

arms.log_custom_metric('Metric Name', metric_value)

Built-in Logging

elsai ARMS provides built-in logging for tracking important events and errors during project execution.

Info Logs

python

arms.info('Log Operation')

Warning Logs

python

arms.warning('Log Warning')

Error Logs

python

arms.error('Log Error')

Data Export

Export comprehensive project data for analysis, reporting, or integration with external systems.

python

arms.export()

Session Management

Call end_run() to finalize the monitoring session once a project run has completed successfully.

python

arms.end_run()

View completed runs in the cloud dashboard.
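
The logging, custom metric, and session calls described in this guide can be combined into one run lifecycle. The sketch below uses a stand-in class, `StubARMS`, so that it runs without the SDK. The stub only mirrors the method names used in this guide and records calls in a list; the real ElsaiARMS behavior and signatures may differ. `run_pipeline` is a hypothetical helper.

```python
class StubARMS:
    """Stand-in exposing method names used in this guide; it only records calls."""

    def __init__(self, project_name: str):
        self.project_name = project_name
        self.events = []

    def info(self, message: str) -> None:
        self.events.append(("info", message))

    def error(self, message: str) -> None:
        self.events.append(("error", message))

    def log_custom_metric(self, name: str, value) -> None:
        self.events.append(("metric", name, value))

    def end_run(self) -> None:
        self.events.append(("end_run",))


def run_pipeline(arms, work):
    """Run work(), log the outcome, and end the run only on success."""
    arms.info("Run started")
    try:
        result = work()
    except Exception as exc:
        arms.error(f"Run failed: {exc}")
        raise
    arms.log_custom_metric("items_processed", len(result))
    arms.end_run()
    return result


arms = StubARMS("project_name")
output = run_pipeline(arms, lambda: ["a", "b", "c"])
```

In this sketch end_run() is called only after the work succeeds. A failed run logs an error and re-raises instead.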
