Elsai Model v1.0.0#
The Elsai Model package provides a unified interface for connecting to multiple LLM providers, with the following enhanced capabilities in v1.0.0:
OpenAI Connector (Synchronous & Asynchronous)
Azure OpenAI Connector (Synchronous & Asynchronous)
AWS Bedrock Connector (Native boto3 implementation)
Anthropic Bedrock Connector (Specialized for Claude models)
Gemini Connector (New)
LITELLM Integration (New)
Note
Streaming Functionality: In version 1.0.1, streaming capabilities have been added to both OpenAI and Azure OpenAI connectors (synchronous and asynchronous). This allows for real-time streaming of responses as they are generated.
Note
Implementation Selection: In version 1.3.0, both OpenAI and Azure OpenAI connectors support an implementation parameter that allows you to choose between “native” (default) and “langchain” implementations. The native implementation returns OpenAI response objects, while the langchain implementation returns LangChain-compatible responses.
Prerequisites#
Python >= 3.9
.env file with appropriate API keys and configuration variables
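For reference, an illustrative .env for the OpenAI connector might look like the following; the variable names come from the per-connector sections below, the values are placeholders, and you should include only the variables for the providers you use:
OPENAI_API_KEY=your_openai_api_key
OPENAI_MODEL_NAME=gpt-4o-mini
OPENAI_TEMPERATURE=0.1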
Installation#
To install the elsai-model package:
pip install --extra-index-url https://elsai-core-package.optisolbusiness.com/root/elsai-model/ elsai-model==1.3.0
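As a quick sanity check after installation, the imports used in the examples below should resolve without errors:
# Sanity check: these imports are taken from the component examples in this document
from elsai_model.openai import OpenAIConnector
from elsai_model.azure_openai import AzureOpenAIConnector
from elsai_model.bedrock import BedrockConnector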
Components#
1. OpenAIConnector#
OpenAIConnector is a class designed to establish secure connections with the OpenAI API. It retrieves API credentials from environment variables and provides a method to initialize the connection with a specified language model.
Note
Implementation Selection: In version 1.3.0, you can choose between “native” (default) and “langchain” implementations using the implementation parameter. Native implementation returns OpenAI response objects, while langchain implementation returns LangChain-compatible responses.
from elsai_model.openai import OpenAIConnector
# Native implementation (default) - returns OpenAI response objects
llm_native = OpenAIConnector(
    openai_api_key="your_openai_api_key",  # or set OPENAI_API_KEY env var
    model_name="gpt-4o-mini",  # or set OPENAI_MODEL_NAME env var
    temperature=0.1,
    implementation="native"  # Available in v1.3.0+, or omit for default
)

messages = [
    {"role": "user", "content": "Hello! Can you tell me a short joke?"}
]

# Native implementation returns an OpenAI response object
response = llm_native.invoke(messages=messages)
print(f"Response content: {response.choices[0].message.content}")

# LangChain implementation - returns a LangChain-compatible response
llm_langchain = OpenAIConnector(
    openai_api_key="your_openai_api_key",
    model_name="gpt-4o-mini",
    temperature=0.1,
    implementation="langchain"  # Available in v1.3.0+
)

response_langchain = llm_langchain.invoke(messages=messages)
# The LangChain implementation may return an object with a content attribute or a plain string
if hasattr(response_langchain, 'content'):
    print(f"Response content: {response_langchain.content}")
elif isinstance(response_langchain, str):
    print(f"Response content: {response_langchain}")

# Streaming functionality (available in v1.0.1)
for chunk in llm_native.stream(messages=messages):
    print(chunk, end='', flush=True)
Required Environment Variables:
OPENAI_API_KEY – Your OpenAI API key for authentication.
OPENAI_MODEL_NAME – The name of the model to use (e.g., "gpt-4o-mini").
OPENAI_TEMPERATURE – Temperature value to control the randomness of model outputs.
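If these variables are already set (for example via the .env file mentioned in the prerequisites), the corresponding constructor arguments can be left out. The following is a minimal sketch, assuming the connector falls back to the environment variables when the parameters are omitted:
from elsai_model.openai import OpenAIConnector

# Assumes OPENAI_API_KEY, OPENAI_MODEL_NAME and OPENAI_TEMPERATURE are set in the environment
llm = OpenAIConnector()
response = llm.invoke(messages=[{"role": "user", "content": "Hello!"}])
print(response.choices[0].message.content)  # native (default) implementation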
2. AzureOpenAIConnector#
AzureOpenAIConnector is a class that facilitates connecting to the Azure-hosted OpenAI service. It allows configuration through direct parameters or environment variables, and supports deployment-specific model initialization.
Note
Implementation Selection: In version 1.3.0, you can choose between “native” (default) and “langchain” implementations using the implementation parameter. Native implementation returns OpenAI response objects, while langchain implementation returns LangChain-compatible responses.
from elsai_model.azure_openai import AzureOpenAIConnector
# Native implementation (default) - returns OpenAI response objects
llm_native = AzureOpenAIConnector(
    azure_endpoint="https://your-azure-openai-endpoint.openai.azure.com/",
    openai_api_key="your-azure-openai-api-key",
    openai_api_version="2023-05-15",
    deployment_name="gpt-4o-mini",
    temperature=0.1,
    implementation="native"  # Available in v1.3.0+, or omit for default
)

messages = [
    {"role": "user", "content": "Hello! Can you tell me a short joke?"}
]

# Native implementation returns an OpenAI response object
response = llm_native.invoke(messages=messages)
print(f"Response content: {response.choices[0].message.content}")

# LangChain implementation - returns a LangChain-compatible response
llm_langchain = AzureOpenAIConnector(
    azure_endpoint="https://your-azure-openai-endpoint.openai.azure.com/",
    openai_api_key="your-azure-openai-api-key",
    openai_api_version="2023-05-15",
    deployment_name="gpt-4o-mini",
    temperature=0.1,
    implementation="langchain"  # Available in v1.3.0+
)

response_langchain = llm_langchain.invoke(messages=messages)
# The LangChain implementation may return a dict with 'content' or an object with a content attribute
if isinstance(response_langchain, dict) and 'content' in response_langchain:
    print(f"Response content: {response_langchain['content']}")
elif hasattr(response_langchain, 'content'):
    print(f"Response content: {response_langchain.content}")

# Streaming functionality (available in v1.0.1)
for msg in llm_native.stream(messages=messages):
    print(msg, end='', flush=True)
Required Environment Variables:
AZURE_OPENAI_API_KEY – API key used to authenticate with the Azure-hosted OpenAI service.
AZURE_OPENAI_ENDPOINT – Endpoint URL of your Azure OpenAI resource.
OPENAI_API_VERSION – API version to use when connecting to Azure OpenAI (e.g., 2023-05-15).
AZURE_OPENAI_TEMPERATURE – Temperature value to control the randomness of model outputs (e.g., 0.0 for deterministic results).
AZURE_OPENAI_DEPLOYMENT_NAME – Name of the specific deployment of the OpenAI model you want to use.
3. BedrockConnector#
BedrockConnector has been migrated from LangChain to a native boto3-based implementation, providing lower latency, fewer dependencies, and more fine-grained control over request/response handling.
Note
Streaming Functionality: In version 1.1.0, streaming capabilities have been added to the BedrockConnector, allowing for real-time streaming of responses as they are generated.
Note
Claude Sonnet 4 Support: In version 1.1.1, the BedrockConnector has been updated to use the new Messages API instead of the deprecated InvokeModel API. This fixes the ValidationException error that occurred when using Claude Sonnet 4 models (e.g., “claude-sonnet-4-20250514”), which no longer support the old InvokeModel operation.
Note
Image Processing with OCR: In version 1.2.0, the BedrockConnector has been enhanced with the invoke_with_image method, allowing for image-based OCR (Optical Character Recognition) processing with Claude models. This enables text extraction from images, handwriting recognition, and structured document transcription.
Note
Response Format Change: In version 1.2.2, the BedrockConnector’s invoke method now returns the entire response object instead of plain text.
from elsai_model.bedrock import BedrockConnector
messages = [{
    "role": "user",
    "content": "What is the capital of France?"
}]

llm = BedrockConnector(
    aws_access_key="your_aws_access_key",  # or set AWS_ACCESS_KEY_ID env var
    aws_secret_key="your_aws_secret_key",  # or set AWS_SECRET_ACCESS_KEY env var
    aws_session_token="your_session_token",  # or set AWS_SESSION_TOKEN env var
    aws_region="us-east-1",  # or set AWS_REGION env var
    max_tokens=500,  # Default is 500
    temperature=0.1,
    model_id="us.anthropic.claude-3-7-sonnet-20250219-v1:0",  # or set BEDROCK_MODEL_ID env var
    config=None
)

result = llm.invoke(messages)
# In v1.2.2+, result is the full response object
print("Full Response:", result)

# Streaming functionality (available in v1.1.0)
print("\n=== Streaming Response ===")
for chunk in llm.invoke_model_with_response_stream(messages):
    if chunk:  # chunk is already a string, not a response object
        print(chunk, end='', flush=True)

# Image OCR functionality (available in v1.2.0)
print("\n=== Image OCR ===")
ocr_output = llm.invoke_with_image(
    image_path="path/to/your/image.png",
    prompt="You are an OCR system. Transcribe all visible text from the image exactly as it appears, including handwriting, crossed-out text, and notes; use `[...]` for illegible parts. Preserve document structure: logical reading order, line breaks, tables, and link labels to values. For checkboxes use [x], [ ], [?]; for radio buttons use (•) and (○); keep all dates, numbers, and formats unchanged. Output only the transcription (no commentary, confidence scores, or extra formatting)."
)
print(ocr_output)
Required Environment Variables:
AWS_ACCESS_KEY_ID – Your AWS access key ID for authenticating with AWS services.
AWS_SECRET_ACCESS_KEY – Your AWS secret access key for secure authentication.
AWS_SESSION_TOKEN – Temporary session token for secure AWS authentication (used with IAM roles or temporary credentials).
AWS_REGION – AWS region (e.g., us-east-1) where the Bedrock service is hosted.
BEDROCK_MODEL_ID – The model ID to use (e.g., us.anthropic.claude-3-7-sonnet-20250219-v1:0).
BEDROCK_TEMPERATURE – Controls the randomness of the output from the model (optional; default can be set in code).
4. AnthropicBedrockConnector#
AnthropicBedrockConnector is a specialized connector for Anthropic’s Claude models through AWS Bedrock. It provides optimized functionality for Claude models with both regular and streaming capabilities.
Note
Version Requirement: AnthropicBedrockConnector is only available in elsai-model version 1.1.0 and later.
Note
Response Format Change: In version 1.2.2, the AnthropicBedrockConnector’s invoke method now returns the entire response object instead of plain text.
from elsai_model.anthropic_bedrock import AnthropicBedrockConnector
connector = AnthropicBedrockConnector(
    model_id="your_anthropic_bedrock_model_id",  # or set ANTHROPIC_BEDROCK_MODEL_ID env var
    max_tokens=500,
    temperature=0.7,
    aws_access_key="your_aws_access_key",  # or set AWS_ACCESS_KEY_ID env var
    aws_secret_key="your_aws_secret_key",  # or set AWS_SECRET_ACCESS_KEY env var
    aws_region="us-east-1",  # or set AWS_REGION env var
    aws_session_token="your_session_token"  # or set AWS_SESSION_TOKEN env var
)

messages = [
    {"role": "user", "content": "Tell me a story about AI"}
]

# Test regular invoke first
print("=== Regular Invoke ===")
response = connector.invoke(messages)
# In v1.2.2+, response is the full response object
print("Full Response:", response)

print("\n=== Streaming Response ===")
# Test streaming - the method returns string chunks directly, not response objects
for chunk in connector.invoke_with_stream(messages):
    if chunk:  # chunk is already a string, not a response object
        print(chunk, end='', flush=True)
Required Environment Variables:
AWS_ACCESS_KEY_ID – Your AWS access key ID for authenticating with AWS services.
AWS_SECRET_ACCESS_KEY – Your AWS secret access key for secure authentication.
AWS_SESSION_TOKEN – Temporary session token for secure AWS authentication (used with IAM roles or temporary credentials).
AWS_REGION – AWS region (e.g., us-east-1) where the Bedrock service is hosted.
ANTHROPIC_BEDROCK_MODEL_ID – The Anthropic model ID to use.
Key Features:
Optimized for Claude Models: Specifically designed for Anthropic’s Claude models through AWS Bedrock
Streaming Support: Built-in streaming capabilities for real-time response generation
Flexible Configuration: Supports both environment variables and direct parameter passing
Native AWS Integration: Direct integration with AWS Bedrock service
5. AsyncOpenAIConnector#
AsyncOpenAIConnector is an asynchronous version of the OpenAI connector that provides non-blocking API calls. It’s designed for high-performance applications that need to handle multiple concurrent requests efficiently.
Note
Implementation Selection: In version 1.3.0, you can choose between “native” (default) and “langchain” implementations using the implementation parameter.
from elsai_model.openai import AsyncOpenAIConnector
async def example_openai_usage():
    """Example usage of AsyncOpenAI connector."""
    print("=== AsyncOpenAI Example ===")

    # Native implementation (default) - returns OpenAI response objects
    connector_native = AsyncOpenAIConnector(
        openai_api_key="your_openai_api_key",  # or set OPENAI_API_KEY env var
        model_name="gpt-4o-mini",  # or set OPENAI_MODEL_NAME env var
        temperature=0.1,
        implementation="native"  # Available in v1.3.0+, or omit for default
    )

    messages = [
        {"role": "user", "content": "Hello! Can you tell me a short joke?"}
    ]

    try:
        # Make the async call - native returns an OpenAI response object
        response = await connector_native.invoke(messages)
        print(f"Response: {response.choices[0].message.content}")

        # LangChain implementation
        connector_langchain = AsyncOpenAIConnector(
            openai_api_key="your_openai_api_key",
            model_name="gpt-4o-mini",
            temperature=0.1,
            implementation="langchain"  # Available in v1.3.0+
        )

        response_langchain = await connector_langchain.invoke(messages)
        if hasattr(response_langchain, 'content'):
            print(f"Response: {response_langchain.content}")
        elif isinstance(response_langchain, str):
            print(f"Response: {response_langchain}")

        # Streaming functionality (available in v1.0.1)
        async for msg in connector_native.stream(messages=messages):
            print(msg, end='', flush=True)
    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    import asyncio

    # Run the async example
    asyncio.run(example_openai_usage())
Required Environment Variables:
OPENAI_API_KEY – Your OpenAI API key for authentication.
OPENAI_MODEL_NAME – The name of the model to use (e.g., "gpt-4o-mini").
OPENAI_TEMPERATURE – Temperature value to control the randomness of model outputs.
Key Features:
Asynchronous Operations: All API calls are non-blocking and use Python’s async/await syntax
High Performance: Designed for handling multiple concurrent requests efficiently
Error Handling: Comprehensive exception handling for API errors
Flexible Configuration: Supports both environment variables and direct parameter passing
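Since invoke is awaitable, several prompts can be dispatched concurrently with standard asyncio tooling. The following is a minimal sketch (not part of the package) that fans out three prompts with asyncio.gather using the native implementation:
import asyncio

from elsai_model.openai import AsyncOpenAIConnector


async def ask_concurrently(prompts):
    connector = AsyncOpenAIConnector(
        openai_api_key="your_openai_api_key",
        model_name="gpt-4o-mini",
        temperature=0.1
    )
    # Build one invoke coroutine per prompt and run them all concurrently
    tasks = [connector.invoke([{"role": "user", "content": p}]) for p in prompts]
    responses = await asyncio.gather(*tasks)
    for prompt, response in zip(prompts, responses):
        print(f"{prompt} -> {response.choices[0].message.content}")


if __name__ == "__main__":
    asyncio.run(ask_concurrently(["Tell me a joke.", "Name a prime number.", "Define asyncio."]))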
6. AsyncAzureOpenAIConnector#
AsyncAzureOpenAIConnector provides asynchronous capabilities for Azure OpenAI, enabling non-blocking operations and better scalability for high-throughput workloads.
Note
Implementation Selection: In version 1.3.0, you can choose between “native” (default) and “langchain” implementations using the implementation parameter.
from elsai_model.azure_openai import AsyncAzureOpenAIConnector
async def example_azure_openai_usage():
    """Example usage of AsyncAzureOpenAI connector."""
    print("\n=== AsyncAzureOpenAI Example ===")

    # Native implementation (default) - returns OpenAI response objects
    llm_native = AsyncAzureOpenAIConnector(
        azure_endpoint="https://your-azure-openai-endpoint.openai.azure.com/",  # or set AZURE_OPENAI_ENDPOINT env var
        openai_api_key="your-azure-openai-api-key",  # or set AZURE_OPENAI_API_KEY env var
        openai_api_version="2023-05-15",  # or set AZURE_OPENAI_API_VERSION env var
        deployment_name="gpt-4o-mini",  # or set AZURE_OPENAI_DEPLOYMENT_NAME env var
        temperature=0.1,
        implementation="native"  # Available in v1.3.0+, or omit for default
    )

    messages = [
        {"role": "user", "content": "Hello! Can you explain what async programming is?"}
    ]

    try:
        # Make the async call - native returns an OpenAI response object
        response = await llm_native.invoke(messages)
        print(f"Response: {response.choices[0].message.content}")

        # LangChain implementation
        llm_langchain = AsyncAzureOpenAIConnector(
            azure_endpoint="https://your-azure-openai-endpoint.openai.azure.com/",
            openai_api_key="your-azure-openai-api-key",
            openai_api_version="2023-05-15",
            deployment_name="gpt-4o-mini",
            temperature=0.1,
            implementation="langchain"  # Available in v1.3.0+
        )

        response_langchain = await llm_langchain.invoke(messages)
        if isinstance(response_langchain, dict) and 'content' in response_langchain:
            print(f"Response: {response_langchain['content']}")
        elif hasattr(response_langchain, 'content'):
            print(f"Response: {response_langchain.content}")

        # Streaming functionality (available in v1.0.1)
        async for msg in llm_native.stream(messages=messages):
            print(msg, end='', flush=True)
    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    import asyncio

    # Run the async example
    asyncio.run(example_azure_openai_usage())
Required Environment Variables:
AZURE_OPENAI_API_KEY – API key used to authenticate with the Azure-hosted OpenAI service.
AZURE_OPENAI_ENDPOINT – Endpoint URL of your Azure OpenAI resource.
OPENAI_API_VERSION – API version to use when connecting to Azure OpenAI (e.g., 2023-05-15).
AZURE_OPENAI_TEMPERATURE – Temperature value to control the randomness of model outputs (e.g., 0.0 for deterministic results).
AZURE_OPENAI_DEPLOYMENT_NAME – Name of the specific deployment of the OpenAI model you want to use.
7. GeminiConnector (New Feature)#
A new Gemini model connector has been added to the package with comprehensive capabilities. The GeminiService class provides multiple ways to interact with Google’s Gemini models.
Note
Response Format Change: In version 1.1.2, the Gemini connector's generate_text and generate_with_image methods now return the entire response object instead of just the text. This provides access to additional metadata and response details.
Note
Logprobs Support: In version 1.2.1, the Gemini connector has been enhanced with logprobs functionality. The response_logprobs and logprobs parameters have been added to generate_text, stream_text, and generate_with_image methods. This allows you to retrieve token-level log probabilities and top alternative tokens for analysis and debugging.
import os
from pathlib import Path

from elsai_model.gemini import GeminiService


def test_gemini_service():
    # Initialize the service with your API key and model
    service = GeminiService(
        api_key="your_gemini_api_key",  # or set GEMINI_API_KEY env var
        model="gemini-2.5-flash"
    )

    # Single-shot text generation
    print("=== Test: generate_text ===")
    result = service.generate_text("What is Artificial Intelligence?")
    print("Response:", result)

    # Generate text with logprobs (available in v1.2.1)
    print("=== Test: generate_text with logprobs ===")
    result_with_logprobs = service.generate_text(
        "What is Artificial Intelligence?",
        response_logprobs=True,
        logprobs=5  # Get top 5 alternative tokens
    )
    print("Response:", result_with_logprobs)

    # Streaming text generation
    print("=== Test: stream_text ===")
    for chunk in service.stream_text("Tell me a short story about a cat"):
        print(chunk, end="", flush=True)

    # Streaming text generation with logprobs (available in v1.2.1)
    print("\n=== Test: stream_text with logprobs ===")
    for chunk in service.stream_text(
        "Tell me a short story about a cat",
        response_logprobs=True,
        logprobs=5  # Get top 5 alternative tokens
    ):
        print(chunk, end="", flush=True)

    # Multi-turn chat
    print("=== Test: create_chat ===")
    chat = service.create_chat()
    first_response = chat.send_message("I have 2 dogs in my house.")
    print("Chat response 1:", first_response)
    second_response = chat.send_message("How many paws are in my house?")
    print("Chat response 2:", second_response)

    # View chat history
    print("=== Chat History ===")
    for msg in chat.get_history():
        print(f"{msg['role']}: {msg['text']}")

    # Multimodal generation (image + text)
    print("=== Test: generate_with_image ===")
    try:
        image_path = "image.png"
        if Path(image_path).exists():
            img_response = service.generate_with_image(
                image_path=image_path,
                prompt="Describe this image in detail."
            )
            print("Image Response:", img_response)

            # Multimodal generation with logprobs (available in v1.2.1)
            print("=== Test: generate_with_image with logprobs ===")
            img_response_with_logprobs = service.generate_with_image(
                image_path=image_path,
                prompt="Describe this image in detail.",
                response_logprobs=True,
                logprobs=5  # Get top 5 alternative tokens
            )
            print("Image Response with Logprobs:", img_response_with_logprobs)
        else:
            print(f"[SKIPPED] Image not found at path: {image_path}")
    except Exception as e:
        print(f"Error during generate_with_image test: {e}")

    # Streaming chat messages
    print("=== Test: chat.send_message_stream ===")
    try:
        chat_stream = service.create_chat()

        print("Streaming response 1:")
        for chunk in chat_stream.send_message_stream("Tell me a fun fact about space."):
            print(chunk, end="", flush=True)
        print("\n")

        print("Streaming response 2:")
        for chunk in chat_stream.send_message_stream("And another fun fact please."):
            print(chunk, end="", flush=True)
        print("\n")

        print("=== Chat History (streaming chat) ===")
        for msg in chat_stream.get_history():
            print(f"{msg['role']}: {msg['text']}")
    except Exception as e:
        print(f"Error during chat.send_message_stream test: {e}")


if __name__ == "__main__":
    test_gemini_service()
Required Environment Variables:
GEMINI_API_KEY – Your Gemini API key for authentication.
Key Features:
Single-shot Generation: Send a single prompt and receive a complete response
Streaming Generation: Receive generated text in real-time streaming manner
Multi-turn Chat: Create chat sessions with context retention
Multimodal Support: Process images along with text prompts
Streaming Chat: Stream responses in chat sessions for real-time interaction
8. LITELLM Integration#
LITELLM integration provides a unified interface for multiple LLM providers with consistent API patterns. The LiteLLMConnector class allows you to use different LLM providers with a single, consistent interface.
from elsai_model.litellm import LiteLLMConnector
import os
# Set your API key (you can also use environment variables)
os.environ["OPENAI_API_KEY"] = "your_openai_api_key"
# Initialize the connector with model name and parameters
connector = LiteLLMConnector(
    model_name="openai/gpt-4o-mini",  # Format: provider/model-name
    temperature=0.1
)

# Use the consistent API across different providers
response = connector.invoke(messages=[{
    "content": "Hello, how are you?",
    "role": "user"
}])
print(response)
Model Name Formats:
OpenAI: openai/gpt-4o-mini, openai/gpt-4, openai/gpt-3.5-turbo
Azure: azure/your-deployment-name
Bedrock: bedrock/your-deployment-name
Other Providers: Refer to the LiteLLM documentation for supported providers
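To illustrate the unified interface, the sketch below sends the same request through two differently addressed models; the model identifiers are placeholders, and each provider's credentials are assumed to be configured (for example via the environment variables listed in the sections above):
from elsai_model.litellm import LiteLLMConnector

messages = [{"role": "user", "content": "Summarise what LiteLLM does in one sentence."}]

# Only model_name changes between providers; the calling code stays the same
for model_name in ["openai/gpt-4o-mini", "azure/your-deployment-name"]:
    connector = LiteLLMConnector(model_name=model_name, temperature=0.1)
    print(model_name, "->", connector.invoke(messages=messages))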