Elsai Model Hub#
The Elsai Model Hub exposes hosted models through an OpenAI-compatible HTTP API. You can use the official OpenAI Python client by pointing base_url at the hub and passing the model id (for example gemma-4, phi-4, or lightonocr).
Note
API keys are not published in this documentation. To obtain credentials for the Models Hub API, contact the DevOps team. Gemma, Phi, and LightOnOCR endpoints may use different keys; request the appropriate secret for the model you use.
Base URL (OpenAI-compatible): https://models-hub-api.elsaifoundry.ai/v1
The Python snippets expect these names in os.environ. Set them in your shell or secret store before you run the examples—for example:
export MODELS_HUB_GEMMA_API_KEY="gemma key"
export MODELS_HUB_PHI_API_KEY="phi mini key"
export MODELS_HUB_LIGHTONOCR_API_KEY="lightonocr key"
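As a quick sanity check before running the examples, a small helper (a sketch using only the standard library; the variable names match the exports above) can fail fast with a clear message when a key is missing:

```python
import os


def require_key(name, env=None):
    """Return the value of an environment variable, or raise a clear error."""
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it before running the examples."
        )
    return value


# Example: api_key = require_key("MODELS_HUB_GEMMA_API_KEY")
```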
Gemma-4 E4B#
Gemma-4 E4B is available on the hub for general chat and multimodal prompts. All examples use the OpenAI Python client with base_url set to the Models Hub and model="gemma-4".
Use model id gemma-4. Set MODELS_HUB_GEMMA_API_KEY to your Gemma key.
Normal chat completion#
Send a single-turn or multi-turn message list and read the assistant text from response.choices[0].message.content.
import os
from openai import OpenAI
client = OpenAI(
base_url="https://models-hub-api.elsaifoundry.ai/v1",
api_key=os.environ["MODELS_HUB_GEMMA_API_KEY"],
)
response = client.chat.completions.create(
model="gemma-4",
messages=[
{
"role": "user",
"content": "What is the capital of France?",
}
],
)
print(response.choices[0].message.content)
Multimodal#
Encode a local image as base64 and send it in the user message with a text prompt, using image_url with a data:image/...;base64,... URL so the model can see the image.
import base64
import os
from openai import OpenAI
def encode_image(image_path):
with open(image_path, "rb") as f:
return base64.b64encode(f.read()).decode("utf-8")
image_base64 = encode_image("test.png")
client = OpenAI(
base_url="https://models-hub-api.elsaifoundry.ai/v1",
api_key=os.environ["MODELS_HUB_GEMMA_API_KEY"],
)
response = client.chat.completions.create(
model="gemma-4",
messages=[
{
"role": "user",
"content": [
{"type": "text", "text": "What is in this image? Explain clearly."},
{
"type": "image_url",
"image_url": {
"url": f"data:image/jpeg;base64,{image_base64}"
},
},
],
}
],
)
print(response.choices[0].message.content)
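The MIME type in the data URL should match the actual image format. One way to avoid hard-coding it (a sketch using only the standard library) is to derive it from the file extension:

```python
import base64
import mimetypes


def image_to_data_url(image_path):
    """Build a data: URL whose MIME type matches the file extension."""
    mime, _ = mimetypes.guess_type(image_path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"Cannot determine image MIME type for {image_path}")
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime};base64,{b64}"
```

The returned string can be used directly as the `url` value in the `image_url` content part above.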
Tool calling#
Define tools with JSON schemas and pass them to chat.completions.create via tools=. If the model returns tool_calls, append the assistant message and a tool result message to the conversation, then call the API again to get the natural-language answer.
import json
import os
from openai import OpenAI
client = OpenAI(
base_url="https://models-hub-api.elsaifoundry.ai/v1",
api_key=os.environ["MODELS_HUB_GEMMA_API_KEY"],
)
tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City name, e.g. 'San Francisco, CA'",
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "Temperature unit",
},
},
"required": ["location"],
},
},
}
]
response = client.chat.completions.create(
model="gemma-4",
messages=[
{"role": "user", "content": "What is the weather in Tokyo today?"}
],
tools=tools,
max_tokens=1024,
)
message = response.choices[0].message
if message.tool_calls:
tool_call = message.tool_calls[0]
print(f"Tool: {tool_call.function.name}")
print(f"Args: {tool_call.function.arguments}")
response = client.chat.completions.create(
model="gemma-4",
messages=[
{"role": "user", "content": "What is the weather in Tokyo today?"},
message,
{
"role": "tool",
"tool_call_id": tool_call.id,
"content": json.dumps(
{
"temperature": 22,
"condition": "Partly cloudy",
"unit": "celsius",
}
),
},
],
tools=tools,
max_tokens=1024,
)
print(f"\nFinal answer: {response.choices[0].message.content}")
Phi-4-mini#
Phi-4-mini is exposed as phi-4 on the hub: a smaller, fast model for everyday chat and tool workflows. Use the same OpenAI-compatible patterns as above with MODELS_HUB_PHI_API_KEY.
Use model id phi-4. Set MODELS_HUB_PHI_API_KEY to your Phi mini key.
Normal chat completion#
Same chat completion flow as Gemma: build messages, call create, and print the first choice’s message.content.
import os
from openai import OpenAI
client = OpenAI(
base_url="https://models-hub-api.elsaifoundry.ai/v1",
api_key=os.environ["MODELS_HUB_PHI_API_KEY"],
)
response = client.chat.completions.create(
model="phi-4",
messages=[
{
"role": "user",
"content": "What is the capital of France?",
}
],
)
print(response.choices[0].message.content)
Tool calling#
Here tool_choice="auto" lets the model decide whether to call a tool; follow the same two-step pattern (tool result, then final completion) as in the Gemma example.
import json
import os
from openai import OpenAI
client = OpenAI(
base_url="https://models-hub-api.elsaifoundry.ai/v1",
api_key=os.environ["MODELS_HUB_PHI_API_KEY"],
)
tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City name, e.g. 'San Francisco, CA'",
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "Temperature unit",
},
},
"required": ["location"],
},
},
}
]
response = client.chat.completions.create(
model="phi-4",
messages=[
{"role": "user", "content": "What is the weather in Tokyo today?"}
],
tools=tools,
tool_choice="auto",
max_tokens=1024,
)
message = response.choices[0].message
if message.tool_calls:
tool_call = message.tool_calls[0]
print(f"Tool: {tool_call.function.name}")
print(f"Args: {tool_call.function.arguments}")
response = client.chat.completions.create(
model="phi-4",
messages=[
{"role": "user", "content": "What is the weather in Tokyo today?"},
message,
{
"role": "tool",
"tool_call_id": tool_call.id,
"content": json.dumps(
{
"temperature": 22,
"condition": "Partly cloudy",
"unit": "celsius",
}
),
},
],
tools=tools,
max_tokens=1024,
)
print(f"\nFinal answer: {response.choices[0].message.content}")
LightOnOCR#
LightOnOCR is available on the hub for image-to-text extraction. Send a local image as a base64 data:image/... URL together with a text instruction; use model id lightonocr and set MODELS_HUB_LIGHTONOCR_API_KEY to your LightOnOCR key.
Image OCR (chat completion)#
import base64
import os
from openai import OpenAI
client = OpenAI(
api_key=os.environ["MODELS_HUB_LIGHTONOCR_API_KEY"],
base_url="https://models-hub-api.elsaifoundry.ai/v1",
)
with open("image.png", "rb") as f:
b64 = base64.b64encode(f.read()).decode()
response = client.chat.completions.create(
model="lightonocr",
messages=[
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {"url": f"data:image/png;base64,{b64}"},
},
{
"type": "text",
"text": "Extract all text from this image.",
},
],
}
],
)
print(response.choices[0].message.content)