Elsai Model Hub#

The Elsai Model Hub exposes hosted models through an OpenAI-compatible HTTP API. You can use the official OpenAI Python client by pointing base_url at the hub and passing the model id (for example gemma-4, phi-4, or lightonocr).

Note

API keys are not published in this documentation. To obtain credentials for the Models Hub API, contact the DevOps team. Gemma, Phi, and LightOnOCR endpoints may use different keys; request the appropriate secret for the model you use.

Base URL (OpenAI-compatible): https://models-hub-api.elsaifoundry.ai/v1

The Python snippets read the following variable names from os.environ. Set them in your shell or secret store before running the examples:

export MODELS_HUB_GEMMA_API_KEY="gemma key"
export MODELS_HUB_PHI_API_KEY="phi mini key"
export MODELS_HUB_LIGHTONOCR_API_KEY="lightonocr key"
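
Most OpenAI-compatible servers also expose a GET /v1/models endpoint. If the hub implements it (an assumption, not confirmed here), listing models is a quick way to verify your key and connectivity; the Gemma key is used for illustration:

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://models-hub-api.elsaifoundry.ai/v1",
    api_key=os.environ["MODELS_HUB_GEMMA_API_KEY"],
)

# Iterate over the model ids the server advertises (assumes /v1/models is implemented).
for model in client.models.list():
    print(model.id)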

Gemma-4 E4B#

Gemma-4 E4B is available on the hub for general chat and multimodal prompts. All examples use the OpenAI Python client with base_url set to the Models Hub and model="gemma-4".

Use model id gemma-4. Set MODELS_HUB_GEMMA_API_KEY to your Gemma key.

Normal chat completion#

Send a single-turn or multi-turn message list and read the assistant text from response.choices[0].message.content.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://models-hub-api.elsaifoundry.ai/v1",
    api_key=os.environ["MODELS_HUB_GEMMA_API_KEY"],
)

response = client.chat.completions.create(
    model="gemma-4",
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?",
        }
    ],
)

print(response.choices[0].message.content)
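
The same call accepts a multi-turn conversation: replay earlier turns as assistant messages in the list. A minimal sketch reusing the client from above, with the assistant turn hard-coded for illustration:

response = client.chat.completions.create(
    model="gemma-4",
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
        # Earlier assistant turn, replayed as conversation history.
        {"role": "assistant", "content": "The capital of France is Paris."},
        {"role": "user", "content": "What is its population?"},
    ],
)

print(response.choices[0].message.content)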

Multimodal#

Encode a local image as base64 and send it in the user message alongside a text prompt, using an image_url part whose url is a data:image/...;base64,... URL so the model can see the image. Make sure the MIME type in the data URL matches the actual file format.

import base64
import os
from openai import OpenAI

def encode_image(image_path):
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

image_base64 = encode_image("test.png")

client = OpenAI(
    base_url="https://models-hub-api.elsaifoundry.ai/v1",
    api_key=os.environ["MODELS_HUB_GEMMA_API_KEY"],
)

response = client.chat.completions.create(
    model="gemma-4",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image? Explain clearly."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{image_base64}"
                    },
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
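
Hard-coding the MIME type is easy to get wrong (the data URL must match the actual file format). One option is a small helper that infers it from the file extension with the standard-library mimetypes module:

import base64
import mimetypes

def image_data_url(image_path):
    # Guess the MIME type from the file extension; fall back to PNG.
    mime, _ = mimetypes.guess_type(image_path)
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime or 'image/png'};base64,{b64}"

Pass image_data_url("test.png") as the url value in the image_url part.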

Tool calling#

Define tools with JSON schemas and pass them to chat.completions.create via tools=. If the model responds with tool_calls, append the assistant message plus a tool-result message and call the API again to get the natural-language answer.

import json
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://models-hub-api.elsaifoundry.ai/v1",
    api_key=os.environ["MODELS_HUB_GEMMA_API_KEY"],
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name, e.g. 'San Francisco, CA'",
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit",
                    },
                },
                "required": ["location"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gemma-4",
    messages=[
        {"role": "user", "content": "What is the weather in Tokyo today?"}
    ],
    tools=tools,
    max_tokens=1024,
)

message = response.choices[0].message

if message.tool_calls:
    tool_call = message.tool_calls[0]
    print(f"Tool: {tool_call.function.name}")
    print(f"Args: {tool_call.function.arguments}")

    response = client.chat.completions.create(
        model="gemma-4",
        messages=[
            {"role": "user", "content": "What is the weather in Tokyo today?"},
            message,
            {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(
                    {
                        "temperature": 22,
                        "condition": "Partly cloudy",
                        "unit": "celsius",
                    }
                ),
            },
        ],
        tools=tools,
        max_tokens=1024,
    )

    print(f"\nFinal answer: {response.choices[0].message.content}")

Phi-4-mini#

Phi-4-mini is exposed as phi-4 on the hub: a small, fast model for everyday chat and tool-calling workflows. Use the same OpenAI-compatible patterns as above with MODELS_HUB_PHI_API_KEY.

Use model id phi-4. Set MODELS_HUB_PHI_API_KEY to your Phi mini key.

Normal chat completion#

Same chat completion flow as Gemma: build messages, call create, and print the first choice’s message.content.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://models-hub-api.elsaifoundry.ai/v1",
    api_key=os.environ["MODELS_HUB_PHI_API_KEY"],
)

response = client.chat.completions.create(
    model="phi-4",
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?",
        }
    ],
)

print(response.choices[0].message.content)
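
A system message can steer behavior for the whole conversation. This sketch assumes the served chat template accepts a system role, which is common for OpenAI-compatible servers but not confirmed here:

response = client.chat.completions.create(
    model="phi-4",
    messages=[
        # Assumes the served chat template accepts a system role.
        {"role": "system", "content": "You are a concise assistant. Answer in one sentence."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
)

print(response.choices[0].message.content)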

Tool calling#

Here tool_choice="auto" lets the model decide whether to call a tool; follow the same two-step pattern (tool result, then final completion) as in the Gemma example.

import json
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://models-hub-api.elsaifoundry.ai/v1",
    api_key=os.environ["MODELS_HUB_PHI_API_KEY"],
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name, e.g. 'San Francisco, CA'",
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit",
                    },
                },
                "required": ["location"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="phi-4",
    messages=[
        {"role": "user", "content": "What is the weather in Tokyo today?"}
    ],
    tools=tools,
    tool_choice="auto",
    max_tokens=1024,
)

message = response.choices[0].message

if message.tool_calls:
    tool_call = message.tool_calls[0]
    print(f"Tool: {tool_call.function.name}")
    print(f"Args: {tool_call.function.arguments}")

    response = client.chat.completions.create(
        model="phi-4",
        messages=[
            {"role": "user", "content": "What is the weather in Tokyo today?"},
            message,
            {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(
                    {
                        "temperature": 22,
                        "condition": "Partly cloudy",
                        "unit": "celsius",
                    }
                ),
            },
        ],
        tools=tools,
        max_tokens=1024,
    )

    print(f"\nFinal answer: {response.choices[0].message.content}")

LightOnOCR#

LightOnOCR is available on the hub for image-to-text extraction. Send a local image as a base64 data:image/... URL together with a text instruction; use model id lightonocr and set MODELS_HUB_LIGHTONOCR_API_KEY to your LightOnOCR key.

Image OCR (chat completion)#

import base64
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["MODELS_HUB_LIGHTONOCR_API_KEY"],
    base_url="https://models-hub-api.elsaifoundry.ai/v1",
)

with open("image.png", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="lightonocr",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{b64}"},
                },
                {
                    "type": "text",
                    "text": "Extract all text from this image.",
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
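
The same call works in a loop for batch OCR. A minimal sketch over a directory of PNG scans, reusing the client from above (the scans/ folder name is illustrative):

from pathlib import Path

for path in sorted(Path("scans").glob("*.png")):
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    response = client.chat.completions.create(
        model="lightonocr",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
                    {"type": "text", "text": "Extract all text from this image."},
                ],
            }
        ],
    )
    # Label each result with its source file.
    print(f"--- {path.name} ---")
    print(response.choices[0].message.content)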