This page shows how to:
  • Call any LLM with the same API. TensorZero unifies every major LLM API (e.g. OpenAI) and inference server (e.g. Ollama).
  • Get started with a few lines of code. Later, you can optionally add observability, automatic fallbacks, A/B testing, and much more.
  • Use any programming language. You can use TensorZero with its Python SDK, any OpenAI SDK (Python, Node, Go, etc.), or its HTTP API. (A short OpenAI SDK sketch follows the examples list below.)
We provide complete code examples on GitHub:
  • Python
  • Python (OpenAI SDK)
  • Node (OpenAI SDK)
  • HTTP
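For example, if you already use the OpenAI Python SDK, you can point it at a standalone TensorZero Gateway's OpenAI-compatible endpoint instead of calling OpenAI directly. The sketch below is illustrative: it assumes a gateway is already running locally on its default port (3000) and that the tensorzero::model_name:: prefix routes the request; confirm the details for your deployment in the docs.

from openai import OpenAI

# Point the OpenAI client at the gateway's OpenAI-compatible endpoint.
# Assumes a standalone TensorZero Gateway running on localhost:3000.
client = OpenAI(base_url="http://localhost:3000/openai/v1")

response = client.chat.completions.create(
    # The "tensorzero::model_name::" prefix routes the call through TensorZero.
    model="tensorzero::model_name::openai::gpt-5-mini",
    messages=[{"role": "user", "content": "Tell me a fun fact."}],
)
print(response.choices[0].message.content)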
The TensorZero Python SDK provides a unified API for calling any LLM.
1. Set up the credentials for your LLM provider

For example, if you’re using OpenAI, you can set the OPENAI_API_KEY environment variable to your API key.
export OPENAI_API_KEY="sk-..."
See the Integrations page to learn how to set up credentials for other LLM providers.
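If you prefer to set credentials from within your script (for example, after loading them from a secrets manager), you can set the environment variable in Python before initializing the gateway. This is plain Python, not a TensorZero-specific API; the key value below is a placeholder.

import os

# Set the provider credential before building the gateway client.
# Replace the placeholder with your real key; avoid hardcoding secrets in production code.
os.environ["OPENAI_API_KEY"] = "<your-openai-api-key>"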
2. Install the TensorZero Python SDK

You can install the TensorZero SDK with a Python package manager like pip.
pip install tensorzero
3. Initialize the TensorZero Gateway

Let’s initialize the TensorZero Gateway. For simplicity, we’ll use an embedded gateway without observability or custom configuration.
from tensorzero import TensorZeroGateway

t0 = TensorZeroGateway.build_embedded()
The TensorZero Python SDK includes a synchronous TensorZeroGateway client and an asynchronous AsyncTensorZeroGateway client. Both options support running the gateway embedded in your application with build_embedded or connecting to a standalone gateway with build_http. See Clients for more details.
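For example, here is a minimal sketch of the same inference call using the asynchronous client. It assumes the async builder returns an awaitable and that the client works as an async context manager, as in the SDK's examples; see Clients to confirm.

import asyncio

from tensorzero import AsyncTensorZeroGateway


async def main():
    # The async builder returns an awaitable that resolves to the client.
    async with await AsyncTensorZeroGateway.build_embedded() as t0:
        response = await t0.inference(
            model_name="openai::gpt-5-mini",
            input={"messages": [{"role": "user", "content": "Tell me a fun fact."}]},
        )
        print(response)


asyncio.run(main())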
4. Call the LLM

response = t0.inference(
    model_name="openai::gpt-5-mini",
    # or: model_name="anthropic::claude-sonnet-4-20250514"
    # or: Google, AWS, Azure, xAI, vLLM, Ollama, and many more
    input={
        "messages": [
            {
                "role": "user",
                "content": "Tell me a fun fact.",
            }
        ]
    },
)
Sample output:
ChatInferenceResponse(
    inference_id=UUID('0198d339-be77-74e0-b522-e08ec12d3831'),
    episode_id=UUID('0198d339-be77-74e0-b522-e09f578f34d0'),
    variant_name='openai::gpt-5-mini',
    content=[
        Text(
            text='Fun fact: Botanically, bananas are berries but strawberries are not. \n\nA true berry develops from a single ovary and has seeds embedded in the flesh—bananas fit that definition. Strawberries are "aggregate accessory fruits": the tiny seeds on the outside are each from a separate ovary.',
            arguments=None,
            type='text'
        )
    ],
    usage=Usage(input_tokens=12, output_tokens=261),
    finish_reason=FinishReason.STOP,
    original_response=None
)
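Continuing the example above, the generated text lives in the first content block of the response:

# `content` is a list of content blocks; for a simple chat response,
# the first block is a Text block containing the generated text.
print(response.content[0].text)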
See the Inference API Reference for more details on the request and response formats.