- Call any LLM with the same API. TensorZero unifies every major LLM API (e.g. OpenAI) and inference server (e.g. Ollama).
- Get started with a few lines of code. Later, you can optionally add observability, automatic fallbacks, A/B testing, and much more.
- Use any programming language. You can use TensorZero with its Python SDK, any OpenAI SDK (Python, Node, Go, etc.), or its HTTP API.
We provide complete code examples on GitHub for each of the following clients:
- Python
- Python (OpenAI SDK)
- Node (OpenAI SDK)
- HTTP
The TensorZero Python SDK provides a unified API for calling any LLM.
1. Set up the credentials for your LLM provider

For example, if you're using OpenAI, you can set the `OPENAI_API_KEY` environment variable with your API key. See the Integrations page to learn how to set up credentials for other LLM providers.
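For instance, after exporting the variable in your shell (e.g. `export OPENAI_API_KEY="sk-..."`), you can sanity-check that it's visible to your Python process before initializing the gateway. This is a minimal sketch, not part of the SDK:

```python
import os

# The gateway reads provider credentials from environment variables such as OPENAI_API_KEY.
if "OPENAI_API_KEY" not in os.environ:
    raise RuntimeError("OPENAI_API_KEY is not set")
```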
2. Install the TensorZero Python SDK

You can install the TensorZero SDK with a Python package manager like `pip`.
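If you use `pip`, the command would typically be `pip install tensorzero` (assuming the SDK is published under that name on PyPI).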
3. Initialize the TensorZero Gateway

Let's initialize the TensorZero Gateway. For simplicity, we'll use an embedded gateway without observability or custom configuration.

The TensorZero Python SDK includes a synchronous `TensorZeroGateway` client and an asynchronous `AsyncTensorZeroGateway` client. Both options support running the gateway embedded in your application with `build_embedded` or connecting to a standalone gateway with `build_http`. See Clients for more details.
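As a minimal sketch of the two construction paths (the keyword arguments shown, such as the gateway URL, are assumptions and may differ by version):

```python
from tensorzero import TensorZeroGateway

# Embedded gateway: runs inside your Python process, no separate service required.
embedded_client = TensorZeroGateway.build_embedded()

# Standalone gateway: connect over HTTP to a gateway you deployed separately.
# (Assumes the gateway is listening on localhost:3000.)
http_client = TensorZeroGateway.build_http(gateway_url="http://localhost:3000")
```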
4. Call the LLM
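Here's a minimal sketch of an inference call with the embedded client; the `inference` method parameters and the `openai::gpt-4o-mini` model shorthand are illustrative assumptions, not the only options:

```python
from tensorzero import TensorZeroGateway

with TensorZeroGateway.build_embedded() as client:
    response = client.inference(
        # Illustrative model shorthand; any configured model or function works here.
        model_name="openai::gpt-4o-mini",
        input={
            "messages": [
                {"role": "user", "content": "Tell me a fun fact."},
            ],
        },
    )
    print(response)
```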
Sample Response
See the Inference API Reference for more details on the request and response formats.