chat: the default choice for most LLM chat completion use cases
json: a specialized function type when your goal is generating structured outputs
As a rule of thumb, you should use JSON functions if you have a single, well-defined output schema.
If you need more flexibility (e.g. letting the model choose between multiple tools, or decide whether to call a tool at all), then chat functions with tool use might be a better fit.
1
Define your function

Create a configuration file that defines your JSON function with the output schema and JSON mode.
If you don’t specify an output_schema, the gateway will default to accepting any valid JSON output.
The field json_mode can be one of the following: off, on, strict, or tool.
The tool strategy is a custom TensorZero implementation that leverages tool use under the hood for generating JSON.
See Configuration Reference for details.
Use "strict" mode for providers that support it (e.g. OpenAI) or "tool" for others.
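As an illustrative sketch (the file layout and field names here are assumptions; see the Configuration Reference for the authoritative schema), a minimal JSON function might look like:

```toml
# tensorzero.toml — illustrative sketch, not a verbatim reference

[functions.extract_data]
type = "json"
output_schema = "functions/extract_data/output_schema.json"

[functions.extract_data.variants.baseline]
type = "chat_completion"
model = "openai::gpt-4o-mini"
system_template = "functions/extract_data/baseline/system_template.minijinja"
json_mode = "strict"
```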
2
Configure your output schema
If you choose to specify a schema, place it in the relevant file:
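For the name/email extraction example used below, an illustrative JSON Schema might look like the following (a sketch to adapt to your use case; note that some providers' strict modes require every property to appear in required and additionalProperties to be false):

```json
{
  "type": "object",
  "properties": {
    "name": { "type": ["string", "null"] },
    "email": { "type": ["string", "null"] }
  },
  "required": ["name", "email"],
  "additionalProperties": false
}
```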
3
Create the prompt template

Create a template that instructs the model to extract the information you need.
system_template.minijinja
```
You are a helpful AI assistant that extracts customer information from messages.

Extract the customer's name and email address if present. Use null for any fields that are not found.

Your output should be a JSON object with the following schema:

{
  "name": string or null,
  "email": string or null
}

---

Examples:

User: Hi, I'm Sarah Johnson and you can reach me at sarah.j@example.com
Assistant: {"name": "Sarah Johnson", "email": "sarah.j@example.com"}

User: My email is contact@company.com
Assistant: {"name": null, "email": "contact@company.com"}

User: This is John Doe reaching out
Assistant: {"name": "John Doe", "email": null}
```
Including examples in your prompt helps the model understand the expected output format and improves accuracy.
4
Call the function
Python
Node
HTTP
You can point the OpenAI Python SDK to a TensorZero Gateway to generate structured outputs.
The response content is the JSON string generated by the model.
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/openai/v1",
    api_key="unused",
)

response = client.chat.completions.create(
    model="tensorzero::function_name::extract_data",
    messages=[
        {
            "role": "user",
            "content": "Hi, I'm Sarah Johnson and you can reach me at sarah.j@example.com",
        }
    ],
)
```
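Because the content field is a plain JSON string rather than a parsed object, you'll typically decode it before use. A minimal sketch (the hardcoded sample string below stands in for the gateway's actual response content):

```python
import json

# In a real call you would parse the gateway response directly:
#   data = json.loads(response.choices[0].message.content)
# The sample string below stands in for that content.
content = '{"name": "Sarah Johnson", "email": "sarah.j@example.com"}'

data = json.loads(content)
print(data["name"])  # -> Sarah Johnson
```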
You can point the OpenAI Node SDK to a TensorZero Gateway to generate structured outputs.
The response content is the JSON string generated by the model.
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:3000/openai/v1",
  apiKey: "unused",
});

const response = await client.chat.completions.create({
  model: "tensorzero::function_name::extract_data",
  messages: [
    {
      role: "user",
      content: "Hi, I'm Sarah Johnson and you can reach me at sarah.j@example.com",
    },
  ],
});
```
You can call the TensorZero Gateway’s OpenAI-compatible endpoint directly with curl.
The response content is the JSON string generated by the model.
```bash
curl -X POST "http://localhost:3000/openai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tensorzero::function_name::extract_data",
    "messages": [
      {
        "role": "user",
        "content": "Hi, I'\''m Sarah Johnson and you can reach me at sarah.j@example.com"
      }
    ]
  }'
```
While we recommend specifying a fixed schema in the configuration whenever possible, you can provide the output schema dynamically at inference time if your use case demands it. See response_format in the Inference (OpenAI) API Reference. You can also override json_mode at inference time if necessary.
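As a hedged sketch of what a dynamic schema override might look like, the payload below follows OpenAI's response_format convention; the name and schema contents are illustrative, and you should consult the Inference (OpenAI) API Reference for the authoritative shape:

```python
import json

# Sketch only: build an OpenAI-style `response_format` payload with a
# dynamically chosen schema. All names here are illustrative.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "extract_data_output",
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": ["string", "null"]},
                "email": {"type": ["string", "null"]},
            },
            "required": ["name", "email"],
            "additionalProperties": False,
        },
    },
}

# You would then pass `response_format=response_format` to
# client.chat.completions.create(...); here we just confirm it serializes.
payload = json.dumps(response_format)
```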
Dynamic inference parameters like json_mode apply to specific variant types.
Unless you’re using an advanced variant type, the variant type will be chat_completion.
For the direct Anthropic provider, json_mode = "strict" automatically uses Anthropic's structured outputs feature for guaranteed schema compliance.

AWS Bedrock and GCP Vertex AI do not support Anthropic's structured outputs, so json_mode = "strict" falls back to prompt-based JSON mode. Use json_mode = "tool" for more reliable schema compliance on these providers.
For Anthropic’s extended thinking models, only json_mode = "strict" (direct Anthropic) or json_mode = "off" are compatible. Other modes use prefill or forced tool use, which conflict with thinking.
GCP Vertex AI Gemini and Google AI Studio support structured outputs, but only support a subset of the JSON Schema specification.
TensorZero automatically handles some known limitations, but certain output schemas will still be rejected by the model provider.
Refer to the Google documentation for details on supported JSON Schema features.
Some model providers (e.g. OpenAI, Google) support strictly enforcing output schemas natively, but others (e.g. AWS Bedrock) do not. For providers without native support, you can still generate structured outputs with json_mode = "tool".
TensorZero converts your output schema into a tool call, then transforms the tool response back into JSON output. You can set json_mode = "tool" in your configuration file or at inference time.
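For example, a variant targeting such a provider might opt into tool-based JSON generation like this (an illustrative sketch; the variant name and model identifier are assumptions, so check the Configuration Reference for the exact model naming scheme):

```toml
# Illustrative variant sketch: tool-based JSON generation for a provider
# without native structured outputs.
[functions.extract_data.variants.bedrock_fallback]
type = "chat_completion"
model = "aws_bedrock::anthropic.claude-3-5-sonnet-20241022-v2:0"
json_mode = "tool"
```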