TensorZero supports two function types:

- chat: the default choice for most LLM chat completion use cases
- json: a specialized function type for generating structured outputs
As a rule of thumb, use JSON functions if you have a single, well-defined output schema.
If you need more flexibility (e.g. letting the model pick between multiple tools, or decide whether to use a tool at all), then Chat Functions with tool use might be a better fit.
1. Create a configuration file

Create a configuration file that defines your JSON function with the output schema and JSON mode.
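For example, a minimal `tensorzero.toml` could look like the following. This is a sketch: the variant name and model are illustrative, and the file paths assume the layout used in the rest of this guide.

```toml
[functions.extract_data]
type = "json"
output_schema = "functions/extract_data/output_schema.json"

[functions.extract_data.variants.baseline]
type = "chat_completion"
model = "openai::gpt-4o-mini"
system_template = "functions/extract_data/system_template.minijinja"
json_mode = "strict"
```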
If you don’t specify an output_schema, the gateway will default to accepting any valid JSON output.
The field json_mode can be one of the following: off, on, strict, or tool.
The tool strategy is a custom TensorZero implementation that leverages tool use under the hood for generating JSON.
See Configuration Reference for details.
Use "strict" mode for providers that support it (e.g. OpenAI) or "tool" for others.
2. Configure your output schema
If you choose to specify a schema, place it in the file referenced by output_schema in your configuration:
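For the extraction task in this guide, the schema could look like the following (a sketch based on the fields used in the prompt template below; `required` and `additionalProperties` are choices you may want to adjust):

output_schema.json

```json
{
  "type": "object",
  "properties": {
    "name": { "type": ["string", "null"] },
    "email": { "type": ["string", "null"] }
  },
  "required": ["name", "email"],
  "additionalProperties": false
}
```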
3. Create a prompt template

Create a template that instructs the model to extract the information you need.
system_template.minijinja
```jinja
You are a helpful AI assistant that extracts customer information from messages.

Extract the customer’s name and email address if present. Use null for any fields that are not found.

Your output should be a JSON object with the following schema:

{
  "name": string or null,
  "email": string or null
}

---

Examples:

User: Hi, I'm Sarah Johnson and you can reach me at [email protected]
Assistant: {"name": "Sarah Johnson", "email": "[email protected]"}

User: My email is [email protected]
Assistant: {"name": null, "email": "[email protected]"}

User: This is John Doe reaching out
Assistant: {"name": "John Doe", "email": null}
```
Including examples in your prompt helps the model understand the expected output format and improves accuracy.
4. Call the function
Python
When using the TensorZero SDK, the response will include raw and parsed values.
The parsed field contains the validated JSON object.
If the output doesn’t match the schema or isn’t valid JSON, parsed will be None and you can fall back to the raw string output.
```python
from tensorzero import TensorZeroGateway

t0 = TensorZeroGateway.build_http(gateway_url="http://localhost:3000")

response = t0.inference(
    function_name="extract_data",
    input={
        "messages": [
            {
                "role": "user",
                "content": "Hi, I'm Sarah Johnson and you can reach me at [email protected]",
            }
        ]
    },
)
```
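For example, you might use the validated object when it’s available and fall back to the raw string otherwise (a sketch based on the raw and parsed fields described above):

```python
if response.output.parsed is not None:
    # The output was valid JSON and matched the output schema
    print(response.output.parsed["name"], response.output.parsed["email"])
else:
    # The model produced invalid JSON or violated the schema
    print(response.output.raw)
```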
Python (OpenAI SDK)

When using the OpenAI SDK, the response content is the JSON string generated by the model.
TensorZero does not return a validated object.
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/openai/v1",
    api_key="unused",
)

response = client.chat.completions.create(
    model="tensorzero::function_name::extract_data",
    messages=[
        {
            "role": "user",
            "content": "Hi, I'm Sarah Johnson and you can reach me at [email protected]",
        }
    ],
)
```
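Because the content is a plain JSON string, you’ll typically parse (and optionally validate) it yourself. A minimal sketch:

```python
import json

content = response.choices[0].message.content
try:
    data = json.loads(content)
except json.JSONDecodeError:
    data = None  # fall back to handling the raw string in `content`
```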
Node (OpenAI SDK)

When using the OpenAI SDK, the response content is the JSON string generated by the model.
TensorZero does not return a validated object.
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:3000/openai/v1",
  apiKey: "unused",
});

const response = await client.chat.completions.create({
  model: "tensorzero::function_name::extract_data",
  messages: [
    {
      role: "user",
      content: "Hi, I'm Sarah Johnson and you can reach me at [email protected]",
    },
  ],
});
```
HTTP

When using the TensorZero Inference API, the response will include raw and parsed values.
The parsed field contains the validated JSON object.
If the output doesn’t match the schema or isn’t valid JSON, parsed will be null and you can fall back to the raw string output.
```bash
curl -X POST "http://localhost:3000/inference" \
  -H "Content-Type: application/json" \
  -d '{
    "function_name": "extract_data",
    "input": {
      "messages": [
        {
          "role": "user",
          "content": "Hi, I'\''m Sarah Johnson and you can reach me at [email protected]"
        }
      ]
    }
  }'
```
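An abridged, illustrative response (IDs, token counts, and field values are placeholders; the email is elided):

```json
{
  "inference_id": "00000000-0000-0000-0000-000000000000",
  "episode_id": "00000000-0000-0000-0000-000000000000",
  "variant_name": "baseline",
  "output": {
    "raw": "{\"name\": \"Sarah Johnson\", \"email\": \"...\"}",
    "parsed": {
      "name": "Sarah Johnson",
      "email": "..."
    }
  },
  "usage": {
    "input_tokens": 100,
    "output_tokens": 10
  }
}
```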
While we recommend specifying a fixed schema in the configuration whenever possible, you can provide the output schema dynamically at inference time if your use case demands it.
See output_schema in the Inference API Reference or response_format in the Inference (OpenAI) API Reference.
You can also override json_mode at inference time if necessary.
Dynamic inference parameters like json_mode apply to specific variant types.
Unless you’re using an advanced variant type, the variant type will be chat_completion.
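For example, with the Python SDK (reusing the t0 client from above), both overrides might look like this. This is a sketch; see the API references above for the authoritative parameter shapes.

```python
response = t0.inference(
    function_name="extract_data",
    input={
        "messages": [
            {"role": "user", "content": "Hi, I'm Sarah Johnson"}
        ]
    },
    # Override the configured output schema for this inference only
    output_schema={
        "type": "object",
        "properties": {"name": {"type": ["string", "null"]}},
        "required": ["name"],
        "additionalProperties": False,
    },
    # Dynamic parameters are nested under the variant type (here, chat_completion)
    params={"chat_completion": {"json_mode": "strict"}},
)
```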
For the direct Anthropic provider, json_mode = "strict" automatically uses Anthropic’s structured outputs feature for guaranteed schema compliance.

AWS Bedrock and GCP Vertex AI do not support Anthropic’s structured outputs, so json_mode = "strict" falls back to prompt-based JSON mode. Use json_mode = "tool" for more reliable schema compliance on these providers.
For Anthropic’s extended thinking models, only json_mode = "strict" (direct Anthropic) or json_mode = "off" are compatible. Other modes use prefill or forced tool use, which conflict with thinking.
GCP Vertex AI Gemini and Google AI Studio support structured outputs, but only support a subset of the JSON Schema specification.
TensorZero automatically handles some known limitations, but certain output schemas will still be rejected by the model provider.
Refer to the Google documentation for details on supported JSON Schema features.
Some model providers (e.g. OpenAI, Google) support strictly enforcing output schemas natively, but others (e.g. AWS Bedrock) do not.
For providers without native support, you can still generate structured outputs with json_mode = "tool".
TensorZero converts your output schema into a tool call, then transforms the tool response back into JSON output.
You can set json_mode = "tool" in your configuration file or at inference time.
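For example, in the variant configuration from the earlier sketch, only the json_mode field changes:

```toml
[functions.extract_data.variants.baseline]
# ...other variant fields as in the earlier example...
json_mode = "tool"
```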