Example
You can also find the runnable code for this example on GitHub.
Each entry in the `inputs` field is equal to the `input` field in a regular inference request.
The response includes a `batch_id`, as well as `inference_ids` and `episode_ids` for each inference in the batch.
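As a rough sketch, submitting a batch job over HTTP and reading these IDs might look like the following. The gateway URL, function name, and input contents are assumptions for illustration (including the assumption that the job is started with `POST /batch_inference`); adjust them to match your setup.

```python
import requests

GATEWAY_URL = "http://localhost:3000"  # assumed local TensorZero gateway address

payload = {
    "function_name": "my_function_name",  # hypothetical function name
    # Each entry in `inputs` mirrors the `input` field of a regular inference request.
    "inputs": [
        {"messages": [{"role": "user", "content": "What is the capital of France?"}]},
        {"messages": [{"role": "user", "content": "What is the capital of Japan?"}]},
    ],
}

# Assumption: the batch job is started with POST /batch_inference.
response = requests.post(f"{GATEWAY_URL}/batch_inference", json=payload)
response.raise_for_status()
body = response.json()

batch_id = body["batch_id"]            # used to poll for results later
inference_ids = body["inference_ids"]  # one per input
episode_ids = body["episode_ids"]      # one per input
print(batch_id, inference_ids, episode_ids)
```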
You can use the `batch_id` to poll for the status of the job or retrieve the results using the `GET /batch_inference/{batch_id}` endpoint.
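A minimal polling loop against this endpoint might look like the sketch below. The polling interval is an arbitrary choice, and the loop relies on the `status` field described next.

```python
import time

import requests

GATEWAY_URL = "http://localhost:3000"  # assumed local TensorZero gateway address


def poll_batch(batch_id: str, interval_s: float = 10.0) -> dict:
    """Poll GET /batch_inference/{batch_id} until the job is no longer pending."""
    while True:
        response = requests.get(f"{GATEWAY_URL}/batch_inference/{batch_id}")
        response.raise_for_status()
        body = response.json()
        if body["status"] != "pending":
            return body
        time.sleep(interval_s)  # arbitrary polling interval
```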
While the job is still pending, the response only includes the `status` field.
Once the job completes, the response includes the `status` field and the `inferences` field. Each inference object is the same as the response from a regular inference request.
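Putting it together, consuming a completed batch might look like this sketch. The `inference_id` and `content` fields assume a chat function's regular response shape.

```python
# `poll_batch` and `batch_id` come from the sketches above.
result = poll_batch(batch_id)

if result["status"] == "completed":
    for inference in result["inferences"]:
        # Each object has the same shape as a regular inference response;
        # `content` here assumes a chat function's response format.
        print(inference["inference_id"], inference["content"])
```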
Technical Notes
- Observability
  - For now, pending batch inference jobs are not shown in the TensorZero UI. You can find the relevant information in the `BatchRequest` and `BatchModelInference` tables on ClickHouse. See Data Model for more information.
  - Inferences from completed batch inference jobs are shown in the UI alongside regular inferences.
- Experimentation
  - The gateway samples the same variant for the entire batch.
- Python Client
  - The TensorZero Python client doesn’t natively support batch inference yet. You’ll need to submit batch requests using HTTP requests, as shown above.