New
2026.1.5
> [!CAUTION]
> **Breaking Changes**

- TensorZero will normalize the reported usage from different model providers. Moving forward, `input_tokens` and `output_tokens` include all token variations (provider prompt caching, reasoning, etc.), just like OpenAI. Tokens cached by TensorZero remain excluded. You can still access the raw usage reported by providers with `include_raw_usage`.
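As a sketch of how the raw usage can still be retrieved, a request to the gateway's inference endpoint might set `include_raw_usage` alongside the usual fields (the function name, input shape, and field placement here are assumptions for illustration, not a definitive request schema):

```json
{
  "function_name": "your_function",
  "input": {
    "messages": [{ "role": "user", "content": "Hello" }]
  },
  "include_raw_usage": true
}
```

With this flag set, the response would carry both the normalized `input_tokens`/`output_tokens` and the raw usage as reported by the provider.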
> [!WARNING]
> **Planned Deprecations**

- Migrate `include_original_response` to `include_raw_response`. For advanced variant types, the former only returned the last model inference, whereas the latter returns every model inference with associated metadata.
- Migrate `allow_auto_detect_region = true` to `region = "sdk"` when configuring AWS model providers. The behavior is identical.
- Provide the proper API base rather than the full endpoint when configuring custom Anthropic providers. Example:
  - Before: `api_base = "https://YOUR-RESOURCE-NAME.services.ai.azure.com/anthropic/v1/messages"`
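The AWS region migration above can be sketched as a one-line config change; the table path and surrounding keys are hypothetical placeholders, and only `allow_auto_detect_region` and `region` come from this changelog:

```toml
# Before (deprecated): let the provider auto-detect the region
[models.my_model.providers.bedrock]
type = "aws_bedrock"
allow_auto_detect_region = true

# After: identical behavior, explicit about deferring to the SDK
[models.my_model.providers.bedrock]
type = "aws_bedrock"
region = "sdk"
```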