OpenAI
Streaming
Streaming is fully supported. Usage is captured from the final chunk:Anthropic
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
customer_id | str | Yes | Customer to bill for this usage |
metric_code | str | No | Billable metric code (default: "llm_tokens") |
feature | str | No | Feature tag for entitlement tracking |
What gets tracked
Each LLM call sends a single event vianozle.track() with these properties:
| Property | Source | Description |
|---|---|---|
model | Response | Model name (e.g. gpt-4o, claude-sonnet-4-20250514) |
input_tokens | Response usage | Prompt/input token count |
output_tokens | Response usage | Completion/output token count |
latency_ms | Measured | End-to-end call duration |
feature | wrap options | Feature tag (if provided) |
The SDK does not calculate costs. The Go engine matches the
model property against your cost models with per_model type and calculates cost_cents server-side. Make sure you have a cost model configured for the llm_tokens metric with rates for your models.