# LLM Auto-Capture

> Automatically track LLM token usage for billing

Nozle's LLM wrappers intercept OpenAI and Anthropic API calls, extract token usage, and automatically send billing events, so no manual tracking code is needed.

Cost calculation happens **server-side** via the Go engine's [cost model](/guides/margin/cost-models) system. The SDK sends only raw usage metadata (model name, token counts, latency), never a computed cost.

## OpenAI

```bash
npm install openai  # peer dependency, >=4.0.0
```

```typescript
import OpenAI from 'openai';
import { Nozle, wrapOpenAI } from '@nozle-js/node';

const nozle = new Nozle({ apiKey: 'sk_live_...' });
const openai = wrapOpenAI(new OpenAI(), nozle, {
  customerId: 'cust_123',
  feature: 'code_completion',   // optional: tag for entitlement tracking
  metricCode: 'llm_tokens',     // optional: defaults to "llm_tokens"
});

// Use OpenAI normally — tracking happens automatically
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello' }],
});
```

### Streaming

Streaming is fully supported. Usage is captured from the final chunk:

```typescript
const stream = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Explain quantum computing' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
// Token usage is automatically tracked after the stream completes
```
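
A minimal sketch of how usage could be read off a stream without taking the chunks away from the caller. This is not Nozle's actual implementation: `tapUsage`, `StreamChunk`, and `onUsage` are hypothetical names. It assumes chunks follow the OpenAI shape, where a `usage` object arrives on the last chunk (with an empty `choices` array) when the request sets `stream_options: { include_usage: true }`.

```typescript
// Hypothetical chunk shape, modeled on OpenAI's streaming chunks.
type Usage = { prompt_tokens: number; completion_tokens: number };

interface StreamChunk {
  choices: { delta?: { content?: string | null } }[];
  usage?: Usage | null;
}

// Re-yields every chunk unchanged and reports the last usage object seen
// once the source stream is exhausted.
async function* tapUsage(
  source: AsyncIterable<StreamChunk>,
  onUsage: (usage: Usage) => void,
): AsyncGenerator<StreamChunk> {
  let lastUsage: Usage | undefined;
  for await (const chunk of source) {
    if (chunk.usage) lastUsage = chunk.usage;
    yield chunk;
  }
  // Only report if the provider actually sent usage.
  if (lastUsage) onUsage(lastUsage);
}
```

In this sketch nothing is reported if the caller stops iterating early, because the code after the loop never runs. A production wrapper would need to decide how to handle abandoned streams.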

## Anthropic

```bash
npm install @anthropic-ai/sdk  # peer dependency, >=0.30.0
```

```typescript
import Anthropic from '@anthropic-ai/sdk';
import { Nozle, wrapAnthropic } from '@nozle-js/node';

const nozle = new Nozle({ apiKey: 'sk_live_...' });
const anthropic = wrapAnthropic(new Anthropic(), nozle, {
  customerId: 'cust_123',
  feature: 'code_completion',
});

const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello' }],
});
```

## WrapOptions

| Field        | Type   | Required | Description                                    |
| ------------ | ------ | -------- | ---------------------------------------------- |
| `customerId` | string | Yes      | Customer to bill for this usage                |
| `metricCode` | string | No       | Billable metric code (default: `"llm_tokens"`) |
| `feature`    | string | No       | Feature tag for entitlement tracking           |

## What gets tracked

Each LLM call sends a single event via `nozle.track()` with these properties:

| Property        | Source         | Description                                            |
| --------------- | -------------- | ------------------------------------------------------ |
| `model`         | Response       | Model name (e.g. `gpt-4o`, `claude-sonnet-4-20250514`) |
| `input_tokens`  | Response usage | Prompt/input token count                               |
| `output_tokens` | Response usage | Completion/output token count                          |
| `latency_ms`    | Measured       | End-to-end call duration                               |
| `feature`       | WrapOptions    | Feature tag (if provided)                              |

<Info>
  The SDK does **not** calculate costs. The Go engine matches the `model` property against your [cost models](/guides/margin/cost-models) of type `per_model` and calculates `cost_cents` server-side. Make sure a cost model is configured for the metric you track against (`llm_tokens` by default), with rates for every model you call.
</Info>
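
To make the division of labor concrete, here is the kind of arithmetic a per-model cost model implies. This is only an illustration in TypeScript: the real calculation runs in the Go engine, and the rate table, its field names, the per-million-token unit, and the numbers below are placeholders invented for the example, not Nozle's schema or any provider's actual pricing.

```typescript
// Placeholder rates in cents per 1,000,000 tokens (made-up values).
interface ModelRate {
  inputCentsPerMillion: number;
  outputCentsPerMillion: number;
}

const exampleRates: Record<string, ModelRate> = {
  'example-model': { inputCentsPerMillion: 200, outputCentsPerMillion: 800 },
};

// Returns the cost in cents, or undefined when no rate matches the model.
function estimateCostCents(
  model: string,
  inputTokens: number,
  outputTokens: number,
  rates: Record<string, ModelRate> = exampleRates,
): number | undefined {
  const rate = rates[model];
  if (!rate) return undefined;
  return (
    (inputTokens * rate.inputCentsPerMillion +
      outputTokens * rate.outputCentsPerMillion) /
    1_000_000
  );
}

// 500,000 input tokens and 250,000 output tokens at the placeholder rates:
// (500000 * 200 + 250000 * 800) / 1000000 = 300 cents
const exampleCents = estimateCostCents('example-model', 500_000, 250_000);
```

How the engine treats a model with no matching rate is not covered here; the sketch simply returns `undefined` in that case.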

## Privacy

Wrappers **never** capture prompt content or completion text. Only metadata (model name, token counts, latency, and the optional feature tag) is sent, so no PII from prompts or completions passes through the billing pipeline.

## Manual tracking

If you prefer manual control or use a provider without a wrapper, you can track LLM usage directly:

```typescript
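import OpenAI from 'openai';
import { Nozle } from '@nozle-js/node';

const nozle = new Nozle({ apiKey: 'sk_live_...' });
// Plain (unwrapped) client: a wrapped client would already track this call
const openai = new OpenAI();

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello' }],
});

// Report usage yourself; fall back to 0 if the response carries no usage
await nozle.track('cust_123', 'llm_tokens', {
  model: response.model,
  input_tokens: response.usage?.prompt_tokens ?? 0,
  output_tokens: response.usage?.completion_tokens ?? 0,
});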
```
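
The wrappers also record `latency_ms`, which the manual example above omits. Below is a sketch of a provider-agnostic helper that times the call and includes it. `trackLlmCall`, `TrackFn`, and `LlmResult` are hypothetical names, and `track` is assumed to take the same `(customerId, metricCode, properties)` arguments as the `nozle.track()` call shown above.

```typescript
// Assumed to mirror nozle.track(customerId, metricCode, properties).
type TrackFn = (
  customerId: string,
  metricCode: string,
  properties: Record<string, unknown>,
) => Promise<unknown>;

// What the caller must extract from its provider's response.
interface LlmResult<T> {
  response: T;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

async function trackLlmCall<T>(
  track: TrackFn,
  customerId: string,
  call: () => Promise<LlmResult<T>>,
  metricCode = 'llm_tokens',
): Promise<T> {
  const startedAt = Date.now();
  const result = await call();
  const latencyMs = Date.now() - startedAt;

  // Only metadata is sent; the response itself goes back to the caller untouched.
  await track(customerId, metricCode, {
    model: result.model,
    input_tokens: result.inputTokens,
    output_tokens: result.outputTokens,
    latency_ms: latencyMs,
  });
  return result.response;
}
```

With the SDK, `track` might be supplied as `(c, m, p) => nozle.track(c, m, p)`, and `call` would wrap whichever provider client you use and map its usage fields onto `inputTokens` and `outputTokens`.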
