Usage & analytics

See what you spend, the mix of request types your traffic generates, and which application made each call — with no client setup beyond an optional header.

Usage & spend

The Overview summarizes requests, input and output tokens, and spend over a time window, each with a trend against the prior period. When caching is active it adds a savings strip and a cache timeline.

Analytics by request type

DI Analytics breaks your traffic down by the request type each call was classified as — the same vocabulary returned per request in the X-DI-Request-Type header. The model behind each type stays private; the type itself is yours to analyze.

Request types: vision, document, long, code, json, reason, flash, pro.
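Because each response carries its classification in the X-DI-Request-Type header, a similar breakdown can be approximated on the client side. The following is a minimal sketch, not the platform's own implementation: it assumes you have already collected the response headers of your calls as plain mappings. The helper names `request_type_of` and `tally_request_types`, the "unknown" bucket for responses missing the header, and the sample values are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Mapping

REQUEST_TYPE_HEADER = "X-DI-Request-Type"


def request_type_of(headers: Mapping[str, str]) -> str:
    """Return the request type from one response's headers.

    Falls back to "unknown" (an assumption of this sketch) when the
    header is absent.
    """
    # HTTP header names are case-insensitive, so compare in lowercase.
    wanted = REQUEST_TYPE_HEADER.lower()
    for name, value in headers.items():
        if name.lower() == wanted:
            return value.strip()
    return "unknown"


def tally_request_types(responses: Iterable[Mapping[str, str]]) -> Counter:
    """Count how many responses fell into each request type."""
    return Counter(request_type_of(headers) for headers in responses)


# Hypothetical headers captured from three calls (illustrative values only).
sample = [
    {"X-DI-Request-Type": "code"},
    {"x-di-request-type": "vision"},
    {"X-DI-Request-Type": "code"},
]
counts = tally_request_types(sample)
print(counts.most_common())
```

A tally like this is only a local cross-check; the analytics view remains the source of truth for the breakdown.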

Traces

Traces lists individual requests — timestamp, endpoint, the model id you sent, request type, tokens, cost, cache breakdown, and latency — and each row expands to the raw request and response. Consistent with the rest of the platform, a trace shows your call and how it was classified, never the backend model that served it.

Per-application attribution

Usage is automatically grouped by application so one key can power several apps and still report separately. The label comes from X-Title if you send it, otherwise the request’s Referer host, falling back to the API key’s name — so attribution works even with zero setup.

from openai import OpenAI

client = OpenAI(
    api_key="llm_live_...",
    base_url="https://app.directinference.com/di/v1",
    default_headers={"X-Title": "billing-worker"},   # names this app in analytics
)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "llm_live_...",
  baseURL: "https://app.directinference.com/di/v1",
  defaultHeaders: { "X-Title": "billing-worker" }, // names this app in analytics
});

curl https://app.directinference.com/di/v1/chat/completions \
  -H "Authorization: Bearer llm_live_..." \
  -H "X-Title: billing-worker" \
  -H "Content-Type: application/json" \
  -d '{ "model": "gpt-5.5-mini", "messages": [{ "role": "user", "content": "..." }] }'
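
The fallback order described in this section (X-Title first, then the Referer host, then the API key's name) can be sketched as a small function. This only illustrates the documented precedence and is not the platform's actual code: the function name `resolve_app_label`, the helper `_header`, and the choice to treat a blank X-Title or an unparseable Referer as absent are assumptions of this sketch.

```python
from typing import Mapping, Optional
from urllib.parse import urlsplit


def _header(headers: Mapping[str, str], name: str) -> Optional[str]:
    """Case-insensitive header lookup; returns None if missing or blank."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted and value.strip():
            return value.strip()
    return None


def resolve_app_label(headers: Mapping[str, str], api_key_name: str) -> str:
    """Pick the application label using the documented precedence."""
    # 1. An explicit X-Title header wins when it is sent.
    title = _header(headers, "X-Title")
    if title:
        return title

    # 2. Otherwise use the host portion of the Referer, if one can be parsed.
    referer = _header(headers, "Referer")
    if referer:
        host = urlsplit(referer).hostname
        if host:
            return host

    # 3. With no usable header, fall back to the API key's name.
    return api_key_name


# Hypothetical requests illustrating each branch.
explicit = resolve_app_label({"X-Title": "billing-worker"}, "prod-key")
from_referer = resolve_app_label(
    {"Referer": "https://dashboard.example.com/invoices"}, "prod-key"
)
zero_setup = resolve_app_label({}, "prod-key")
```

Under these assumptions, the three calls resolve to the X-Title value, the Referer host, and the key name respectively, which is why sending X-Title is optional.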