Response headers

DirectInference returns one header that lets you observe how a request was handled — without ever exposing the model behind it.

Every response on the DI surface carries an X-DI-Request-Type header naming the request type the call was classified as — pro, code, vision, document, long, json, reason, or flash. It is a fact about your own request (a PDF makes it a document request), so it is safe to surface; the model that served it stays private.

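# The OpenAI SDK exposes raw response headers via with_raw_response.
from openai import OpenAI

# Assumption: base_url and api_key are placeholders, not real values.
# Substitute your own DirectInference endpoint and key.
client = OpenAI(base_url="https://example.invalid/v1", api_key="YOUR_API_KEY")

raw = client.chat.completions.with_raw_response.create(
    model="gpt-5.5-mini",
    messages=[{"role": "user", "content": "Refactor this function."}],
)
print(raw.headers.get("x-di-request-type"))  # -> "code"
resp = raw.parse()  # the usual typed completion
print(resp.choices[0].message.content)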
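
Once the header is in hand, a common next step is to validate it and aggregate it for monitoring. The following is a minimal sketch of one way to do that, not part of any DirectInference SDK: the helper name request_type_from_headers and the constant KNOWN_REQUEST_TYPES are hypothetical, and lower-casing the value before comparison is an assumption. The set of types is taken from the list above, and plain dicts stand in for real response headers.

```python
from collections import Counter
from typing import Mapping, Optional

# The eight request types named in the documentation text.
KNOWN_REQUEST_TYPES = frozenset(
    {"pro", "code", "vision", "document", "long", "json", "reason", "flash"}
)


def request_type_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the X-DI-Request-Type value, or None if absent or unrecognised."""
    # HTTP header names are case-insensitive, so compare in lower case.
    for name, value in headers.items():
        if name.lower() == "x-di-request-type":
            # Assumption: values are compared after trimming and lower-casing.
            value = value.strip().lower()
            return value if value in KNOWN_REQUEST_TYPES else None
    return None


# Tally request types across a batch of responses.
# Plain dicts stand in for the headers of real responses here.
sample_headers = [
    {"X-DI-Request-Type": "code"},
    {"x-di-request-type": "document"},
    {"x-di-request-type": "code"},
]
tally = Counter(request_type_from_headers(h) for h in sample_headers)
print(tally.most_common())
```

With the OpenAI SDK call shown earlier, the same helper could be given raw.headers in place of a dict. Returning None for an unknown value, rather than raising, keeps a monitoring path from failing if a value outside this list ever appears.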

There is no resolved-model header. Other services hand back the model they picked; DirectInference does not: not in a header, not in the body, not in a trailer. The response echoes the model id you sent and reveals nothing about the candidate, provider, version, cost, or trace that produced it.
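
In practice this means a client-side log entry can record only two things about routing: the model id that was requested (and echoed back) and the request type. The sketch below illustrates that under stated assumptions: observation_record is a hypothetical helper, and the field names in the returned dict are illustrative choices rather than anything DirectInference defines.

```python
from typing import Mapping


def observation_record(
    requested_model: str,
    response_model: str,
    headers: Mapping[str, str],
) -> dict:
    """Build a log record from the only routing facts the response exposes."""
    # Normalise header names, since HTTP header names are case-insensitive.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "requested_model": requested_model,
        # Per the text above, the response should echo the id that was sent.
        "model_echoed": response_model == requested_model,
        "request_type": lowered.get("x-di-request-type"),
    }


record = observation_record(
    "gpt-5.5-mini",
    "gpt-5.5-mini",
    {"X-DI-Request-Type": "code"},
)
print(record)
```

With the OpenAI SDK, response_model would come from the parsed completion's model field and headers from the raw response.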