DirectInference is a drop-in endpoint. Point your existing OpenAI, Anthropic, or Gemini client at it, keep the model string you already send, and each request is served by the model best suited to its shape.
There is one model — di. You never pick a backing model: every call is classified into a request type by its shape, and each request type invokes the model best suited to it. The request type is visible information about your call; the model, candidate, and provider that serve it always stay private.
| Client | Base URL |
|---|---|
| OpenAI-compatible | https://app.directinference.com/di/v1 |
| Anthropic SDK | https://app.directinference.com/di — the SDK appends /v1/messages itself |
| Gemini (Google GenAI SDK) | https://app.directinference.com/di — the SDK appends /v1beta/models/… |
Each surface is documented in full under API surfaces. The authentication header differs per client — see Authentication.
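As a rough sketch of how the base URLs in the table relate to the paths each client ends up calling, the snippet below joins each base URL with the suffix its SDK appends. `BASE_URLS`, `SDK_SUFFIXES`, and `endpoint_for` are hypothetical names introduced for illustration only. The `/chat/completions` suffix is the one used by the curl example on this page, and the Gemini entry stops at the path prefix because the table elides the rest.

```python
# Base URLs copied from the table above; the dictionary keys are
# illustrative labels, not identifiers defined by DirectInference.
BASE_URLS = {
    "openai": "https://app.directinference.com/di/v1",
    "anthropic": "https://app.directinference.com/di",
    "gemini": "https://app.directinference.com/di",
}

# Path each client adds to its base URL. The Gemini suffix is only a
# prefix: the remainder of the path is not spelled out in the table.
SDK_SUFFIXES = {
    "openai": "/chat/completions",
    "anthropic": "/v1/messages",
    "gemini": "/v1beta/models/",
}


def endpoint_for(client: str) -> str:
    """Return the URL (or URL prefix, for Gemini) a client would request."""
    return BASE_URLS[client].rstrip("/") + SDK_SUFFIXES[client]


if __name__ == "__main__":
    for name in BASE_URLS:
        print(f"{name}: {endpoint_for(name)}")
```

The point of the sketch is that the OpenAI-compatible base URL already ends in `/v1`, while the Anthropic and Gemini base URLs stop at `/di` because those SDKs supply their own version segment.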
The smallest possible call over the OpenAI-compatible surface. Only two things change from a normal integration: the base URL and the API key.
```python
from openai import OpenAI

client = OpenAI(
    api_key="llm_live_...",
    base_url="https://app.directinference.com/di/v1",
)

resp = client.chat.completions.create(
    model="gpt-5.5-mini",  # keep the id your app already sends
    messages=[{"role": "user", "content": "Say hello from DirectInference."}],
)

print(resp.choices[0].message.content)
print(resp.model)  # -> "gpt-5.5-mini" (echoed back)
```

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "llm_live_...",
  baseURL: "https://app.directinference.com/di/v1",
});

const resp = await client.chat.completions.create({
  model: "gpt-5.5-mini", // keep the id your app already sends
  messages: [{ role: "user", content: "Say hello from DirectInference." }],
});

console.log(resp.choices[0].message.content);
console.log(resp.model); // -> "gpt-5.5-mini" (echoed back)
```

```bash
curl https://app.directinference.com/di/v1/chat/completions \
  -H "Authorization: Bearer llm_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5-mini",
    "messages": [{ "role": "user", "content": "Say hello from DirectInference." }]
  }'
```

```go
package main

import (
	"context"
	"fmt"

	"github.com/openai/openai-go"
	"github.com/openai/openai-go/option"
)

func main() {
	client := openai.NewClient(
		option.WithAPIKey("llm_live_..."),
		option.WithBaseURL("https://app.directinference.com/di/v1"),
	)

	resp, err := client.Chat.Completions.New(context.TODO(), openai.ChatCompletionNewParams{
		Model: "gpt-5.5-mini", // keep the id your app already sends
		Messages: []openai.ChatCompletionMessageParamUnion{
			openai.UserMessage("Say hello from DirectInference."),
		},
	})
	if err != nil {
		panic(err)
	}

	fmt.Println(resp.Choices[0].Message.Content)
	fmt.Println(resp.Model) // -> "gpt-5.5-mini" (echoed back)
}
```

Your model id is echoed back
Responses carry the exact model string you sent. Logging, dashboards, and eval pipelines keep working unchanged.
Unknown ids never error
Legacy, renamed, and not-yet-released model ids are all accepted. Code written against a model that no longer exists keeps running.
No model identity leaks
Internal classification is hidden. Which model, candidate, and provider served a request stays private by design.
Three SDK shapes, one endpoint
OpenAI chat completions, the Anthropic Messages API, and Gemini generateContent all work. You never commit to a single vendor.
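To make the three request shapes concrete, here is a minimal sketch that builds the JSON body for the same prompt in each format. The field layouts follow the publicly documented OpenAI chat completions, Anthropic Messages, and Gemini generateContent request formats. The helper names, the `max_tokens` default, and the placeholder Anthropic model id are assumptions made for illustration and do not come from the DirectInference documentation.

```python
import json


def openai_chat_body(model: str, prompt: str) -> dict:
    """Body for an OpenAI-style chat completions request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def anthropic_messages_body(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Body for an Anthropic Messages request.

    The Messages format requires max_tokens; 256 is an arbitrary example value.
    """
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }


def gemini_generate_content_body(prompt: str) -> dict:
    """Body for a Gemini generateContent request.

    Gemini carries the model id in the URL path, so the body holds only the content.
    """
    return {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}


if __name__ == "__main__":
    prompt = "Say hello from DirectInference."
    print(json.dumps(openai_chat_body("gpt-5.5-mini", prompt), indent=2))
    # Placeholder id: substitute whatever model string your app already sends.
    print(json.dumps(anthropic_messages_body("your-existing-model-id", prompt), indent=2))
    print(json.dumps(gemini_generate_content_body(prompt), indent=2))
```

In practice the official SDKs build these bodies for you. The sketch only shows that the same prompt travels in three different envelopes.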
Beyond the three API surfaces, the portal handles request classification, caching, usage analytics, and cost control — all without changing how you call the endpoint.
Ready to send a request?
Issue a key, then walk through your first call: API Keys · Quickstart.