Migrate to DirectInference
Already calling a model vendor directly, or through another service? Switching is a base-URL change — your SDK, model ids, and request code stay exactly as they are.
Change the base URL
DirectInference speaks the OpenAI, Anthropic, and Gemini wire formats, so you keep your existing client. Point it at the matching base URL and authenticate with your llm_live_… key.
| Your client | Base URL |
|---|---|
| OpenAI-compatible — openai, LangChain, LiteLLM, @ai-sdk/openai | https://app.directinference.com/di/v1 |
| Anthropic — anthropic, @anthropic-ai/sdk, @ai-sdk/anthropic | https://app.directinference.com/di (the SDK appends /v1/messages itself) |
| Gemini — google-genai, @google/genai | https://app.directinference.com/di (the SDK appends /v1beta/models/… itself) |
| Gemini — Vercel @ai-sdk/google | https://app.directinference.com/di/v1beta (this provider already includes /v1beta) |
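To make the table concrete, here is a minimal sketch of how each base URL combines with the path a client appends. The surface keys and the `endpoint` helper are illustrative names, not part of any SDK; the appended paths are the ones named on this page.

```python
# Base URLs copied from the table above, keyed by illustrative surface names.
BASE_URLS = {
    "openai": "https://app.directinference.com/di/v1",
    "anthropic": "https://app.directinference.com/di",
    "gemini": "https://app.directinference.com/di",
    "ai-sdk-google": "https://app.directinference.com/di/v1beta",
}


def endpoint(surface: str, appended_path: str) -> str:
    """Join a surface's base URL with the path its client appends."""
    base = BASE_URLS[surface].rstrip("/")
    return f"{base}/{appended_path.lstrip('/')}"


print(endpoint("openai", "/chat/completions"))
# → https://app.directinference.com/di/v1/chat/completions
print(endpoint("anthropic", "/v1/messages"))
# → https://app.directinference.com/di/v1/messages
```

The Anthropic and Gemini rows are the easy ones to get wrong: because those SDKs append the version segment themselves, a base URL that already ends in /v1 would yield a doubled segment such as /di/v1/v1/messages.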
The OpenAI-compatible swap, in full — the diff is two lines:
```python
from openai import OpenAI

client = OpenAI(
    api_key="llm_live_...",  # your DirectInference key
    base_url="https://app.directinference.com/di/v1",  # the only required change
)

# Everything below is untouched: same model id, same parameters.
resp = client.chat.completions.create(
    model="gpt-5.5-mini",
    messages=[{"role": "user", "content": "Ship it."}],
)
```

```ts
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "llm_live_...", // your DirectInference key
  baseURL: "https://app.directinference.com/di/v1", // the only required change
});

// Everything below is untouched: same model id, same parameters.
const resp = await client.chat.completions.create({
  model: "gpt-5.5-mini",
  messages: [{ role: "user", content: "Ship it." }],
});
```

```sh
# Point the host at DirectInference and send your key. Nothing else changes.
curl https://app.directinference.com/di/v1/chat/completions \
  -H "Authorization: Bearer llm_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5-mini",
    "messages": [{ "role": "user", "content": "Ship it." }]
  }'
```

```go
client := openai.NewClient(
	option.WithAPIKey("llm_live_..."), // your DirectInference key
	option.WithBaseURL("https://app.directinference.com/di/v1"), // the only required change
)

// Everything below is untouched: same model id, same parameters.
resp, err := client.Chat.Completions.New(context.TODO(), openai.ChatCompletionNewParams{
	Model: "gpt-5.5-mini",
	Messages: []openai.ChatCompletionMessageParamUnion{
		openai.UserMessage("Ship it."),
	},
})
```

Per-surface details (auth headers, streaming, tools, documents) live under API surfaces.
What stays the same
The migration is deliberately boring. None of your application logic has to move.
What changes for the better
The difference from a transparent service is the point: you stop managing models and start getting outcomes.
| A typical service or provider | DirectInference |
|---|---|
| Returns the model it picked; you keep tracking model slugs. | Echoes your id back. The model stays hidden and can change at any time — your code never does. |
| You build model-selection logic or assemble a model pool. | Nothing to build. Each call is classified from its shape automatically. |
| A 0–10 cost-vs-quality slider to tune. | One optional effort knob (auto by default). Capability needs always win. |
| Pin a session to a model to preserve its cache. | The endpoint adapts per request; caching works with no session to pin. |
| A separate key and bill per provider. | One key, one balance, across every surface. |
| Maintain an allowlist of valid model names. | Any id resolves — legacy, renamed, or not-yet-released — and never 404s. |
Migration checklist
- Issue a key on the API Keys page.
- Point your client’s base URL at the surface you use (table above).
- Swap in the new key; keep every `model` id your app already sends.
- Optional: set `X-DI-Effort` to bias cost vs. quality; see Effort.
- Optional: set `X-Title` to segment usage by app; see Usage & analytics.
- Set a spend cap and you’re live.
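
The checklist can be sketched as a small request builder that uses only the Python standard library. This is an illustration under assumptions rather than a reference client: the `DI_API_KEY` environment variable, the `build_request` helper, and the `my-app` title are hypothetical names, and `auto` is the only effort value this page confirms.

```python
import json
import os
import urllib.request

BASE_URL = "https://app.directinference.com/di/v1"  # OpenAI-compatible surface


def build_request(model, messages, api_key, effort=None, title=None):
    """Build (but do not send) a chat completions request.

    `effort` and `title` map to the optional X-DI-Effort and X-Title headers.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if effort is not None:
        headers["X-DI-Effort"] = effort
    if title is not None:
        headers["X-Title"] = title
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    # urllib normalizes header capitalization; HTTP header names are case-insensitive.
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions", data=body, headers=headers, method="POST"
    )


# DI_API_KEY is an assumed variable name; the fallback is a placeholder, not a real key.
req = build_request(
    model="gpt-5.5-mini",  # keep the id your app already sends
    messages=[{"role": "user", "content": "Ship it."}],
    api_key=os.environ.get("DI_API_KEY", "llm_live_..."),
    effort="auto",
    title="my-app",
)
# Sending is left out on purpose; urllib.request.urlopen(req) would issue the call.
```

If you use the openai Python SDK instead, the same optional headers can likely be attached through the client's `default_headers` argument or a per-request `extra_headers` argument.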