Migrate to DirectInference

Already calling a model vendor directly, or through another service? Switching takes a new base URL and a DirectInference key. Your SDK, model ids, and request code stay exactly as they are.

Change the base URL

DirectInference speaks the OpenAI, Anthropic, and Gemini wire formats, so you keep your existing client. Point it at the matching base URL and authenticate with your llm_live_… key.

Your client	Base URL
OpenAI-compatible — openai, LangChain, LiteLLM, @ai-sdk/openai	`https://app.directinference.com/di/v1`
Anthropic — anthropic, @anthropic-ai/sdk, @ai-sdk/anthropic	`https://app.directinference.com/di` — the SDK appends /v1/messages itself
Gemini — google-genai, @google/genai	`https://app.directinference.com/di` — the SDK appends /v1beta/models/… itself
Gemini — Vercel @ai-sdk/google	`https://app.directinference.com/di/v1beta` — this provider already includes /v1beta
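The table can be read as a lookup from client family to base URL. The sketch below is illustrative only: the `BASE_URLS` and `APPENDED_PATHS` dicts, the family labels, and `request_url_prefix` are hypothetical names, not part of any SDK. The appended paths are taken from the table and from the curl example on this page, and the Gemini path stops at its `/v1beta/models/` prefix because the table elides the rest.

```python
# Hypothetical lookup built from the base-URL table; not an official helper.
ROOT = "https://app.directinference.com/di"

# Base URL to hand to each client family (labels are made up for this sketch).
BASE_URLS = {
    "openai": f"{ROOT}/v1",            # openai, LangChain, LiteLLM, @ai-sdk/openai
    "anthropic": ROOT,                 # anthropic, @anthropic-ai/sdk, @ai-sdk/anthropic
    "gemini": ROOT,                    # google-genai, @google/genai
    "ai-sdk-google": f"{ROOT}/v1beta", # Vercel @ai-sdk/google
}

# Path the client adds on top of the base URL, where this page states it.
APPENDED_PATHS = {
    "openai": "/chat/completions",  # as in the curl example
    "anthropic": "/v1/messages",
    "gemini": "/v1beta/models/",    # the remainder is elided in the table
}


def request_url_prefix(family: str) -> str:
    """Return the base URL plus the path the client appends, if known."""
    return BASE_URLS[family] + APPENDED_PATHS.get(family, "")


for family in BASE_URLS:
    print(f"{family:14} {request_url_prefix(family)}")
```

Both Gemini rows end up under the same `/di/v1beta` prefix. The only difference is whether the client or the base URL supplies the `/v1beta` segment, which is why a doubled or missing `/v1beta` is the first thing to check if a Gemini client fails after the switch.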

The OpenAI-compatible swap in full, shown in Python, JavaScript, curl, and Go. The diff is two lines, the key and the base URL:

from openai import OpenAI

client = OpenAI(
    api_key="llm_live_...",                            # your DirectInference key
    base_url="https://app.directinference.com/di/v1",  # the only required change
)

# Everything below is untouched — same model id, same parameters.
resp = client.chat.completions.create(
    model="gpt-5.5-mini",
    messages=[{"role": "user", "content": "Ship it."}],
)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "llm_live_...",                            // your DirectInference key
  baseURL: "https://app.directinference.com/di/v1", // the only required change
});

// Everything below is untouched — same model id, same parameters.
const resp = await client.chat.completions.create({
  model: "gpt-5.5-mini",
  messages: [{ role: "user", content: "Ship it." }],
});

# Point the host at DirectInference and send your key. Nothing else changes.
curl https://app.directinference.com/di/v1/chat/completions \
  -H "Authorization: Bearer llm_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5-mini",
    "messages": [{ "role": "user", "content": "Ship it." }]
  }'

client := openai.NewClient(
  option.WithAPIKey("llm_live_..."),                            // your DirectInference key
  option.WithBaseURL("https://app.directinference.com/di/v1"), // the only required change
)

// Everything below is untouched — same model id, same parameters.
resp, err := client.Chat.Completions.New(context.TODO(), openai.ChatCompletionNewParams{
  Model: "gpt-5.5-mini",
  Messages: []openai.ChatCompletionMessageParamUnion{
    openai.UserMessage("Ship it."),
  },
})

Per-surface details (auth headers, streaming, tools, documents) live under API surfaces.

What stays the same

The migration is deliberately boring. None of your application logic has to move.

What changes for the better

Unlike a transparent service, which exposes the model it picked, DirectInference hides that choice, and that is the point: you stop managing models and start getting outcomes.

A typical service or provider	DirectInference
Returns the model it picked; you keep tracking model slugs.	Echoes your id back. The model stays hidden and can change at any time — your code never does.
You build model-selection logic or assemble a model pool.	Nothing to build. Each call is classified from its shape automatically.
A 0–10 cost-vs-quality slider to tune.	One optional effort knob (auto by default). Capability needs always win.
Pin a session to a model to preserve its cache.	The endpoint adapts per request; caching works with no session to pin.
A separate key and bill per provider.	One key, one balance, across every surface.
Maintain an allowlist of valid model names.	Any id resolves — legacy, renamed, or not-yet-released — and never 404s.

Migration checklist

Issue a key on the API Keys page.
Point your client’s base URL at the surface you use (table above).
Swap in the new key; keep every model id your app already sends.
Optional: set X-DI-Effort to bias cost vs. quality — see Effort.
Optional: set X-Title to segment usage by app — see Usage & analytics.
Set a spend cap and you’re live.
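The two optional headers from the checklist can be attached to any request. The following is a minimal sketch using only the Python standard library, assuming the OpenAI-compatible surface and the Bearer auth shown in the curl example. `build_headers` and `build_request` are hypothetical helper names, `"billing-worker"` is a made-up app label, and `"auto"` is the only effort value this page names; see Effort for the accepted values.

```python
import json
import urllib.request

DI_CHAT_URL = "https://app.directinference.com/di/v1/chat/completions"


def build_headers(api_key, effort=None, title=None):
    """Required auth headers, plus the optional ones only when provided."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if effort is not None:
        headers["X-DI-Effort"] = effort  # optional: bias cost vs. quality
    if title is not None:
        headers["X-Title"] = title       # optional: segment usage by app
    return headers


def build_request(api_key, model, prompt, effort=None, title=None):
    """Build (but do not send) a chat-completions request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        DI_CHAT_URL,
        data=body,
        headers=build_headers(api_key, effort, title),
        method="POST",
    )


req = build_request(
    "llm_live_...",            # your DirectInference key
    "gpt-5.5-mini",            # keep the model id your app already sends
    "Ship it.",
    effort="auto",             # the default named on this page
    title="billing-worker",    # hypothetical app label
)
# Sending requires a real key:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

With an SDK client, the same headers would normally be set once through the client's default-headers option rather than per request.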