# Direct Inference

> Direct Inference is a zero-knowledge inference endpoint. Point an OpenAI-,
> Anthropic-, or Gemini-compatible client at one base URL, keep the model id your
> app already sends, and every request is classified by its shape and fulfilled on
> a capable path. The response echoes your model id back; which model, provider,
> or version served the request stays hidden. Only the request type is exposed.

Base URL: https://app.directinference.com/di/v1

Auth: send your Direct Inference API key as the SDK's API key / Bearer token.

## Quickstart

OpenAI-compatible (Python):

    from openai import OpenAI

    client = OpenAI(
        base_url="https://app.directinference.com/di/v1",
        api_key="YOUR_DIRECT_INFERENCE_KEY",
    )
    resp = client.chat.completions.create(
        model="gpt-5.5-mini",  # keep your own model id; it is echoed back
        messages=[{"role": "user", "content": "Summarize this thread."}],
    )

The same base URL also accepts the Anthropic Messages shape and the Gemini
generateContent shape. Streaming, tool use, vision, PDFs, and structured output
all pass through.

## Request types (classified from the request shape)

- vision: image content in the request -> a vision-capable model.
- document: PDF or file input -> document-capable handling.
- long: input beyond the standard context window -> a long-context path.
- code: tool definitions, diffs, stack traces, repo paths -> coding/tool strength.
- json: a response/output JSON schema is set -> a schema-reliable model.
- reason: multi-step reasoning in the prompt -> a reasoning model.
- flash: simple request at low effort -> fast and cheap.
- pro: everything else (default) -> a strong all-rounder.

Capability outranks the model name: a PDF or image sent to a "mini" id still gets
a capable model. Unknown, legacy, and future ids resolve instead of erroring.

## Effort (optional cost/quality hint)

Send the X-DI-Effort header or an ?effort= query param. Levels: minimal, low, medium (default), high, xhigh.
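
The two ways of sending the effort hint can be sketched with the standard library alone. This is a minimal illustration, not part of any Direct Inference SDK: the helper names `check_effort`, `effort_headers`, and `effort_url` are hypothetical, and only the header name, the query parameter name, and the five levels come from the text above. The documentation does not say what happens when both forms are sent together, so the sketch assumes you pick one.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Levels as listed in this document; "medium" is the documented default.
EFFORT_LEVELS = ("minimal", "low", "medium", "high", "xhigh")


def check_effort(level: str) -> str:
    """Reject values outside the documented levels before sending anything."""
    if level not in EFFORT_LEVELS:
        raise ValueError(f"unknown effort level: {level!r}")
    return level


def effort_headers(level: str) -> dict:
    """Header form of the hint: X-DI-Effort: <level>."""
    return {"X-DI-Effort": check_effort(level)}


def effort_url(url: str, level: str) -> str:
    """Query form of the hint: add or overwrite ?effort=<level> on a URL."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query["effort"] = check_effort(level)
    return urlunsplit(parts._replace(query=urlencode(query)))
```

With the OpenAI SDK, the header form plugs into the per-request `extra_headers` argument, for example `extra_headers=effort_headers("high")`. The query form is more likely to be useful with a raw HTTP client, where you build the request URL yourself.
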
Effort tunes the serving choice; request shape still decides the needed capability.

    resp = client.chat.completions.create(
        model="gpt-5.5",
        messages=[{"role": "user", "content": "Plan a database migration."}],
        extra_headers={"X-DI-Effort": "high"},
    )

## Links

- Product: https://directinference.com/
- Why Direct Inference (zero-knowledge vs. transparent routers): https://directinference.com/why
- Developers (quickstart, request types, compatibility): https://directinference.com/developers
- Pricing: https://directinference.com/pricing
- Security: https://directinference.com/security
- Portal (create an API key): https://app.directinference.com
- Full machine-readable docs: https://directinference.com/llms-full.txt