Getting started
Your first API call
ezrouter is a unified API that routes requests to many large language models — claude, gpt, deepseek, glm, kimi, and others — behind a single endpoint and a single API key. The API is compatible with the OpenAI and Anthropic SDKs: point your existing client at our base URL and most code works unchanged.
Configuration
| Parameter | Value |
|---|---|
| Base URL (OpenAI-compatible) | https://www.ezrouter.dev/v1 |
| Base URL (Anthropic-compatible) | https://www.ezrouter.dev/anthropic/v1 |
| Auth header (OpenAI) | Authorization: Bearer ${API_KEY} |
| Auth header (Anthropic) | x-api-key: ${API_KEY} |
| API key source | Generate from dashboard |
| Model parameter | Any model ID from GET /v1/models |
The catalog is gateway-global — every key sees the same models. As of this writing the catalog includes claude (haiku / sonnet / opus 4.x), gpt-5.x, deepseek-v4 (flash / pro), glm-5.1, and kimi-k2.6. Always query /v1/models for the live list rather than hard-coding model IDs.
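As a minimal sketch of querying the live catalog instead of hard-coding IDs, the snippet below calls GET /v1/models with the standard library. The helper names `extract_model_ids` and `fetch_model_ids` are hypothetical, and the response shape (`{"data": [{"id": ...}]}`) is an assumption borrowed from the OpenAI list format; check the List models reference for the actual payload.

```python
import json
import urllib.request

# OpenAI-compatible base URL from the Configuration table.
BASE_URL = "https://www.ezrouter.dev/v1"


def extract_model_ids(payload: dict) -> list[str]:
    """Return sorted model IDs from a parsed /v1/models response.

    Assumption: the gateway uses the OpenAI list shape,
    {"data": [{"id": "..."}, ...]}. Entries without an "id" are skipped.
    """
    return sorted(item["id"] for item in payload.get("data", []) if "id" in item)


def fetch_model_ids(api_key: str, timeout: float = 30.0) -> list[str]:
    """GET /v1/models with Bearer auth and return the model IDs."""
    request = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_model_ids(json.load(response))
```

Calling `fetch_model_ids(os.environ["EZROUTER_API_KEY"])` at startup and validating a configured model name against the result is one way to fail early when an ID is retired.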
Pick a surface
ezrouter exposes two parallel API surfaces against the same model catalog. Choose based on what client you already have:
| If you currently use… | Pick this surface | Why |
|---|---|---|
| OpenAI Python / Node SDK | OpenAI-compatible (/v1/...) | Drop-in; override base_url only. |
| Anthropic SDK or Claude Code | Anthropic-compatible (/anthropic/v1/...) | Drop-in; preserves Anthropic-native semantics including tool use. |
| Nothing yet | Either | OpenAI-compat is broader (covers all models in the catalog). Anthropic-compat is more reliable for tool use on claude models. See Notable behaviors below. |
You can mix and match — the same API key works on both surfaces.
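Because one key works on both surfaces, the only per-surface differences a client has to track are the base URL and the auth header. The sketch below captures the values from the Configuration table in one place; `BASE_URLS`, `auth_headers`, and `client_settings` are hypothetical names used for illustration, not part of any SDK.

```python
# Base URLs copied from the Configuration table.
BASE_URLS = {
    "openai": "https://www.ezrouter.dev/v1",
    "anthropic": "https://www.ezrouter.dev/anthropic/v1",
}


def auth_headers(surface: str, api_key: str) -> dict[str, str]:
    """Return the auth header for a surface, per the Configuration table."""
    if surface == "openai":
        return {"Authorization": f"Bearer {api_key}"}
    if surface == "anthropic":
        return {"x-api-key": api_key}
    raise ValueError(f"unknown surface: {surface!r}")


def client_settings(surface: str, api_key: str) -> dict:
    """Bundle the base URL and headers needed to call one surface."""
    return {
        "base_url": BASE_URLS[surface],
        "headers": auth_headers(surface, api_key),
    }
```

The same `api_key` value is passed for either surface; only the header name and format change.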
Make your first request
A minimal chat completion against claude-sonnet-4-6, via the OpenAI-compatible surface:
curl
curl https://www.ezrouter.dev/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${EZROUTER_API_KEY}" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ]
  }'
Python (OpenAI SDK)
# pip install openai
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["EZROUTER_API_KEY"],
    base_url="https://www.ezrouter.dev/v1",
)
response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(response.choices[0].message.content)
Node.js (OpenAI SDK)
// npm install openai
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "https://www.ezrouter.dev/v1",
  apiKey: process.env.EZROUTER_API_KEY,
});
const completion = await openai.chat.completions.create({
  model: "claude-sonnet-4-6",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Hello!" },
  ],
});
console.log(completion.choices[0].message.content);
For Anthropic-SDK examples and the Anthropic surface, see the ezrouter API reference.
Notable behaviors
Before integrating in production, read these gateway-wide behaviors — they differ from a stock OpenAI or Anthropic deployment in ways that will trip up naive clients:
- The OpenAI surface always returns SSE. Even without stream: true, responses arrive as data: {...} chunks terminated by data: [DONE]. Clients calling response.json() directly will fail; use an SSE parser. See POST /v1/chat/completions → Differences from OpenAI.
- No 429 or rate-limit headers. The gateway uses silent upstream queueing under load instead of returning explicit backpressure. Build clients with timeouts and latency-based backoff, not 429-retry handlers. See Rate limits.
- **Tool-call extraction is reliable on claude-opus-4-7 via the OpenAI surface today; for other models, prefer the Anthropic surface for agent applications.** See POST /v1/chat/completions → Tool calls.
- Error envelope shape is ezrouter-specific and uses three distinct error.type taxonomies. See Error codes.
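The first behavior in the list above (the OpenAI surface always returning SSE) can be handled with a small parser. This is a sketch under stated assumptions, not the gateway's reference client: it assumes one JSON object per `data:` line and the OpenAI streaming chunk shape (`choices[0].delta.content`), and the names `iter_sse_payloads` and `collect_text` are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_payloads(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the parsed JSON from each `data: {...}` line until `data: [DONE]`.

    Assumption: each event carries a single-line JSON payload.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # terminator: stop reading, ignore anything after it
        yield json.loads(data)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate assistant text from the first choice of every chunk.

    Assumption: chunks follow the OpenAI streaming shape, with text under
    choices[0].delta.content.
    """
    parts: list[str] = []
    for chunk in iter_sse_payloads(lines):
        choices = chunk.get("choices") or []
        if not choices:
            continue
        content = (choices[0].get("delta") or {}).get("content")
        if content:
            parts.append(content)
    return "".join(parts)
```

Any source of decoded text lines can be passed in, for example an HTTP response body iterated line by line and decoded as UTF-8. Pair the read with a request timeout, since the gateway queues silently under load rather than returning 429.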
Where to go next
| Goal | Read |
|---|---|
| Understand all POST /v1/chat/completions parameters | Create chat completion |
| List the model catalog from code | List models |
| Wire up an agent / coding assistant | Agent integrations |
| Understand error envelopes | Error codes |
| Understand capacity and throttling | Rate limits |
| Pricing | Pricing |
| Track per-end-user usage | Rate limits → user_id |