Cookbook
Chat completion in Python
ezrouter does not ship its own Python SDK. Use the official openai package with a base_url override.
Install
pip install openai
Minimal example
import os
from openai import OpenAI
# Point the official SDK at the ezrouter gateway via base_url.
client = OpenAI(
    api_key=os.environ["EZROUTER_API_KEY"],
    base_url="https://www.ezrouter.dev/v1",
)
response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
# The reply text lives on the first choice.
print(response.choices[0].message.content)
Set EZROUTER_API_KEY in your environment to a key from the dashboard.
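Reading the key with os.environ[...] raises a bare KeyError when the variable is unset. The sketch below shows one way to fail with a clearer message instead. The helper name load_api_key is hypothetical and not part of ezrouter or the openai package; only the EZROUTER_API_KEY variable name comes from this page.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the ezrouter key, or raise a readable error if it is unset or blank.

    Hypothetical helper for illustration; `env` is injectable so it can be tested.
    """
    key = env.get("EZROUTER_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "EZROUTER_API_KEY is not set; create a key in the dashboard and export it."
        )
    return key
```

With this in place, the client could be built with api_key=load_api_key() instead of indexing os.environ directly.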
Streaming
ezrouter always returns SSE on the chat-completions endpoint. The openai SDK abstracts this for you when you pass stream=True:
stream = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Count to five."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
When you call without stream=True, the SDK consumes the SSE stream internally and assembles the final response object, so you get non-streaming ergonomics on top of the always-streaming surface.
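To make the SSE framing concrete, here is a minimal sketch of how streamed deltas can be stitched into one string. It assumes ezrouter's frames follow the OpenAI chat-completions chunk format ("data: {json}" lines ending with "data: [DONE]"), which this page implies but does not spell out. The function name assemble_sse and the sample frames are made up for illustration; this is not the SDK's actual implementation.

```python
import json
from typing import Iterable


def assemble_sse(lines: Iterable[str]) -> str:
    """Concatenate delta.content from OpenAI-style SSE 'data:' lines (illustrative only)."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel in the OpenAI format
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            if choice.get("index", 0) != 0:
                continue  # only follow the first choice
            text = (choice.get("delta") or {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts)


# Hand-written sample frames, not captured from a real response.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "1, 2"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": ", 3"}}]}',
    "",
    "data: [DONE]",
]
print(assemble_sse(sample))  # prints: 1, 2, 3
```

In practice the SDK handles this parsing for you; the sketch is only useful for debugging raw responses or for clients written without the SDK.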
Multi-turn
Append each completed turn to a running messages list:
history = [
    {"role": "system", "content": "You are a helpful assistant."},
]
def ask(user_input: str) -> str:
    history.append({"role": "user", "content": user_input})
    resp = client.chat.completions.create(
        model="claude-sonnet-4-6",
        messages=history,
    )
    reply = resp.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply
print(ask("What is 2+2?"))
print(ask("And 3+3?"))
Reading usage
The OpenAI SDK exposes the standard token counts on resp.usage; ezrouter extensions hang off the same object in dict-like form:
resp = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.usage.prompt_tokens, resp.usage.completion_tokens)
# Cached portion, when present
cached = getattr(resp.usage, "prompt_tokens_details", None)
if cached:
    print("cached:", cached.cached_tokens)
The Anthropic-aliased output_tokens field on usage is unreliable on this surface (often reads 0); read completion_tokens instead.
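If you log usage per request, it can help to flatten these fields into one record. The following is a sketch only: summarize_usage is a hypothetical helper, it deliberately ignores output_tokens for the reason given above, and the "total" it reports is simply the sum of the two standard counts rather than a value returned by ezrouter.

```python
from types import SimpleNamespace


def summarize_usage(usage) -> dict:
    """Flatten a usage object into a plain dict (hypothetical helper).

    Reads completion_tokens rather than output_tokens, and treats a missing
    prompt_tokens_details or cached_tokens as zero cached tokens.
    """
    details = getattr(usage, "prompt_tokens_details", None)
    cached = getattr(details, "cached_tokens", None) or 0
    return {
        "prompt": usage.prompt_tokens,
        "cached": cached,
        "completion": usage.completion_tokens,
        "total": usage.prompt_tokens + usage.completion_tokens,  # computed locally
    }


# Stand-in object for demonstration; a real call would pass resp.usage.
fake_usage = SimpleNamespace(
    prompt_tokens=12,
    completion_tokens=5,
    prompt_tokens_details=SimpleNamespace(cached_tokens=8),
)
print(summarize_usage(fake_usage))
```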
Error handling
Catch openai.APIError and switch on the HTTP status. The error envelope ezrouter returns is documented in error codes. Do not retry on 429: the gateway does not emit it. You may see a 5xx during a gateway redeploy, and that is the correct retry target.
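The retry policy described above can be written down as two small pure functions, which makes it easy to unit-test without touching the network. This is a sketch under stated assumptions: the names should_retry and backoff_delays are hypothetical, and the doubling delay starting at one second is an illustrative choice rather than something ezrouter prescribes.

```python
from typing import List


def should_retry(status_code: int) -> bool:
    """Retry only on 5xx; 429 and other 4xx statuses are never retried."""
    return 500 <= status_code < 600


def backoff_delays(max_attempts: int = 5, base: float = 1.0) -> List[float]:
    """Exponential backoff schedule in seconds: base, 2*base, 4*base, ..."""
    return [base * 2 ** attempt for attempt in range(max_attempts)]


print(should_retry(503), should_retry(429))  # prints: True False
print(backoff_delays())  # prints: [1.0, 2.0, 4.0, 8.0, 16.0]
```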
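import time
from openai import APIError, APIConnectionError
def safe_complete(**kwargs):
    for attempt in range(5):
        try:
            return client.chat.completions.create(**kwargs)
        except APIConnectionError:
            # Network-level failure: back off exponentially, then retry.
            time.sleep(2 ** attempt)
        except APIError as e:
            # Not every APIError carries status_code, so read it defensively.
            status = getattr(e, "status_code", None)
            if status is not None and 500 <= status < 600:
                time.sleep(2 ** attempt)
                continue
            raise  # anything that is not a 5xx is not retried
    raise RuntimeError("exceeded retry budget")
Anthropic surface alternative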
For Claude models with extended thinking or prompt caching, the Anthropic surface offers a richer feature set. See the anthropic-api guide.
Next steps
- Node.js example: same call from JavaScript.
- curl example: bare-metal HTTP without an SDK.
- API reference: every parameter explained.