Tool calls

Tool calls let a model invoke external functions you define — fetching data, calling APIs, running code on your side. You describe the available tools in the tools parameter; the model returns a structured request to call one; your client runs the function and sends the result back as a tool message; the model produces a natural-language reply.

The OpenAI surface and Anthropic surface both accept tool definitions. Reliability across models is uneven — read the matrix below before designing around tool calls on ezrouter.

Per-model reliability

A probe run on 2026-05-27 sent the same get_weather tool spec to every catalog model on the OpenAI surface. It recorded whether the response set finish_reason: tool_calls and whether it carried a parseable tool_calls delta:

Model	Surface	`finish_reason: tool_calls`	Parseable tool_call delta
`claude-opus-4-7`	OpenAI	✓	✓
`claude-sonnet-4-6`	OpenAI	✓	✗ (delta missing)
`claude-haiku-4-5`	OpenAI	✓	✗ (delta missing)
`deepseek-v4-pro`	OpenAI	✓	✗ (delta missing)
`deepseek-v4-flash`	OpenAI	✓	✗ (delta missing)
`glm-5.1`	OpenAI	✓	✗ (delta missing)
`kimi-k2.6`	OpenAI	(no)	n/a

Only claude-opus-4-7 reliably emits both the finish_reason: tool_calls signal and a parseable delta.tool_calls payload on the OpenAI surface. The other five claude, deepseek, and glm models set finish_reason without emitting the structured tool_calls field, so your client sees that the model "wants to call a tool" but cannot tell which one. kimi-k2.6 did not set finish_reason: tool_calls at all in the probe.

This is tracked as a gateway bug (GW-001, critical). Until the backend ships a fix:

**For tool-using agents, use claude-opus-4-7 on the OpenAI surface, or use any claude model on the Anthropic surface.**

The Anthropic surface uses Anthropic's native tool_use content blocks, which round-trip cleanly across the claude family.

OpenAI-surface example

This example assumes claude-opus-4-7. Substituting other models is not recommended until GW-001 is fixed.

python

from openai import OpenAI
import os, json

client = OpenAI(
    api_key=os.environ["EZROUTER_API_KEY"],
    base_url="https://www.ezrouter.dev/v1",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather at a location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and state, e.g. San Francisco, CA",
                },
            },
            "required": ["location"],
        },
    },
}]

def call_weather_api(location: str) -> str:
    # In a real app, call your weather service here.
    return "24C, partly cloudy"

messages = [{"role": "user", "content": "How's the weather in Hangzhou?"}]

resp = client.chat.completions.create(
    model="claude-opus-4-7",
    messages=messages,
    tools=tools,
)
msg = resp.choices[0].message

if msg.tool_calls:
    messages.append(msg)  # the assistant's tool-call message

    # Answer every tool call in the turn: each one needs its own "tool" message.
    for tool_call in msg.tool_calls:
        args = json.loads(tool_call.function.arguments)
        result = call_weather_api(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": result,
        })

    final = client.chat.completions.create(
        model="claude-opus-4-7",
        messages=messages,
        tools=tools,
    )
    print(final.choices[0].message.content)
else:
    # No tool was requested; the model answered directly.
    print(msg.content)

The execution flow:

1. User: "How's the weather in Hangzhou?"
2. Model: returns a structured request to call get_weather(location="Hangzhou").
3. Client: runs the function and appends the result to messages as a role: "tool" entry.
4. Model: receives the tool result and produces the natural-language reply.

The function itself is your code — the model does not execute anything. It only emits the structured request describing what it wants.

Anthropic-surface example

For tool use across the claude family, the Anthropic surface is more reliable:

python

import anthropic, os

client = anthropic.Anthropic(
    base_url="https://www.ezrouter.dev/anthropic",
    api_key=os.environ["EZROUTER_API_KEY"],
)

tools = [{
    "name": "get_weather",
    "description": "Get the current weather at a location.",
    "input_schema": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
}]

resp = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "How's the weather in Hangzhou?"}],
)

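# call_weather_api is the stub defined in the OpenAI-surface example.
tool_results = []
for block in resp.content:
    if block.type == "tool_use":
        result = call_weather_api(**block.input)
        # Each result goes back as a tool_result block keyed by the tool_use id.
        tool_results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": result,
        })

if tool_results:
    final = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        tools=tools,
        messages=[
            {"role": "user", "content": "How's the weather in Hangzhou?"},
            {"role": "assistant", "content": resp.content},
            {"role": "user", "content": tool_results},
        ],
    )
    # Print only the text blocks of the final reply.
    print("".join(b.text for b in final.content if b.type == "text"))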

The Anthropic surface uses input_schema (not parameters) on the tool definition, and emits a tool_use content block in the response instead of a separate tool_calls array. The model loop is otherwise the same.
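Because the two definition shapes differ only mechanically, one tool list can serve both surfaces. The following is a minimal sketch of that conversion; the helper name openai_tool_to_anthropic is hypothetical and not part of either SDK.

```python
def openai_tool_to_anthropic(tool: dict) -> dict:
    """Convert one OpenAI-surface tool definition to the Anthropic-surface shape.

    Hypothetical helper for illustration; not provided by either SDK.
    """
    fn = tool["function"]
    converted = {
        "name": fn["name"],
        # OpenAI names the JSON schema "parameters"; Anthropic names it "input_schema".
        "input_schema": fn.get("parameters", {"type": "object", "properties": {}}),
    }
    if "description" in fn:
        converted["description"] = fn["description"]
    return converted


openai_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather at a location.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}

anthropic_tool = openai_tool_to_anthropic(openai_tool)
```

The result can be passed in the tools list of a messages.create call in place of a hand-written definition.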

`tool_choice`

Control whether the model is required to call a tool:

Value	Behavior
`"none"`	Disable tools for this call; force a text reply.
`"auto"`	Default. Model decides whether to call a tool.
`"required"` (OpenAI) / `"any"` (Anthropic)	Model must call some tool.
`{"type": "function", "function": {"name": "X"}}` (OpenAI) / `{"type": "tool", "name": "X"}` (Anthropic)	Force the model to call a specific tool.

The Anthropic surface's disable_parallel_tool_use modifier on tool_choice is ignored by the gateway; design for the possibility of multiple tool calls per turn.
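If the same agent code targets both surfaces, the table above can be captured in one small helper. This is a sketch under assumptions: build_tool_choice and its mode names are hypothetical, and the Anthropic values are written in Anthropic's native object form ({"type": "any"} and so on), which the gateway is assumed to accept as-is.

```python
from typing import Optional, Union


def build_tool_choice(surface: str, mode: str,
                      tool_name: Optional[str] = None) -> Union[str, dict]:
    """Return a tool_choice value for the given surface.

    mode is one of "none", "auto", "required", or "tool" (force tool_name).
    Hypothetical helper; the Anthropic object shapes are an assumption about
    what the gateway accepts, based on Anthropic's native API.
    """
    if mode not in {"none", "auto", "required", "tool"}:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "tool" and not tool_name:
        raise ValueError("mode 'tool' needs a tool_name")

    if surface == "openai":
        if mode == "tool":
            return {"type": "function", "function": {"name": tool_name}}
        return mode  # "none", "auto", and "required" are plain strings
    if surface == "anthropic":
        if mode == "tool":
            return {"type": "tool", "name": tool_name}
        # Anthropic spells "required" as "any".
        return {"type": "any" if mode == "required" else mode}
    raise ValueError(f"unknown surface: {surface!r}")
```

For example, build_tool_choice("openai", "tool", "get_weather") produces the forced-function form from the last table row.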

Strict mode

The OpenAI surface accepts strict: true on a tool's function object, which asks the model to emit arguments that exactly match the JSON schema. ezrouter forwards the flag to upstream providers that support it (modern claude models, recent openai-family, deepseek-family). Behavior on older upstreams, or on upstreams that do not support the flag, is undocumented; validate client-side regardless.

For mission-critical schemas, do both: pass strict: true, and validate the parsed arguments against a JSON schema (or pydantic) before invoking your function.
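The client-side half of that advice might look like the sketch below. It uses only the standard library and checks required keys, unexpected keys, and primitive types; it is a deliberately small stand-in for a full JSON Schema validator or a pydantic model, and the names validate_arguments and WEATHER_SCHEMA are hypothetical.

```python
import json

# Map JSON Schema primitive type names to Python types (subset only).
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}

WEATHER_SCHEMA = {
    "type": "object",
    "properties": {"location": {"type": "string"}},
    "required": ["location"],
}


def validate_arguments(raw_arguments: str, schema: dict) -> dict:
    """Parse tool-call arguments and check them against a flat object schema.

    Raises ValueError on any mismatch (malformed JSON raises
    json.JSONDecodeError, which is a ValueError subclass).
    """
    args = json.loads(raw_arguments)
    if not isinstance(args, dict):
        raise ValueError("tool arguments must be a JSON object")

    properties = schema.get("properties", {})
    missing = [k for k in schema.get("required", []) if k not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    unknown = [k for k in args if k not in properties]
    if unknown:
        raise ValueError(f"unexpected arguments: {unknown}")

    for key, value in args.items():
        expected = _JSON_TYPES.get(properties[key].get("type"))
        if expected is None:
            continue  # type not declared or not covered by this sketch
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        if isinstance(value, bool) and expected is not bool:
            raise ValueError(f"argument {key!r} has the wrong type")
        if not isinstance(value, expected):
            raise ValueError(f"argument {key!r} has the wrong type")
    return args
```

Call it on tool_call.function.arguments before dispatch and treat a ValueError as a reason to retry or reject the call.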

Multi-turn tool sessions

The model can call multiple tools across multiple turns. After each tool result, send the full message history (including the assistant's tool-call message and the result) back to the model. It may decide to call another tool, or to produce the final answer.

Bound the loop defensively: stop after N consecutive tool calls without a final text reply. Without a bound, a misbehaving prompt can loop forever.
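One way to express such a bound is sketched below. The model call and the tool runner are injected as plain callables so the loop itself stays independent of any SDK; run_tool_loop, fake_model, and the max_tool_rounds default are all hypothetical, and messages are plain dicts in the OpenAI-surface shape.

```python
def run_tool_loop(call_model, run_tool, messages, max_tool_rounds=5):
    """Drive a multi-turn tool session with a hard bound on tool rounds.

    call_model(messages) -> assistant message dict
    run_tool(tool_call)  -> str result for one tool call
    Returns the final text reply, or raises RuntimeError if the model is
    still requesting tools after max_tool_rounds rounds.
    """
    messages = list(messages)  # do not mutate the caller's history
    for _ in range(max_tool_rounds):
        reply = call_model(messages)
        messages.append(reply)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            return reply.get("content") or ""
        for tool_call in tool_calls:
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call["id"],
                "content": run_tool(tool_call),
            })
    raise RuntimeError(f"no final reply after {max_tool_rounds} tool rounds")


def fake_model(messages):
    # Stand-in for a real chat completion call: requests a tool once, then answers.
    if any(m["role"] == "tool" for m in messages):
        return {"role": "assistant", "content": "It is 24C and partly cloudy."}
    return {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather",
                         "arguments": '{"location": "Hangzhou"}'},
        }],
    }


answer = run_tool_loop(
    fake_model,
    lambda tool_call: "24C, partly cloudy",
    [{"role": "user", "content": "How's the weather in Hangzhou?"}],
)
```

In a real client, call_model would wrap the chat completion request and convert the returned message to a dict.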

Parallel tool calls

A single assistant turn may contain multiple tool_calls entries (OpenAI surface) or multiple tool_use blocks (Anthropic surface). Your client should iterate over all of them and append one role: "tool" message per call before the next request.
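For the OpenAI surface, that iteration can be sketched as a small dispatch function. The names answer_tool_calls and registry are hypothetical, and the tool calls are shown as plain dicts rather than SDK objects.

```python
import json


def answer_tool_calls(tool_calls, registry):
    """Run every tool call in one assistant turn.

    registry maps tool names to Python callables. Returns one
    role: "tool" message per call, in the same order as tool_calls.
    """
    tool_messages = []
    for tool_call in tool_calls:
        func = registry[tool_call["function"]["name"]]
        args = json.loads(tool_call["function"]["arguments"])
        tool_messages.append({
            "role": "tool",
            "tool_call_id": tool_call["id"],  # ties the result to its request
            "content": str(func(**args)),
        })
    return tool_messages


def get_weather(location: str) -> str:
    # Stub standing in for a real weather service.
    return f"24C, partly cloudy in {location}"


# Two calls requested in a single assistant turn.
parallel_calls = [
    {"id": "call_a", "type": "function",
     "function": {"name": "get_weather", "arguments": '{"location": "Hangzhou"}'}},
    {"id": "call_b", "type": "function",
     "function": {"name": "get_weather", "arguments": '{"location": "Shanghai"}'}},
]

tool_messages = answer_tool_calls(parallel_calls, {"get_weather": get_weather})
```

Extend messages with the returned list after appending the assistant's tool-call message, then send the next request.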

Common failure modes

tool_calls is null but finish_reason == "tool_calls". You are on a non-opus claude/deepseek/glm model. Switch to claude-opus-4-7 or move to the Anthropic surface (GW-001).

Arguments do not parse as JSON. Models occasionally emit malformed JSON, especially without strict: true. Catch the parse error and either retry, or fall back to a regex extraction pass.

Model invents tool names. Validate tool_call.function.name against your tools list before dispatch.
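The three checks above can be combined into one guard that runs before any tool is dispatched. This is a sketch only: extract_tool_calls and ToolCallError are hypothetical names, and the function works on the dict form of a single OpenAI-surface choice (with the openai Python SDK, resp.choices[0].model_dump() should produce that shape).

```python
import json


class ToolCallError(Exception):
    """Raised when a response cannot be safely turned into tool calls."""


def extract_tool_calls(choice: dict, known_tools: set) -> list:
    """Return [(call_id, tool_name, parsed_args), ...] or raise ToolCallError."""
    message = choice.get("message") or {}
    tool_calls = message.get("tool_calls") or []

    # Failure mode 1: the finish_reason says "tool_calls" but nothing is attached.
    if choice.get("finish_reason") == "tool_calls" and not tool_calls:
        raise ToolCallError(
            "finish_reason is tool_calls but the tool_calls payload is missing")

    parsed = []
    for tool_call in tool_calls:
        name = tool_call["function"]["name"]
        # Failure mode 3: the model named a tool that was never offered.
        if name not in known_tools:
            raise ToolCallError(f"model requested unknown tool {name!r}")
        # Failure mode 2: the arguments are not valid JSON.
        try:
            args = json.loads(tool_call["function"]["arguments"])
        except json.JSONDecodeError as exc:
            raise ToolCallError(f"arguments for {name!r} are not valid JSON") from exc
        parsed.append((tool_call["id"], name, args))
    return parsed
```

A caller can catch ToolCallError in one place and decide whether to retry, switch models, or surface the error.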

JSON mode: structured output without function calling.

Thinking mode: combining thinking with tool calls.

API reference: the tools parameter shape.