OpenAI Tool Call Arguments: Why They Break
Without strict: true, LLM tool call arguments can be malformed JSON. OpenAI's Structured Outputs (August 2024) solved this for OpenAI models — but multi-provider setups and open models still break. Here's what happens and how to handle it.
The current state of tool call JSON
OpenAI Structured Outputs (August 2024): Setting strict: true in your function definition guarantees that arguments match your JSON Schema. If you use only OpenAI models with strict mode, tool args are reliable.
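To make this concrete, here is a minimal sketch of a strict-mode function definition (the `get_weather` tool and its parameters are hypothetical, for illustration only). Strict mode imposes two requirements on the schema: every property must appear in "required", and objects must set "additionalProperties": false.

```python
# Hypothetical tool definition with strict mode enabled.
# Strict mode requires every property to be listed in "required"
# and every object to declare "additionalProperties": false.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "strict": True,
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city", "unit"],
            "additionalProperties": False,
        },
    },
}]
```

Note that under strict mode, optional fields are expressed as type unions with "null" rather than being left out of "required".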
Without strict mode, or with non-OpenAI models: arguments are best-effort. OpenAI's older docs warned: "the model does not always generate valid JSON." That warning applied to non-strict calls and remains true for open models (Llama, Mistral, Qwen) served through OpenRouter or similar aggregators.
The problem is real when you use multi-provider setups, open-weight models, or older function calling without strict mode. When you call json.loads(tool_call.function.arguments) in those cases, you're trusting the model to produce syntactically perfect JSON. It doesn't always.
What breaks in practice
Consider a tool definition for search_products with a moderately complex schema — nested filters, enums, optional fields. The model returns something that looks like JSON but isn't quite right.
tools = [{
    "type": "function",
    "function": {
        "name": "search_products",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "filters": {
                    "type": "object",
                    "properties": {
                        "category": {"type": "string", "enum": ["electronics", "clothing", "home"]},
                        "price_max": {"type": "number"},
                        "in_stock": {"type": "boolean"},
                    }
                },
                "sort_by": {"type": "string", "enum": ["relevance", "price", "rating"]},
            },
            "required": ["query"]
        }
    }
}]
# tool_call.function.arguments contains:
#   {"query": "wireless headphones",
#    "filters": {"category": "electronics", "price_max": 100, "in_stock": true,},  <- trailing comma
#    "sort_by": "rating",}                                                         <- trailing comma
import json

tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
# json.decoder.JSONDecodeError: Expecting property name enclosed in
# double quotes: line 1 column 95 (char 94)
Common malformations in tool call arguments include trailing commas, single-quoted strings, unquoted property names, truncated output (JSON cut off mid-key), and hallucinated parameters not in your schema.
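The syntax-level malformations are easy to reproduce. The hand-written strings below (illustrative examples, not captured model output) show how each one trips `json.loads`:

```python
import json

# Hand-written examples of the malformation types listed above
# (not real model output).
bad_examples = [
    '{"query": "laptop",}',        # trailing comma
    "{'query': 'laptop'}",         # single-quoted strings
    '{query: "laptop"}',           # unquoted property name
    '{"query": "laptop", "filt',   # truncated mid-key
]

# Every one of these raises json.JSONDecodeError
for s in bad_examples:
    try:
        json.loads(s)
    except json.JSONDecodeError as e:
        print(f"{s!r}: {e.msg}")
```

Hallucinated parameters are the odd one out: they are syntactically valid JSON, so they parse fine and only surface later, when you validate the arguments against your schema.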
When it's worse: non-OpenAI providers
OpenAI's own models (GPT-4o, GPT-4o-mini) have relatively low tool argument failure rates thanks to constrained decoding. But if you route through OpenRouter to use open-weight models, the numbers get much worse.
We tested 288 tool calls across 4 models using schemas of varying complexity. The results:
| Model | Tool calls tested | Malformed args |
|---|---|---|
| meta-llama/llama-3.3-70b-instruct | 72 | 71% |
| mistralai/mixtral-8x22b-instruct | 72 | 44% |
| qwen/qwen-2.5-72b-instruct | 72 | 28% |
| openai/gpt-4o-mini | 72 | 3% |
Failure rates are for complex schemas (nested objects, enums, arrays). See the full benchmark methodology and results.
Fix 1: Retry on parse failure
The simplest approach: if json.loads() fails on the tool arguments, retry the entire API call and hope the model generates valid JSON on the next attempt.
import json

from openai import OpenAI

client = OpenAI()

def call_with_tools(messages, tools, max_retries=3):
    for attempt in range(max_retries):
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools,
        )
        tool_calls = response.choices[0].message.tool_calls
        if not tool_calls:
            continue  # model answered without calling a tool; try again
        tool_call = tool_calls[0]
        try:
            args = json.loads(tool_call.function.arguments)
            return tool_call.function.name, args
        except json.JSONDecodeError:
            continue
    raise ValueError("Failed to get valid tool args after retries")
Fix 2: Manual JSON repair
Strip the most common syntax issues from the arguments string before parsing. This catches trailing commas, which account for a large share of failures.
import json
import re

def repair_tool_args(args_str: str) -> dict:
    """Attempt to fix common JSON issues in tool call arguments."""
    # Remove trailing commas before } or ]
    args_str = re.sub(r',\s*}', '}', args_str)
    args_str = re.sub(r',\s*]', ']', args_str)
    return json.loads(args_str)

# Usage
tool_call = response.choices[0].message.tool_calls[0]
try:
    args = json.loads(tool_call.function.arguments)
except json.JSONDecodeError:
    args = repair_tool_args(tool_call.function.arguments)
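Trailing commas are only one failure mode. The sketch below (best-effort only, not production-grade: regex-based repair can corrupt values that legitimately contain commas or brackets, and it cannot recover an argument truncated mid-key) extends the same idea to truncated output by closing an unterminated string and balancing any unclosed braces or brackets:

```python
import json
import re

def repair_truncated(args_str: str) -> dict:
    """Best-effort repair: strip trailing commas, then close any
    unterminated string and unbalanced brackets left by truncation."""
    s = re.sub(r',\s*([}\]])', r'\1', args_str)
    # Close an unterminated string (odd number of unescaped quotes)
    if len(re.findall(r'(?<!\\)"', s)) % 2 == 1:
        s += '"'
    # Track unclosed braces/brackets, ignoring any inside string values
    closers = []
    in_string = False
    prev = ''
    for ch in s:
        if ch == '"' and prev != '\\':
            in_string = not in_string
        elif not in_string:
            if ch == '{':
                closers.append('}')
            elif ch == '[':
                closers.append(']')
            elif ch in '}]' and closers:
                closers.pop()
        prev = ch
    s = re.sub(r',\s*$', '', s)  # drop a dangling comma at the cut point
    return json.loads(s + ''.join(reversed(closers)))
```

For anything beyond a quick patch, a dedicated JSON-repair library handles far more edge cases than hand-rolled regexes.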
Fix 3: Proxy-level repair with StreamFix
StreamFix sits between your code and the model provider. It intercepts tool_calls in the response and repairs the arguments JSON before it reaches your application. Works on both streaming and non-streaming responses.
from openai import OpenAI
import json

client = OpenAI(
    base_url="https://streamfix.dev/v1",
    api_key="sk_YOUR_STREAMFIX_KEY",
)

response = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    tools=[{
        "type": "function",
        "function": {
            "name": "search_products",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "filters": {"type": "object"},
                },
                "required": ["query"]
            }
        }
    }],
    messages=[{"role": "user", "content": "Find wireless headphones under $100"}],
)

# tool_call.function.arguments is guaranteed parseable
tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)  # ✅ always valid JSON
StreamFix repairs trailing commas, single-quoted strings, unquoted keys, truncated JSON, control characters, and other malformations. It works with any model — OpenAI, Llama, Mixtral, Qwen, DeepSeek — routed through OpenRouter or direct.
Stop tool calls from breaking your agent
When json.loads(tool_call.function.arguments) throws, your agent loop crashes. StreamFix repairs tool call arguments in-flight — one base_url change, zero code changes to your tool handling.
from openai import OpenAI
import json

client = OpenAI(
    base_url="https://streamfix.dev/v1",
    api_key="sk_YOUR_STREAMFIX_KEY",
)

# Your existing tool-calling code works unchanged.
# StreamFix ensures tool_call.function.arguments
# is always valid, parseable JSON.
response = client.chat.completions.create(
    model="openai/gpt-4o",
    tools=my_tools,
    messages=my_messages,
)

for tc in response.choices[0].message.tool_calls:
    args = json.loads(tc.function.arguments)  # ✅ guaranteed
    result = execute_tool(tc.function.name, args)