Reliability Layer for LLM Agents

Guaranteed JSON From Any LLM, Every Time

Repair syntax, coerce types, validate against your schema — automatically, mid-stream. One base_url change for agents, pipelines, and streaming UIs across every provider.

Before: raw LLM output (json.loads() fails)

```json
{"status": "shipped",
 "count": "3",
 "active": True,
 "items": ['a','b','c',],}
```

Problems: markdown fence, wrong type ("3" as a string), Python literal (True), single quotes, trailing commas.

After: StreamFix output (schema-valid)

```json
{"status": "shipped",
 "count": 3,
 "active": true,
 "items": ["a", "b", "c"]}
```

X-StreamFix-Applied: fence_strip, fix_single_quotes, fix_python_literals, remove_trailing_comma, type_coerce
X-StreamFix-Schema-Valid: true
33% → 99.5% strict parse • 336 tests • 8 models • → Read the benchmark

From Broken Output to Guaranteed Schema

Repair, coerce, validate, retry — all before your code sees the response.

Streaming Repair

JSON is fixed as it streams. Fences, <think> tags, trailing commas — stripped token-by-token via SSE. Auto-closes truncated streams.

Syntax + Literal Repair

Trailing commas, unquoted keys, single quotes, Python True/None, leading zeros — all fixed with sub-ms overhead.
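A rough feel for this repair family in plain Python: ast.literal_eval accepts single quotes, True/None, and trailing commas, so round-tripping through it recovers a subset of these errors. This is an illustrative sketch only, not StreamFix's repair engine (which also handles cases Python literals reject, such as unquoted keys and leading zeros).

```python
import ast
import json

# Raw model output: Python-literal syntax that json.loads rejects
raw = "{'status': 'shipped', 'active': True, 'items': ['a', 'b', 'c',],}"

def naive_repair(text: str) -> str:
    # ast.literal_eval tolerates single quotes, True/None, and trailing
    # commas; re-serializing the result yields strict JSON.
    return json.dumps(ast.literal_eval(text))

fixed = naive_repair(raw)
print(fixed)  # {"status": "shipped", "active": true, "items": ["a", "b", "c"]}
```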

Type Coercion

LLM returns "age": "30"? Auto-cast to 30 using your schema. Handles string-to-int, string-to-bool, float-to-int across nested structures.
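The idea can be sketched in a few lines (helper names assumed for illustration; not the service's internals): walk the value alongside the JSON Schema and cast leaves to the declared type.

```python
def coerce(value, schema):
    # Cast a parsed JSON value to the types its schema declares,
    # recursing through nested objects and arrays.
    t = schema.get("type")
    if t == "integer" and isinstance(value, (str, float)):
        return int(float(value))  # handles "30" and 30.0
    if t == "boolean" and isinstance(value, str):
        return value.lower() in ("true", "1", "yes")
    if t == "object" and isinstance(value, dict):
        props = schema.get("properties", {})
        return {k: coerce(v, props.get(k, {})) for k, v in value.items()}
    if t == "array" and isinstance(value, list):
        return [coerce(v, schema.get("items", {})) for v in value]
    return value

schema = {"type": "object", "properties": {"age": {"type": "integer"}}}
print(coerce({"age": "30"}, schema))  # {'age': 30}
```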

Contract Mode

Pass a JSON Schema — get guaranteed conformance. Validates required fields, types, enums, and min/max constraints. Auto-retries with schema-aware prompts on failure.
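The kinds of checks involved can be sketched with the standard library (illustrative only; the real service validates server-side and retries with schema-aware prompts on failure):

```python
def violations(obj, schema):
    # Collect required-field, type, and enum violations against a
    # simplified JSON Schema subset.
    errs = []
    for field in schema.get("required", []):
        if field not in obj:
            errs.append(f"missing required field: {field}")
    types = {"string": str, "integer": int, "boolean": bool}
    for key, spec in schema.get("properties", {}).items():
        if key not in obj:
            continue
        expected = types.get(spec.get("type"))
        if expected and not isinstance(obj[key], expected):
            errs.append(f"{key}: expected {spec['type']}")
        if "enum" in spec and obj[key] not in spec["enum"]:
            errs.append(f"{key}: not in enum {spec['enum']}")
    return errs

schema = {
    "required": ["status"],
    "properties": {"status": {"type": "string", "enum": ["shipped", "pending"]}},
}
print(violations({"status": "lost"}, schema))     # one enum violation
print(violations({"status": "shipped"}, schema))  # []
```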

Tool-Call Repair

OpenAI warns tool args aren't guaranteed valid JSON. We repair function.arguments so your agent framework doesn't break mid-chain.

Repair Provenance

Every response includes X-StreamFix-Applied headers listing exactly which repairs ran. Build alerts and dashboards on stable repair names.
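For example, a monitoring hook might parse the header value and alert on the heavier repairs (the helper names and the "severe" set below are assumptions for illustration, not part of the API):

```python
# Repairs considered worth alerting on -- an assumed policy, not a
# StreamFix-defined list.
SEVERE = {"type_coerce", "fix_python_literals"}

def repairs_from_header(header_value: str) -> set:
    # X-StreamFix-Applied is a comma-separated list of repair names.
    return {name.strip() for name in header_value.split(",") if name.strip()}

def should_alert(header_value: str) -> bool:
    return bool(repairs_from_header(header_value) & SEVERE)

applied = "fence_strip, fix_single_quotes, type_coerce"
print(should_alert(applied))  # True
```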

Zero Data Retention

Passthrough proxy. Content processed in memory and immediately discarded. Never logged or trained on.

One Line to Guaranteed JSON

Change base_url. Everything else stays the same.

  • OpenAI SDK compatible
  • Multi-model via OpenRouter
```python
from openai import OpenAI

# Just point to our gateway
client = OpenAI(
    base_url="https://streamfix.dev/v1",
    api_key="sk_YOUR_KEY"
)

# Works with any model
resp = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[...]
)
```

FAQ

Why not just use response_format or structured output?

If you're on a single provider with reliable structured output, you should use it.

The benchmark tested the "plain prompt baseline" because many production setups involve:

  • Multi-provider routing (OpenRouter, fallback chains)
  • Models where structured output isn't available or behaves inconsistently
  • Extra latency from constrained decoding
  • Wrappers that break consumers even with structured mode enabled

StreamFix is for the "messy middle" where you can't guarantee consistent structured output across all your providers.

"Streaming JSON makes no sense" — How does this actually work?

Correct: a full JSON array isn't valid until the closing ].

But that's not what we're doing. We perform incremental object extraction for UI rendering:

The model streams:

```json
[{"id":1,"name":"A"}, {"id":2,"name":"B"}, ...
```

We don't parse the whole array. Instead we:

1. Maintain a rolling buffer
2. Detect when an object is complete (brace-balanced ...})
3. Parse that single object
4. Render it immediately

So the UI shows Item #1 while Item #10 is still generating. This improves perceived latency without waiting for the final ].
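Those steps can be sketched as runnable Python. Note the simplification: this brace counter ignores braces inside string values, which a production scanner must track.

```python
import json

def iter_objects(chunks):
    # Yield each top-level object from a streaming JSON array as soon as
    # its braces balance, without waiting for the closing ].
    buf, depth, start, pos = "", 0, None, 0
    for chunk in chunks:
        buf += chunk  # rolling buffer (never trimmed in this sketch)
        while pos < len(buf):
            ch = buf[pos]
            if ch == "{":
                if depth == 0:
                    start = pos
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:  # object is brace-balanced: complete
                    yield json.loads(buf[start:pos + 1])
            pos += 1

# Chunks as they might arrive over SSE; the array never closes.
stream = ['[{"id":1,"na', 'me":"A"}, {"id"', ':2,"name":"B"}, {"id":3']
print(list(iter_objects(stream)))  # [{'id': 1, 'name': 'A'}, {'id': 2, 'name': 'B'}]
```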

What did the benchmark actually show?

Across 672 API calls with plain prompts (8 models, 7 tasks, temperature=0):

  • Strict json.loads(content) worked only 33.3% of the time
  • But 99.5% of responses contained valid JSON that was merely wrapped in fences/prose/think tags
  • 95.5% of failures were markdown fences — not logic errors

A simple cleanup layer increased strict parse success to 98.4% without changing prompts.
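A cleanup pass in that spirit can be sketched in a few lines (the regexes below are an assumption for illustration; the benchmark's actual layer is described in the study):

```python
import json
import re

FENCE = "`" * 3  # the ``` fence marker, built indirectly for readability

def extract_json(content: str):
    # Drop <think>...</think> reasoning blocks (e.g. DeepSeek R1).
    content = re.sub(r"<think>.*?</think>", "", content, flags=re.DOTALL)
    # If the payload sits inside a markdown fence, keep only the fenced body.
    match = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, content, flags=re.DOTALL)
    if match:
        content = match.group(1)
    return json.loads(content.strip())

raw = "Sure! Here is the result:\n" + FENCE + 'json\n{"status": "shipped"}\n' + FENCE
print(extract_json(raw))  # {'status': 'shipped'}
```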

→ Read the full benchmark study

Who is StreamFix for?

High fit:

  • Agent pipelines (CrewAI, LangGraph, AutoGen, n8n) where one malformed output kills the chain
  • Multi-provider routing (OpenRouter, fallback chains) with inconsistent structured output support
  • Tool-calling agents where function.arguments can be malformed
  • Streaming UIs that want incremental rendering
  • Teams that need guaranteed schema conformance with type coercion and auto-retry (Contract Mode)

Low fit:

  • Local llama.cpp with grammar constraints
  • Single provider with reliable structured outputs

Pricing

Beta
Free

While StreamFix is in beta, all features are free.
No credit card. No commitment.

  • 1,000 free requests on signup (1 credit = 1 request)
  • Full API + streaming + tool-call repair
  • Contract Mode: schema validation + type coercion + auto-retry
  • Provenance headers + repair diagnostics
  • Share feedback → get 1,000 more credits
Get Free Key →

Running low? Reply to your welcome email with what you're building and what's broken — we'll top you up.

Paid plans after beta. Early users get notified first.
