API Reference

StreamFix is a drop-in replacement for the OpenAI API. It proxies your requests to OpenRouter (via the shared pool or your own key), repairing broken JSON and validating schemas in real time.

Recommended Setup (Python)
from openai import OpenAI

client = OpenAI(
    base_url="https://streamfix.dev/v1",
    api_key="sk_YOUR_KEY"
)

# Use normally - works with any model
resp = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}]
)

Authentication

Authenticate requests via the Authorization header using your Bearer token.

Authorization: Bearer sk_YOUR_API_KEY

Note: You can also Bring Your Own Key (BYOK) for upstream providers using the X-Provider-Authorization header.
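The two headers can be assembled in plain Python. A minimal sketch (the `auth_headers` helper name is our own, not part of any SDK):

```python
def auth_headers(streamfix_key, provider_key=None):
    """Build StreamFix request headers.

    provider_key is optional: when set, it is forwarded upstream
    via the BYOK X-Provider-Authorization header.
    """
    headers = {"Authorization": f"Bearer {streamfix_key}"}
    if provider_key is not None:
        headers["X-Provider-Authorization"] = f"Bearer {provider_key}"
    return headers
```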

POST

/v1/chat/completions

Standard OpenAI-compatible chat completion endpoint. Supports JSON mode, streaming, and function calling. All standard OpenAI parameters (temperature, max_tokens, top_p, etc.) are passed through to the upstream provider.

Parameters

Param Type Required Description
model string Yes Target model (e.g. openai/gpt-4o)
messages array Yes Chat history
stream boolean No Enable streaming response
tools array No Tool definitions for function calling
schema object No JSON Schema for Contract Mode (3 credits). Pass via the extra_body wrapper when using the OpenAI SDK.

Streaming

StreamFix supports Server-Sent Events (SSE) with token-by-token repair. The FSM engine buffers only the minimum tokens needed (approximately 10 characters) to apply repairs without delaying the stream.

Real-Time Repair: Unquoted keys, trailing commas, and <think> tags are fixed/stripped on-the-fly. The client receives valid JSON chunks with sub-millisecond overhead.

Example

from openai import OpenAI

client = OpenAI(
    base_url="https://streamfix.dev/v1",
    api_key="sk_YOUR_KEY"
)

stream = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "List 3 items as JSON"}],
    stream=True
)

# Chunks arrive repaired in real-time
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)

Repair Pipeline

Every response passes through a multi-stage pipeline. Each stage is independent and applies only when needed.

Extract (FSM) → Repair (syntax) → Coerce (types) → Validate (schema) → Retry

Stage 1: Extraction

Isolates JSON from surrounding prose, markdown fences, and reasoning tags.

Repair Name Description
fence_strip Removes markdown code fences (```json ... ```)
think_tag_strip Strips <think>...</think> reasoning tags (DeepSeek, etc.)
prose_extract Extracts JSON object/array from surrounding prose text

Stage 2: Syntax Repair

Fixes common JSON syntax violations produced by LLMs.

Repair Name Description
remove_trailing_comma Removes trailing commas before } or ]
quote_unquoted_keys Wraps unquoted object keys in double quotes
fix_single_quotes Converts single-quoted strings to double-quoted
close_truncated_json Closes unclosed braces/brackets from truncated output
fix_python_literals Converts Python True/False/None to JSON equivalents
fix_leading_zeros Removes invalid leading zeros from numbers (e.g. 007 to 7)
insert_null_for_empty_values Replaces empty values with null (e.g. {"key": })
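To make these concrete, here is an illustrative sketch of one repair, remove_trailing_comma. This is not StreamFix's implementation; in particular, a real repairer must avoid touching commas inside string values:

```python
import json
import re

def remove_trailing_commas(text):
    # Drop a comma that appears directly before } or ].
    # Naive sketch: does not protect commas inside string literals.
    return re.sub(r",\s*([}\]])", r"\1", text)

broken = '{"items": [1, 2, 3,], "ok": true,}'
repaired = remove_trailing_commas(broken)
data = json.loads(repaired)  # parses cleanly after repair
```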

Stage 3: Contract

Applied when a schema is provided (Contract Mode).

Repair Name Description
type_coerce Coerces values to match schema types (see Type Coercion)

Schema Validation (Contract Mode)

Enforce strict adherence to a JSON Schema. When a schema is provided, StreamFix runs the full pipeline: it repairs syntax, coerces types to match the schema, validates the result, and auto-retries with a schema-aware prompt if validation fails.

Auto-Retry: On validation failure, StreamFix re-prompts the model with a schema-aware system message that includes enum values, min/max constraints, and required fields. Maximum 1 retry per request.

resp = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Extract the order info"}],
    extra_body={
        "schema": {
            "type": "object",
            "required": ["order_id", "total", "status"],
            "properties": {
                "order_id": {"type": "integer"},
                "total": {"type": "number"},
                "status": {"type": "string", "enum": ["pending", "shipped", "delivered"]}
            }
        }
    }
)

Limitations

  • Streaming requests receive repair and coercion but no auto-retry (the response is already sent).
  • Maximum 1 retry per request.
  • 3 credits are charged regardless of whether a retry occurs.

Cost: 3 credits (Repair + Coerce + Validate + Retry)

Strict Mode

Strict Mode guarantees that a response is parseable JSON or returns a structured error. Enable it by setting the X-StreamFix-Strict header.

X-StreamFix-Strict: true

Behavior

  • Returns 200 when the response is valid JSON after repair.
  • Returns 422 when the JSON could not be parsed even after all repair stages.
  • The 422 response body includes a typed error object with details about the parse failure.
  • Not compatible with streaming -- returns 400 if stream: true is set alongside Strict Mode.
  • Can be combined with Contract Mode for maximum guarantees (schema validation + parse-or-fail).

Example

import openai
from openai import OpenAI

client = OpenAI(
    base_url="https://streamfix.dev/v1",
    api_key="sk_YOUR_KEY",
    default_headers={"X-StreamFix-Strict": "true"}
)

# 200 = valid JSON; a 422 parse failure raises an exception in the SDK
try:
    resp = client.chat.completions.with_raw_response.create(
        model="openai/gpt-4o",
        messages=[{"role": "user", "content": "Return a JSON object"}]
    )
    print(resp.status_code)  # 200
except openai.APIStatusError as e:
    print(e.status_code)  # 422 on parse failure

422 Response Body

{
  "error": {
    "type": "parse_failure",
    "message": "Failed to extract or repair valid JSON from response",
    "request_id": "req_abc123def456",
    "extraction_status": "FAILED",
    "raw_content_preview": "Sure! Here is some info..."
  }
}

Type Coercion

When a schema is provided, StreamFix automatically coerces values to match the declared types before validation. This fixes the most common schema violations from LLMs without requiring a retry.

Input Schema Type Output
"30" integer 30
"true" boolean true
30.0 integer 30
"3.14" number 3.14

Note: Coercion is recorded as type_coerce in the x-streamfix-applied response header, so you can track how often your chosen model returns the wrong type.
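With the OpenAI SDK, response headers are available via client.chat.completions.with_raw_response.create(...) and the returned object's .headers. The parsing helper below is our own, not part of the SDK:

```python
def applied_repairs(headers):
    """Split the x-streamfix-applied header into individual repair names."""
    raw = headers.get("x-streamfix-applied", "")
    return [name for name in raw.split(",") if name]

# e.g. with headers taken from raw.headers after a with_raw_response call
repairs = applied_repairs({"x-streamfix-applied": "fence_strip,type_coerce"})
```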

GET

/result/{request_id}

Retrieve the repaired JSON and validation metadata for any request (streaming or non-streaming).

{
  "request_id": "req_abc123",
  "status": "REPAIRED",
  "repairs_applied": ["remove_trailing_comma"],
  "repaired_content": "{\"status\": \"ok\"}",
  "schema_valid": true,
  "response_time_ms": 450
}
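A stdlib-only sketch of calling this endpoint. We assume the /result path is served from the same host as the API base (the docs show it without the /v1 prefix); adjust if your deployment differs:

```python
import json
import urllib.request

def result_request(request_id, api_key, base_url="https://streamfix.dev"):
    # Build the authenticated GET request for /result/{request_id}.
    return urllib.request.Request(
        f"{base_url}/result/{request_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )

def fetch_result(request_id, api_key):
    # Perform the request and decode the JSON metadata body.
    with urllib.request.urlopen(result_request(request_id, api_key)) as resp:
        return json.load(resp)
```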

Tool Calls Support

StreamFix automatically repairs broken JSON in tool_calls[].function.arguments for non-streaming requests.

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Call test_function"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "test_function",
            "parameters": {
                "type": "object",
                "properties": {"arg": {"type": "string"}}
            }
        }
    }]
)

# Tool call arguments are automatically repaired if broken
tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    args = tool_calls[0].function.arguments  # Already repaired JSON

Note: For streaming requests, tool call arguments are passed through as-is to ensure low latency. Use non-streaming for guaranteed tool call repair.

Supported Models

We support all models via OpenRouter. Here are our recommended defaults for reliability.

openai/gpt-4o-mini

Best balance of speed and cost ($0.000005/req)

anthropic/claude-3.5-haiku

Excellent for structured data extraction

qwen/qwen-2.5-72b-instruct

High performance open-source model

Response Headers

StreamFix returns all repair and validation metadata in HTTP headers to maintain 100% OpenAI API compatibility in the response body.

Always Present

x-streamfix-request-id: req_abc123def456
x-streamfix-status: repaired
x-streamfix-applied: fence_strip,remove_trailing_comma
x-streamfix-credits-used: 1
x-streamfix-credits-remaining: 999
x-streamfix-mode: shared

Contract Mode (When Schema Provided)

x-streamfix-contract-mode: active
x-streamfix-schema-valid: true
x-streamfix-schema-errors: 0
x-streamfix-retry-count: 1  # If retry was triggered

Tool Calls (When Arguments Repaired)

x-streamfix-tool-args-repaired: 1  # Non-streaming only

Header Reference

Header Values Description
x-streamfix-status pass | repaired | failed Overall repair status for this request
x-streamfix-applied comma-separated names Repair names applied (e.g. fence_strip,remove_trailing_comma)
x-streamfix-repair-status applied | none | passthrough Whether repairs were applied to the response
x-streamfix-repairs-applied 0-N Number of repairs made
x-streamfix-contract-mode active Present when Contract Mode is enabled
x-streamfix-schema-valid true | false Schema validation result (Contract Mode only)
x-streamfix-artifact-stored true | false Whether repair artifact was saved (requires opt-in)
x-streamfix-tool-args-repaired 0-N Number of tool call arguments repaired (non-streaming only)
x-streamfix-retry-count 0-1 Number of retries attempted (Contract Mode only)
x-streamfix-client-request-id string Echoed correlation ID from X-Request-Id header

Repair Taxonomy

The x-streamfix-applied header contains a comma-separated list of repair names. Use this to monitor which models produce which errors.

Stage Name Example
Extraction fence_strip ```json{...}``` → {...}
Extraction think_tag_strip <think>...</think>{...} → {...}
Extraction prose_extract Here is the JSON: {...} → {...}
Syntax remove_trailing_comma {"a":1,} → {"a":1}
Syntax quote_unquoted_keys {key:"v"} → {"key":"v"}
Syntax fix_single_quotes {'a':'b'} → {"a":"b"}
Syntax close_truncated_json {"a":1 → {"a":1}
Syntax fix_python_literals {"ok":True} → {"ok":true}
Syntax fix_leading_zeros {"n":007} → {"n":7}
Syntax insert_null_for_empty_values {"a":} → {"a":null}
Contract type_coerce "30" → 30 (integer)

Error Codes

Code Meaning Solution
401 Unauthorized Check your API key in the Authorization header.
402 Payment Required Insufficient credits. Top up at /account/purchase.
408 Request Timeout Streaming timeout exceeded (300 seconds).
413 Payload Too Large Request exceeds 10 MB limit.
422 Unprocessable Entity Strict Mode: JSON could not be parsed after all repair stages.
429 Rate Limited You exceeded 60 requests/minute. Slow down.
502 Bad Gateway Upstream provider (e.g. OpenAI) error. Retry.
POST

/account/create

Create a new account and receive an API key with 1000 free credits.

Request

POST /account/create?email=user@example.com

Response

{
  "api_key": "sk_abc123...",
  "email": "user@example.com",
  "credits": 1000,
  "message": "Account created! Save your API key - it won't be shown again."
}

Note: If the email already exists, the API key is rotated and existing credits are preserved.

GET

/account/balance

Check your remaining credits. Requires authentication.

Request

GET /account/balance
Authorization: Bearer sk_YOUR_API_KEY

Response

{
  "credits_remaining": 997,
  "is_active": true
}
POST

/account/purchase-by-email

Create a Stripe checkout session for purchasing credits. User-friendly: uses email instead of API key.

Request

POST /account/purchase-by-email
Content-Type: application/json

{
  "email": "user@example.com"
}

Response

{
  "checkout_url": "https://checkout.stripe.com/...",
  "credits": 10000,
  "price_usd": 10.0
}
$10 = 10,000 credits ($0.001 per credit)

BYOK (Bring Your Own Key)

Use your own provider API keys to avoid using the shared pool.

from openai import OpenAI

client = OpenAI(
    base_url="https://streamfix.dev/v1",
    api_key="sk_YOUR_STREAMFIX_KEY",
    default_headers={
        "X-Provider-Authorization": "Bearer YOUR_OPENROUTER_KEY"
    }
)

Benefits: Use your own OpenRouter account, avoid shared pool rate limits, more control over costs. StreamFix credits are still deducted (1 per request, 3 for Contract Mode), but upstream costs use your account.

Pricing

Item Cost
Basic request (repair only) 1 credit
Contract Mode (schema validation + retry) 3 credits
Free tier on signup 1,000 credits
Credit purchase $10 = 10,000 credits

Best Practices

  • Use Contract Mode for agent pipelines: Agents that parse JSON tool outputs or structured data benefit from schema validation and auto-retry -- not just streaming UIs.
  • Combine Strict Mode + Contract Mode: For maximum guarantees, enable both. You get schema validation with retry and a hard 422 if the output still cannot be parsed.
  • Monitor X-StreamFix-Applied: This header tells you exactly which repairs were needed. Use it to understand which models produce bad output and to track repair rates over time.
  • Use BYOK to control upstream costs: Pass your own OpenRouter key via X-Provider-Authorization to separate upstream LLM costs from StreamFix credits.
  • Non-streaming for tool calls, streaming for UIs: Tool call argument repair only works in non-streaming mode. Use streaming for user-facing chat interfaces where latency matters.
  • Always use JSON Mode: Set response_format: {"type": "json_object"} when possible. StreamFix repairs what breaks, but starting with valid intent helps.
  • Handle 429s: Implement exponential backoff for rate limits (60 requests/minute).
  • Use Correlation IDs: Send X-Request-Id or X-Client-Request-Id header to track requests across your system logs.
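The 429 guidance above can be sketched as a small retry wrapper. RateLimited stands in for your SDK's rate-limit error (e.g. openai.RateLimitError); substitute accordingly:

```python
import random
import time

class RateLimited(Exception):
    """Stand-in for your SDK's 429 error (e.g. openai.RateLimitError)."""

def with_backoff(call, max_attempts=5, base_delay=1.0):
    # Exponential backoff with a little jitter: ~1s, 2s, 4s, ...
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) + random.random() * 0.1)
```

Usage would look like with_backoff(lambda: client.chat.completions.create(...)).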