Common LLM JSON Errors & Fixes

Complete reference guide with real examples from 672 production tests. Copy-paste solutions for every error type.


Quick stats: 95.5% of JSON failures are markdown fences. Trailing commas: 2.5%. The rest: 2%. Focus your fixes accordingly.

1. Markdown Code Fences

95.5% of all failures (183/192 in benchmark)

Error Example:

```json
{
  "name": "Alice",
  "age": 30
}
```

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Root Cause:

LLMs are trained on GitHub code with markdown formatting. When prompted for JSON, they often wrap it in code blocks out of habit.

Fix: Regex Strip (Recommended - Free)

import re
import json

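def strip_markdown(text):
    # Trim surrounding whitespace first so the ^ and $ anchors can match
    cleaned = text.strip()
    # Remove the opening fence (with an optional "json" language tag)
    cleaned = re.sub(r'^```(?:json)?\s*', '', cleaned)
    # Remove the closing fence
    cleaned = re.sub(r'\s*```$', '', cleaned)
    return cleaned.strip()

response_text = '```json\n{"name": "Alice"}\n```'

cleaned = strip_markdown(response_text)
data = json.loads(cleaned)  # ✓ Works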

Or use a library: json-repair (4.5k stars) handles this plus 20 other edge cases. Alternatively, use a hosted proxy like StreamFix if you don't want to maintain the code yourself.
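
The regex strip above assumes the fence is the first and last thing in the response. When the model also adds a sentence before or after the fence, searching for the fenced body may work better. The following is a minimal sketch under that assumption; `extract_fenced_json` and `FENCE_RE` are hypothetical names, not part of any library.

```python
import json
import re

# Hypothetical pattern: opening fence with an optional language tag,
# then the body (captured lazily), then the closing fence.
FENCE_RE = re.compile(r"```[A-Za-z0-9_-]*[ \t]*\r?\n(.*?)```", re.DOTALL)

def extract_fenced_json(text):
    """Return the body of the first fenced block, or the text itself if unfenced."""
    match = FENCE_RE.search(text)
    body = match.group(1) if match else text
    return body.strip()

response_text = (
    "Sure, here you go:\n"
    "```json\n"
    '{"name": "Alice", "age": 30}\n'
    "```\n"
    "Let me know if you need more."
)
data = json.loads(extract_fenced_json(response_text))
```

Unfenced responses pass through unchanged, so the helper can run on every response before parsing.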

2. Trailing Commas

~2.5% of failures

Error Example:

{
  "name": "Alice",
  "age": 30,
}

JSONDecodeError: Expecting property name enclosed in double quotes

Root Cause:

Trailing commas are valid in JavaScript and Python literals but invalid in the JSON spec. Models trained on code datasets carry the habit over.

Fix Option 1: Regex (Python)

import re
import json

def remove_trailing_commas(text):
    # Remove commas that sit directly before a closing brace/bracket.
    # Caveat: this also rewrites string values that contain ",}" or ",]".
    cleaned = re.sub(r',(\s*[}\]])', r'\1', text)
    return cleaned

response = '{"name": "Alice", "age": 30,}'
cleaned = remove_trailing_commas(response)
data = json.loads(cleaned)  # ✓ Works
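
A regex cannot tell a structural comma from one inside a string value. If your payloads may contain text such as "a,}", a small scanner that tracks whether it is inside a string avoids that corruption. This is a sketch, not a full parser; `remove_trailing_commas_safe` is a hypothetical helper name.

```python
import json

def remove_trailing_commas_safe(text):
    """Drop commas that precede } or ], ignoring anything inside strings."""
    out = []
    in_string = False
    escaped = False
    for i, ch in enumerate(text):
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False          # this char was escaped; consume it
            elif ch == "\\":
                escaped = True           # next char is escaped
            elif ch == '"':
                in_string = False        # closing quote
            continue
        if ch == '"':
            in_string = True
            out.append(ch)
            continue
        if ch == ",":
            # Look past whitespace; skip the comma if a closer follows.
            j = i + 1
            while j < len(text) and text[j].isspace():
                j += 1
            if j < len(text) and text[j] in "}]":
                continue
        out.append(ch)
    return "".join(out)

response = '{"note": "a,}", "tags": ["x", "y",], }'
data = json.loads(remove_trailing_commas_safe(response))
```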

Fix Option 2: json-repair Library

from json_repair import repair_json

response = '{"name": "Alice", "age": 30,}'
cleaned = repair_json(response)
data = json.loads(cleaned)  # ✓ Works

Fix Option 3: StreamFix (API-level)

# Automatic - no parsing changes needed, only the client's base URL
from openai import OpenAI
client = OpenAI(base_url="https://streamfix.up.railway.app/v1", api_key="YOUR_STREAMFIX_KEY")

3. Unquoted Keys

~1% of failures

Error Example:

{
  name: "Alice",
  age: 30
}

JSONDecodeError: Expecting property name enclosed in double quotes

Root Cause:

The model emits JavaScript object-literal syntax, where keys may be bare identifiers. That is valid in JS but invalid in JSON.

Fix: Regex (Python)

import re
import json

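def quote_keys(text):
    # Quote bare identifiers that follow "{" or "," and precede ":".
    # Anchoring on "{" or "," avoids touching colons in values like "12:30".
    # Caveat: a string value containing ", word:" can still be altered.
    cleaned = re.sub(r'([{,]\s*)([A-Za-z_]\w*)(\s*:)', r'\1"\2"\3', text)
    return cleaned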

response = '{name: "Alice", age: 30}'
cleaned = quote_keys(response)
data = json.loads(cleaned)  # ✓ Works
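
A plain substitution over the whole text cannot distinguish keys from string contents. One common workaround is an alternation that matches complete string literals first and passes them through untouched, so only text outside strings is rewritten. The sketch below assumes keys are simple identifiers; `quote_keys_safe` and `TOKEN_RE` are hypothetical names.

```python
import json
import re

# Alternative 1 matches a whole double-quoted string (left as-is).
# Alternative 2 matches a bare identifier after "{" or "," and before ":".
TOKEN_RE = re.compile(r'"(?:\\.|[^"\\])*"|([{,]\s*)([A-Za-z_]\w*)(?=\s*:)')

def quote_keys_safe(text):
    def fix(match):
        if match.group(2) is None:
            return match.group(0)  # a string literal: keep unchanged
        return f'{match.group(1)}"{match.group(2)}"'
    return TOKEN_RE.sub(fix, text)

response = '{name: "see: http://x.y, ok: yes", age: 30}'
data = json.loads(quote_keys_safe(response))
```

Keys that are already quoted match the first alternative and are left alone, so the helper is safe to apply to valid JSON as well.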

4. Single Quotes

~0.5% of failures

Error Example:

{'name': 'Alice', 'age': 30}

JSONDecodeError: Expecting property name enclosed in double quotes

Root Cause:

Python's repr() prints strings with single quotes by default. Models trained on Python code emit Python-style dict literals instead of JSON.

Fix: String Replace (Python)

import json

response = "{'name': 'Alice', 'age': 30}"
cleaned = response.replace("'", '"')
data = json.loads(cleaned)  # ✓ Works

# Warning: Fails if values contain apostrophes
# Better: Use StreamFix or json-repair for robust handling
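
Because single-quoted output is usually a valid Python literal, the standard library's `ast.literal_eval` can parse it without touching apostrophes inside values. A minimal sketch, assuming the response contains no JSON-only tokens (`true`, `false`, `null`), which `literal_eval` rejects; `parse_python_style` is a hypothetical helper name.

```python
import ast
import json

def parse_python_style(text):
    """Try strict JSON first, then fall back to a Python literal."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # literal_eval only evaluates literals; it never runs arbitrary code.
        # It raises ValueError or SyntaxError on anything else.
        return ast.literal_eval(text)

response = """{'name': "O'Brien", 'age': 30}"""
data = parse_python_style(response)
```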

5. Incomplete JSON (Streaming)

Common in SSE streaming

Error Example:

{"name": "Alice", "age": 

JSONDecodeError: Expecting value: line 1 column 26 (char 25)

Root Cause:

Server-Sent Events (SSE) send partial chunks. If you try to parse mid-stream, JSON is incomplete.

Fix Option 1: Wait for Complete Stream

import json
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "..."}],
    stream=True
)

# Accumulate chunks
full_content = ""
for chunk in response:
    # Some chunks carry no choices or an empty delta; skip those
    if chunk.choices and chunk.choices[0].delta.content:
        full_content += chunk.choices[0].delta.content

# Parse only when complete
data = json.loads(full_content)  # ✓ Works
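
The accumulate-then-parse idea can be checked without a network call. The sketch below simulates a stream with a hard-coded list of chunks (an assumption for illustration) and shows that parsing fails until the last chunk arrives; `try_parse` is a hypothetical helper name.

```python
import json

def try_parse(buffer):
    """Return the parsed value, or None while the buffer is still incomplete."""
    try:
        return json.loads(buffer)
    except json.JSONDecodeError:
        return None

# Simulated stream: the JSON document arrives in three pieces.
chunks = ['{"name": "Al', 'ice", "age": ', '30}']

buffer = ""
results = []
for chunk in chunks:
    buffer += chunk
    results.append(try_parse(buffer))
```

Returning `None` for "not ready yet" is ambiguous if the payload itself can be JSON `null`; a dedicated sentinel would be safer in that case.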

Fix Option 2: Use Instructor Partial Mode

import instructor
from openai import OpenAI
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

# Patch the OpenAI client so it accepts response_model
client = instructor.from_openai(OpenAI())

# Stream partial objects safely
for partial_user in client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=instructor.Partial[User],  # Handles incomplete JSON
    messages=[{"role": "user", "content": "..."}],
    stream=True
):
    print(partial_user)  # ✓ No errors

6. Extra Text Before/After JSON

~0.3% of failures

Error Example:

Here is the user data:
{"name": "Alice", "age": 30}
Hope that helps!

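JSONDecodeError: Expecting value: line 1 column 1 (char 0)

(If the JSON comes first and only trailing text follows, the error is "Extra data" instead.)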

Root Cause:

Model adds conversational text despite "output JSON only" instructions.

Fix: Extract JSON Regex (Python)

import re
import json

def extract_json(text):
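    # Greedy match from the first opening brace/bracket to the last closing one.
    # Caveat: braces or brackets in the surrounding prose get captured too.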
    match = re.search(r'[{\[].*[}\]]', text, re.DOTALL)
    if match:
        return match.group(0)
    return text

response = """Here is the user data:
{"name": "Alice", "age": 30}
Hope that helps!"""

json_str = extract_json(response)
data = json.loads(json_str)  # ✓ Works
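
The greedy regex breaks when the trailing text itself contains a brace or bracket. The standard library's `json.JSONDecoder.raw_decode` parses one JSON value starting at a given index and ignores whatever follows, which sidesteps the problem. A minimal sketch; `extract_first_json` is a hypothetical helper name.

```python
import json

def extract_first_json(text):
    """Return the first JSON object or array embedded in text."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch not in "{[":
            continue
        try:
            # raw_decode returns (value, end_index) and tolerates trailing text.
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            continue  # not valid JSON from here; try the next candidate
    raise ValueError("no JSON object or array found")

response = (
    "Here is the user data:\n"
    '{"name": "Alice", "age": 30}\n'
    "Hope that helps! Wrap other records in {curly} braces too."
)
data = extract_first_json(response)
```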

7. Missing Value Quotes

Rare but critical

Error Example:

{"name": Alice, "age": 30}

JSONDecodeError: Expecting value: line 1 column 10 (char 9)

Root Cause:

The model emits a string value as a bare identifier, as if it were a literal like true, false, or null.

Fix: Requires Full Parser

# Regex too brittle for this case
# Use json-repair or StreamFix

from json_repair import repair_json

response = '{"name": Alice, "age": 30}'
cleaned = repair_json(response)
data = json.loads(cleaned)  # ✓ Works

8. Invalid Escape Sequences

Very rare

Error Example:

{"path": "C:\Users\Alice"}

JSONDecodeError: Invalid \escape

Root Cause:

Backslashes inside JSON strings must be escaped as \\. The model outputs raw Windows paths instead.

Fix: Escape Backslashes (Python)

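import json

response = r'{"path": "C:\Users\Alice"}'  # Raw string: single backslashes, as the model sent them
# Fix programmatically by doubling every backslash.
# Caveat: this also doubles valid escapes such as \n or \", so use it only on path-like output.
cleaned = response.replace('\\', '\\\\')
data = json.loads(cleaned)  # ✓ Works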
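
Doubling every backslash also damages escapes that were already valid. A more selective variant doubles a backslash only when the character after it does not start a legal JSON escape. This is a sketch with known limits: a path such as C:\temp stays ambiguous because \t is a legal escape, and a \u not followed by four hex digits is still rejected. `fix_invalid_escapes` is a hypothetical helper name.

```python
import json
import re

# Characters that may legally follow a backslash in a JSON string.
VALID_ESCAPES = '"\\/bfnrtu'

def fix_invalid_escapes(text):
    def fix(match):
        # Keep escapes JSON already understands; double the backslash otherwise.
        if match.group(1) in VALID_ESCAPES:
            return match.group(0)
        return "\\\\" + match.group(1)
    # Consume backslash + next char as a pair so "\\" is never split.
    return re.sub(r"\\(.)", fix, text, flags=re.DOTALL)

response = r'{"path": "C:\Users\Alice", "note": "line1\nline2"}'
data = json.loads(fix_invalid_escapes(response))
```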

Universal Solution: StreamFix

Instead of maintaining multiple regex fixes, use StreamFix to handle all cases automatically:

# Get API key (1000 free credits)
curl -X POST "https://streamfix.up.railway.app/account/create?email=you@example.com"

# Update your code (one line)
client = OpenAI(
    base_url="https://streamfix.up.railway.app/v1",
    api_key="YOUR_STREAMFIX_KEY"
)

# All errors above are now fixed automatically:
# ✓ Markdown fences removed
# ✓ Trailing commas stripped
# ✓ Keys quoted
# ✓ Single quotes converted
# ✓ Extra text removed
# ✓ Escape sequences fixed

# Success rate: 33.3% → 98.4% (verified across 672 tests)
