Handling Broken JSON from LLMs: A Field Guide
Why does JSON.parse() fail so often with LLMs? Because Large Language Models are token predictors, not syntax engines. They "know" what JSON looks like, but they don't validate it as they generate it.
This guide covers **six common failures** and provides practical Python code to fix them yourself.
1. How to Fix JSONDecodeError from Trailing Commas
The Error: `json.decoder.JSONDecodeError: Expecting value` (trailing comma in an array) or `Expecting property name enclosed in double quotes` (trailing comma in an object)
Why it happens: LLMs are trained on massive amounts of code, including JavaScript and Python (where trailing commas are often valid). However, the strict JSON standard (RFC 8259) forbids them. If an LLM generates a list and stops, it often leaves that final comma hanging.
```json
{"items": [1, 2, 3,]}
```
Solution for Python
Don't use a simple global regex replace: it's dangerous and can corrupt valid strings that contain commas. Use a tokenizer or a context-aware parser.
```python
import re

def fix_trailing_commas(json_str):
    # Remove a comma that sits directly before a closing ] or }.
    # Note: this is a simplified regex. For production,
    # use a proper lexer to avoid touching strings.
    regex = r',(?=\s*?[}\]])'
    return re.sub(regex, '', json_str)

# Usage
bad_json = '{"data": [1, 2,]}'
fixed = fix_trailing_commas(bad_json)
print(fixed)  # {"data": [1, 2]}
```
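The caveat in the comments is real, not theoretical. A minimal demonstration of the same lookahead regex silently corrupting a perfectly valid document whose string value happens to contain `,]`:

```python
import re
import json

# The same lookahead regex used in fix_trailing_commas above.
regex = r',(?=\s*?[}\]])'

valid = '{"s": "a,]"}'         # already valid JSON, no repair needed
broken = re.sub(regex, '', valid)
print(broken)                  # {"s": "a]"} -> data silently changed
```

The output still parses, which makes this failure mode especially hard to spot downstream.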
2. Handling Truncated JSON Streams
The Error: `json.decoder.JSONDecodeError: Expecting value: line 1 column ... (char ...)` (or `Unterminated string starting at: ...` if the cut-off landed inside a string)
Why it happens: Network timeouts, strict `max_tokens` limits, or "stop sequences" triggering early. The JSON stream just... stops.
```json
{"users": [{"id": 1, "name": "Al
```
Solution: The Stack Method (Python)
You need to track open braces/brackets and close them in reverse order. This is difficult to do correctly if the cut-off happens inside a string literal.
```python
def balance_json(json_str):
    stack = []
    in_string = False
    escape = False
    for char in json_str:
        if escape:
            # Previous char was a backslash; this char is escaped
            escape = False
            continue
        if char == '\\':
            escape = True
            continue
        if char == '"':
            in_string = not in_string
            continue
        if in_string:
            continue  # ignore structural chars inside strings
        if char == '{':
            stack.append('}')
        elif char == '[':
            stack.append(']')
        elif char in ('}', ']'):
            if stack:
                stack.pop()
    if in_string:
        json_str += '"'  # close the dangling string first
    return json_str + "".join(reversed(stack))
```
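When you control the read loop, a complementary approach is to parse only what is already complete and keep the incomplete tail buffered for the next chunk. The stdlib's `json.JSONDecoder.raw_decode` supports this; the helper name `drain_complete_values` below is our own, a sketch rather than a library API:

```python
import json

def drain_complete_values(buffer):
    """Pull every complete top-level JSON value out of `buffer`.
    Returns (values, remaining_text); the remainder is the
    incomplete tail to carry into the next read."""
    decoder = json.JSONDecoder()
    values = []
    idx = 0
    n = len(buffer)
    while idx < n:
        # Skip whitespace between values
        while idx < n and buffer[idx] in " \t\r\n":
            idx += 1
        if idx >= n:
            break
        try:
            value, end = decoder.raw_decode(buffer, idx)
        except json.JSONDecodeError:
            break  # tail is incomplete; keep it buffered
        values.append(value)
        idx = end
    return values, buffer[idx:]

values, rest = drain_complete_values('{"a": 1} {"b": ')
print(values, repr(rest))  # [{'a': 1}] '{"b": '
```

This avoids repairing anything at all: nothing truncated ever reaches `json.loads`.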
3. Stripping DeepSeek R1 <think> Tags
The Issue: Reasoning models "think" before they speak. They output their chain-of-thought wrapped in XML-like tags (e.g., <think>...</think>) before outputting the final JSON.
If you parse the raw output, your parser chokes on the XML tags.
How to fix it
You must strip these tags before parsing. If you are streaming, this is tricky because the </think> tag might be split across two network chunks.
```python
import re

def strip_think_tags(text):
    # Remove complete <think>...</think> blocks
    pattern = r'<think>.*?</think>'
    cleaned = re.sub(pattern, '', text, flags=re.DOTALL)
    return cleaned.strip()
```
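For streaming, a stateful filter can handle a tag split across chunks by holding back any buffer tail that might be the start of a tag. A sketch of that idea (the class name `ThinkTagFilter` is ours, not part of any SDK):

```python
class ThinkTagFilter:
    """Strips <think>...</think> spans from a stream of text chunks,
    even when a tag arrives split across chunk boundaries."""

    OPEN, CLOSE = "<think>", "</think>"

    def __init__(self):
        self.buf = ""
        self.in_think = False

    def feed(self, chunk):
        """Return the visible text this chunk produces (may be empty)."""
        self.buf += chunk
        out = []
        while True:
            if self.in_think:
                i = self.buf.find(self.CLOSE)
                if i == -1:
                    # Hold back a tail that could be a split closing tag
                    self.buf = self.buf[-(len(self.CLOSE) - 1):]
                    break
                self.buf = self.buf[i + len(self.CLOSE):]
                self.in_think = False
            else:
                i = self.buf.find(self.OPEN)
                if i == -1:
                    # Emit all but a tail that could be a split opening tag
                    keep = len(self.OPEN) - 1
                    out.append(self.buf[:-keep])
                    self.buf = self.buf[-keep:]
                    break
                out.append(self.buf[:i])
                self.buf = self.buf[i + len(self.OPEN):]
                self.in_think = True
        return "".join(out)

    def flush(self):
        """Call once the stream ends to release any held-back tail."""
        out = "" if self.in_think else self.buf
        self.buf = ""
        return out

f = ThinkTagFilter()
chunks = ["<thi", "nk>internal</thi", "nk>", '{"a": 1}']
visible = "".join(f.feed(c) for c in chunks) + f.flush()
print(visible)  # {"a": 1}
```

The held-back tail is at most six or seven characters, so the added latency is a handful of tokens at worst.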
4. Fixing Unquoted Keys (JavaScript Style)
The Error: json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes
Why it happens: Smaller or coding-optimized models often fall back to JavaScript object-literal syntax instead of strict JSON. They might write {name: "John"} instead of {"name": "John"}.
```json
{id: 1, active: true}
```
How to fix it (Python)
You can use the demjson library (slower) or a regex substitution to quote keys. Be careful not to quote keys inside strings.
```python
import re

def quote_keys(json_str):
    # Find unquoted keys: word characters that follow { or ,
    # and precede a colon
    return re.sub(r'([{,]\s*)([a-zA-Z0-9_]+?)\s*:', r'\1"\2":', json_str)
```
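The warning about strings is worth seeing concretely. A minimal demo of the same substitution on both the happy path and a string value that happens to contain the key-like text `b:`:

```python
import re

pattern = r'([{,]\s*)([a-zA-Z0-9_]+?)\s*:'

print(re.sub(pattern, r'\1"\2":', '{id: 1, active: true}'))
# {"id": 1, "active": true}

print(re.sub(pattern, r'\1"\2":', '{"msg": "a, b: c"}'))
# {"msg": "a, "b": c"}  <- corrupted inside the string value
```

The second result is no longer valid JSON, which is why a string-aware pass beats a bare substitution.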
5. Markdown Code Fences
The Issue: Chat models love to be helpful. Even when you ask for "JSON only", they wrap it in Markdown code blocks (```json ... ```) or add conversational filler.
````
Here is the data:
```json
{"a": 1}
```
````
How to fix it
Use a robust extractor that looks for the first valid JSON object or array in the text.
```python
import re

def extract_json_from_markdown(text):
    # Try to find a ```json fenced block first
    match = re.search(r'```(?:json)?\s*(.*?)```', text, re.DOTALL)
    if match:
        return match.group(1).strip()
    # Fallback: start at the first { or [
    start = re.search(r'[\{\[]', text)
    if start:
        return text[start.start():]
    return text
```
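A stricter variant validates while extracting: scan for the first position where a JSON value actually parses, which handles fences, filler text, and stray braces in one pass. A sketch using the stdlib's `json.JSONDecoder.raw_decode` (the helper name `first_json_value` is ours):

```python
import json

def first_json_value(text):
    # Try each { or [ as a candidate start; accept the first
    # position where a complete JSON value parses.
    decoder = json.JSONDecoder()
    for i, ch in enumerate(text):
        if ch in "{[":
            try:
                value, _ = decoder.raw_decode(text, i)
                return value
            except json.JSONDecodeError:
                continue
    return None

print(first_json_value('Here is the data:\n```json\n{"a": 1}\n```'))
# {'a': 1}
```

Note this returns the parsed object rather than a substring, so there is no second `json.loads` step.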
6. Split Tokens (SSE Streaming)
The Issue: In streaming mode (Server-Sent Events), tokens do not map 1:1 to JSON structure. A single key like "description" might arrive as three separate chunks: "des", "crip", "tion".
If you try to parse chunks individually, or naïvely check for valid JSON on every chunk, the parse will fail almost every time.
The Solution: Buffering
You must maintain a buffer state. Only attempt to parse when you detect a potential completion boundary, or use an incremental parser.
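A sketch of the buffering approach: accumulate chunks and only attempt `json.loads` when a cheap depth counter says the top-level braces and brackets are balanced again. The class below is hypothetical and simplified (real code would also handle extra text after the closing brace):

```python
import json

class JsonStreamBuffer:
    """Buffers SSE text chunks; attempts a parse only at a likely
    completion boundary (top-level depth back to zero)."""

    def __init__(self):
        self.parts = []
        self.depth = 0
        self.in_string = False
        self.escape = False
        self.started = False

    def feed(self, chunk):
        """Return the parsed value at a completion boundary, else None."""
        for ch in chunk:
            if self.escape:
                self.escape = False
            elif self.in_string:
                if ch == '\\':
                    self.escape = True
                elif ch == '"':
                    self.in_string = False
            elif ch == '"':
                self.in_string = True
            elif ch in '{[':
                self.depth += 1
                self.started = True
            elif ch in '}]':
                self.depth -= 1
        self.parts.append(chunk)
        if self.started and self.depth == 0:
            return json.loads("".join(self.parts))
        return None

buf = JsonStreamBuffer()
for chunk in ['{"des', 'crip', 'tion": "x"}']:
    result = buf.feed(chunk)
print(result)  # {'description': 'x'}
```

Because the counter is string-aware, a `}` arriving inside a split string value never triggers a premature parse.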
Using StreamFix? You don't need to implement this. Our engine buffers tokens (approx. 10 chars) automatically to handle split tags and keywords without adding latency.
Or... don't maintain any of this code.
Writing regex parsers and state machines is not your core business logic. It's infrastructure plumbing.
StreamFix is a specialized proxy that sits between your code and the LLM. It runs a highly optimized Finite State Machine (FSM) in C/Python to handle all these edge cases (plus streaming split-tokens, markdown fences, and schema validation) in real-time.