Fix DeepSeek JSON Output Issues
DeepSeek R1 leaks <think> reasoning tags into your JSON responses. DeepSeek V3 wraps JSON in markdown fences. Both break json.loads().
Issue 1: <think> tag leakage
DeepSeek R1 is a chain-of-thought model. It "thinks" before answering, and that reasoning is supposed to arrive in a separate field. With many API providers and prompt styles, though, the <think> block bleeds into the content field you actually parse.
```
<think>
The user wants customer data as JSON. I should return
name and email fields. Let me format this correctly...
</think>
{"name": "Alice", "email": "alice@example.com"}
```
```python
content = resp.choices[0].message.content
json.loads(content)
# JSONDecodeError: Expecting value: line 1 column 1 (char 0)
# (because content starts with "<think>", not "{")
```
```python
import re, json

def strip_think_tags(text: str) -> str:
    """Remove <think>...</think> blocks from DeepSeek R1 output."""
    return re.sub(r'<think>.*?</think>', '', text, flags=re.DOTALL).strip()

content = resp.choices[0].message.content
cleaned = strip_think_tags(content)
data = json.loads(cleaned)  # ✅ {"name": "Alice", "email": "alice@example.com"}
```
<think> blocks arrive token-by-token before the JSON starts. You need to buffer and detect the closing </think> tag before beginning JSON assembly. StreamFix handles this automatically.
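If you are rolling the buffering yourself, the detect-then-assemble step can be sketched as follows. This is a minimal sketch: `assemble_json_from_stream` and the simulated token list are illustrative, standing in for the content deltas you would read off a real streaming response.

```python
import json
import re

def assemble_json_from_stream(deltas):
    """Buffer streamed tokens, drop a leading <think> block,
    and return the first complete JSON object that follows it."""
    buffer = ""
    for delta in deltas:
        buffer += delta
        # Don't attempt parsing until the think block has closed
        if "<think>" in buffer and "</think>" not in buffer:
            continue
        candidate = re.sub(r'<think>.*?</think>', '', buffer, flags=re.DOTALL)
        start = candidate.find('{')
        if start == -1:
            continue  # JSON hasn't started yet
        try:
            # raw_decode succeeds as soon as one complete object is buffered
            obj, _ = json.JSONDecoder().raw_decode(candidate[start:])
            return obj
        except json.JSONDecodeError:
            continue  # JSON still incomplete, keep buffering
    return None

# Simulated R1 token stream: reasoning first, then the JSON payload
tokens = ["<think>", "plan the", " fields", "</think>",
          '{"name": ', '"Alice", ', '"score": 9}']
print(assemble_json_from_stream(tokens))  # {'name': 'Alice', 'score': 9}
```

`raw_decode` is handy here because, unlike `json.loads`, it parses a leading JSON value and ignores trailing text, so it tolerates whatever the model appends after the object.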
Issue 2: Markdown fence wrapping
DeepSeek V3 (and R1 in non-streaming mode) frequently wraps JSON in markdown code blocks even when explicitly asked not to. This is the single most common JSON failure across all models — our benchmark found 95.5% of failures are fences.
```json
{
  "order_id": "ORD-1234",
  "status": "shipped",
  "items": [{"sku": "A1", "qty": 2}]
}
```
```python
import re, json

def parse_llm_json(text: str) -> dict:
    """Parse JSON from LLM output — strips fences and think tags."""
    # 1. Strip <think> blocks (DeepSeek R1)
    text = re.sub(r'<think>.*?</think>', '', text, flags=re.DOTALL)
    # 2. Extract content from fenced blocks if present
    fence_match = re.search(r'```(?:json)?\s*([\s\S]*?)```', text)
    if fence_match:
        text = fence_match.group(1)
    # 3. Parse whatever remains
    return json.loads(text.strip())

content = resp.choices[0].message.content
data = parse_llm_json(content)  # ✅ works on all DeepSeek models
```
Issue 3: Streaming with DeepSeek R1
Streaming DeepSeek R1 via OpenRouter adds another layer: the <think> content arrives as regular content delta tokens before the JSON. If you try to parse mid-stream you'll hit errors on the reasoning text.
```python
from openai import OpenAI
import json, re

client = OpenAI(api_key="...", base_url="https://openrouter.ai/api/v1")

stream = client.chat.completions.create(
    model="deepseek/deepseek-r1",
    messages=[{"role": "user", "content": "Return JSON: {name, score}"}],
    stream=True,
)

buffer = ""
json_started = False
json_chunks = []

for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    buffer += delta

    if not json_started:
        # Drop the <think> block once it closes; wait while it's still open
        if "</think>" in buffer:
            buffer = buffer.split("</think>")[-1]
        elif "<think>" in buffer:
            continue
        # Start collecting from the first '{' or '[' in the buffer
        starts = [i for i in (buffer.find("{"), buffer.find("[")) if i != -1]
        if starts:
            json_started = True
            json_chunks.append(buffer[min(starts):])
    else:
        json_chunks.append(delta)

raw_json = "".join(json_chunks).strip()
raw_json = re.sub(r'```(?:json)?|```', '', raw_json).strip()
data = json.loads(raw_json)  # ✅
```
DeepSeek model behavior reference
| Model | <think> leakage | Fence wrapping | Streaming issues |
|---|---|---|---|
| deepseek/deepseek-r1 | High | Medium | High |
| deepseek/deepseek-r1-distill-llama-70b | High | Medium | Medium |
| deepseek/deepseek-chat (V3) | None | High | Medium |
| deepseek/deepseek-coder-v2 | None | Medium | Low |
Based on plain-prompt testing (no `response_format` or structured-output params). Results vary by provider.
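The table suggests you only need the <think>-stripping pass for the R1 family, while fence unwrapping applies everywhere. A model-keyed cleanup helper along those lines might look like this; `clean_for_model` and its model-ID matching are illustrative assumptions, not part of any DeepSeek API.

```python
import json
import re

THINK_RE = re.compile(r'<think>.*?</think>', re.DOTALL)
FENCE_RE = re.compile(r'```(?:json)?\s*([\s\S]*?)```')

def clean_for_model(model: str, text: str) -> str:
    # Only R1-family models leak reasoning tags (per the table above)
    if "deepseek-r1" in model:
        text = THINK_RE.sub('', text)
    # Every DeepSeek model may fence its JSON, so always unwrap
    fence = FENCE_RE.search(text)
    if fence:
        text = fence.group(1)
    return text.strip()

raw = '<think>format it</think>```json\n{"status": "shipped"}\n```'
print(json.loads(clean_for_model("deepseek/deepseek-r1", raw)))
# {'status': 'shipped'}
```

Skipping the think-strip for V3-class models avoids a regex pass on output that can legitimately contain the literal string `<think>`, e.g. when the model is asked to write about these tags.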
Handle all DeepSeek quirks automatically
StreamFix strips <think> tags and markdown fences in real-time during streaming. Works with any DeepSeek model via OpenRouter — one base_url change.
```python
from openai import OpenAI
import json

client = OpenAI(
    api_key="sk_YOUR_STREAMFIX_KEY",
    base_url="https://streamfix.up.railway.app/v1",
)

# <think> tags and fences stripped automatically
resp = client.chat.completions.create(
    model="deepseek/deepseek-r1",
    messages=[{"role": "user", "content": "Return JSON: {name, score}"}],
)

data = json.loads(resp.choices[0].message.content)  # ✅ always works
```