Fix CrewAI JSON Parse Errors
CrewAI agents return their final answer as raw text in result.raw, and the underlying model often wraps that JSON in markdown fences, returns Python literals like True/None in place of true/null, or produces values that break Pydantic validation. Here's how to fix each issue.
Issue 1: Markdown fences in agent output
When you ask a CrewAI agent to return structured data, the underlying model almost always wraps the JSON in ```json fences. CrewAI's result.raw gives you the full model output including those fences, and json.loads() chokes on the backticks.
```python
from crewai import Agent, Task, Crew
import json

agent = Agent(
    role="Data Analyst",
    goal="Return structured user data as JSON",
    backstory="You are a data analyst that always responds in JSON.",
)
task = Task(
    description="Return user data as JSON with name and age fields",
    agent=agent,
    expected_output="JSON object with name and age",
)
crew = Crew(agents=[agent], tasks=[task])
result = crew.kickoff()

data = json.loads(result.raw)
# json.decoder.JSONDecodeError: Expecting value: line 1, column 1 (char 0)
```
What result.raw actually contains:

````text
```json
{
  "name": "Alice Johnson",
  "age": 32
}
```
````
```python
import re, json

def strip_fences(text: str) -> str:
    """Remove ```json ... ``` wrappers from LLM output."""
    match = re.search(r'```(?:json)?\s*([\s\S]*?)```', text)
    if match:
        return match.group(1).strip()
    return text.strip()

result = crew.kickoff()
cleaned = strip_fences(result.raw)
data = json.loads(cleaned)  # ✅ {"name": "Alice Johnson", "age": 32}
```
Issue 2: Python literals (True / False / None)
Many models — especially code-tuned ones like CodeLlama and DeepSeek Coder — return Python-style booleans and nulls instead of JSON-standard true/false/null. This is especially common when your CrewAI agent's prompt mentions Python or data analysis.
```text
{
  "username": "bob_smith",
  "active": True,
  "verified": False,
  "deleted_at": None
}
```
```python
import json

json.loads(result.raw)
# json.decoder.JSONDecodeError: Expecting value: line 3, column 13 (char 39)
# (because "True" is not valid JSON — it must be "true")
```
```python
import json

def fix_python_literals(text: str) -> str:
    """Convert Python True/False/None to JSON true/false/null."""
    text = text.replace(": True", ": true")
    text = text.replace(": False", ": false")
    text = text.replace(": None", ": null")
    # Also handle values in arrays: [True, False, None]
    text = text.replace("[True", "[true").replace("[False", "[false").replace("[None", "[null")
    text = text.replace(", True", ", true").replace(", False", ", false").replace(", None", ", null")
    return text

result = crew.kickoff()
cleaned = fix_python_literals(result.raw)
data = json.loads(cleaned)  # ✅ {"username": "bob_smith", "active": true, ...}
```
What about ast.literal_eval()? It works for simple Python-literal dicts, but it fails as soon as the output also contains JSON tokens like null, true, or false, so it can't handle mixed Python/JSON output. The string-replacement approach is more reliable for LLM output (with the caveat that it assumes the literals appear as values, not inside string content). See our full guide on Python literal fixes.
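To see where ast.literal_eval() breaks, here's a minimal sketch; the raw string is an invented example of mixed Python/JSON output:

```python
import ast
import json

# Hypothetical mixed output: a Python-style True next to a JSON-style null
raw = '{"active": True, "deleted_at": null}'

# ast.literal_eval rejects it: "null" is not a Python literal
try:
    ast.literal_eval(raw)
except (ValueError, SyntaxError) as exc:
    print(f"literal_eval failed: {exc}")

# Targeted string replacement normalizes both sides to valid JSON
fixed = raw.replace(": True", ": true")
print(json.loads(fixed))  # {'active': True, 'deleted_at': None}
```

The reverse case fails the same way: an output containing a Python None alongside a JSON true is parseable by neither json.loads nor ast.literal_eval on its own.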
Issue 3: Type mismatches breaking Pydantic validation
CrewAI supports Pydantic models for structured output via the output_pydantic parameter on tasks. But models frequently return strings where integers are expected, or integers where strings are expected. The JSON is valid — it just doesn't match the schema.
```python
from crewai import Agent, Task, Crew
from pydantic import BaseModel

class UserProfile(BaseModel):
    name: str
    age: int
    score: float

task = Task(
    description="Return the user profile for Alice",
    agent=agent,
    expected_output="UserProfile JSON",
    output_pydantic=UserProfile,
)
crew = Crew(agents=[agent], tasks=[task])
result = crew.kickoff()
# pydantic_core._pydantic_core.ValidationError: 1 validation error for UserProfile
# age
#   Input should be a valid integer, got string '25' [type=int_parsing, ...]
```
The JSON is valid, but age and score are strings instead of numbers:

```json
{
  "name": "Alice",
  "age": "25",
  "score": "98.5"
}
```
```python
from pydantic import BaseModel, field_validator

class UserProfile(BaseModel):
    name: str
    age: int
    score: float

    @field_validator("age", mode="before")
    @classmethod
    def coerce_age(cls, v):
        if isinstance(v, str):
            return int(v)
        return v

    @field_validator("score", mode="before")
    @classmethod
    def coerce_score(cls, v):
        if isinstance(v, str):
            return float(v)
        return v
```
For models with many numeric fields, you can use a generic pre-validator that coerces all string-encoded numbers automatically:
```python
from pydantic import BaseModel, model_validator

class LLMBaseModel(BaseModel):
    """Base model that coerces string-encoded numbers from LLM output."""

    @model_validator(mode="before")
    @classmethod
    def coerce_numeric_strings(cls, data):
        if not isinstance(data, dict):
            return data
        for field_name, field_info in cls.model_fields.items():
            if field_name in data and isinstance(data[field_name], str):
                if field_info.annotation is int:
                    try:
                        data[field_name] = int(data[field_name])
                    except ValueError:
                        pass
                elif field_info.annotation is float:
                    try:
                        data[field_name] = float(data[field_name])
                    except ValueError:
                        pass
        return data

class UserProfile(LLMBaseModel):
    name: str
    age: int
    score: float

# Now works even when the model returns {"age": "25", "score": "98.5"}
profile = UserProfile.model_validate_json(cleaned_json)  # ✅
```
output_pydantic calls model_validate_json() internally. If you use the LLMBaseModel above as your base class, CrewAI will coerce types automatically without any extra parsing code.
Combining all fixes
In practice you need to handle fences, Python literals, and type coercion together. Here's a complete parsing pipeline for CrewAI agent output:
```python
import re, json

def parse_crewai_json(raw: str) -> dict:
    """Parse JSON from CrewAI agent output, handling common LLM quirks."""
    # 1. Strip markdown fences
    fence = re.search(r'```(?:json)?\s*([\s\S]*?)```', raw)
    text = fence.group(1).strip() if fence else raw.strip()
    # 2. Fix Python literals
    text = text.replace(": True", ": true").replace(": False", ": false").replace(": None", ": null")
    text = text.replace(", True", ", true").replace(", False", ", false").replace(", None", ", null")
    # 3. Remove trailing commas (another common LLM mistake)
    text = re.sub(r',\s*([}\]])', r'\1', text)
    return json.loads(text)

result = crew.kickoff()
data = parse_crewai_json(result.raw)  # ✅ handles all three issues
```
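You can sanity-check the pipeline without spinning up a crew. The snippet below applies the same three steps inline to a synthetic raw string (invented for illustration) that exhibits all three quirks at once:

```python
import re
import json

# Invented example: markdown fences, a Python literal, and a trailing comma
raw = '```json\n{"name": "Alice", "active": True, "score": 98.5,}\n```'

# Same three steps as parse_crewai_json, applied inline
fence = re.search(r'```(?:json)?\s*([\s\S]*?)```', raw)
text = fence.group(1).strip() if fence else raw.strip()
text = text.replace(": True", ": true").replace(": False", ": false").replace(": None", ": null")
text = re.sub(r',\s*([}\]])', r'\1', text)

print(json.loads(text))  # {'name': 'Alice', 'active': True, 'score': 98.5}
```

Running variations of this check (no fences, nested objects, arrays of booleans) is a cheap way to catch regressions before wiring the parser into a crew.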
Handle all CrewAI JSON issues automatically
StreamFix sits between CrewAI and the LLM provider. It strips fences, fixes Python literals, coerces types to match your schema, and removes trailing commas — all before the response reaches your agent. One base_url change, no parsing code needed.
```python
from crewai import Agent, Task, Crew, LLM

llm = LLM(
    model="openai/gpt-4o-mini",
    base_url="https://streamfix.dev/v1",
    api_key="sk_YOUR_STREAMFIX_KEY",
)
agent = Agent(
    role="Data Analyst",
    goal="Return structured user data as JSON",
    llm=llm,
    backstory="You are a data analyst that always responds in JSON.",
)
task = Task(
    description="Return the user profile for Alice",
    agent=agent,
    expected_output="JSON object with name, age, and score",
    output_pydantic=UserProfile,
)
crew = Crew(agents=[agent], tasks=[task])
result = crew.kickoff()

# Fences stripped, Python literals fixed, types coerced — automatically
print(result.pydantic)  # ✅ UserProfile(name='Alice', age=32, score=98.5)
```