Handling Messy LLM JSON: How I Use Pydantic to Fix Broken AI Outputs

Your LLM is Lying About Its JSON. Here’s How I Force it to Tell the Truth.

There is nothing quite as frustrating as watching your beautifully crafted agentic workflow crash because GPT-4 decided to wrap its JSON response in conversational prose. You asked for a schema; it gave you: “Sure! Here is the data you requested: ```json {...}``` Hope this helps!” Or worse, it misses a closing brace, hallucinates a key name, or returns a string where you explicitly demanded an integer. If you’ve spent any time building production-grade AI apps, you know that LLM outputs are inherently unstable. We can’t treat them like predictable REST APIs.

I’ve wasted countless hours writing brittle regex patterns to “find the JSON” inside a string, only to have the next model update break everything. That’s why I moved to a schema enforcement strategy using Pydantic.


The Gatekeeper: Why Pydantic Trumps json.loads

When we receive a raw string from an LLM, a simple json.loads() is a gamble. It only tells you if the string is syntactically valid JSON. It doesn’t tell you if user_id is actually a UUID or if the price field is a positive float.

Pydantic acts as the gatekeeper. By defining a BaseModel, we move from “string-wrangling” to actual type safety. If the LLM sends garbage, Pydantic catches it at the door before it can poison your database or crash your frontend.
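
To make the difference concrete, here is a minimal sketch. The Order model and the payload are hypothetical, made up for illustration; the point is only that json.loads() accepts anything syntactically valid, while a BaseModel also checks the UUID and the positive price.

```python
import json
from uuid import UUID

from pydantic import BaseModel, Field, ValidationError


class Order(BaseModel):  # hypothetical model, for illustration only
    user_id: UUID
    price: float = Field(gt=0)


# Syntactically valid JSON, but wrong on both fields.
payload = '{"user_id": "not-a-uuid", "price": -4.99}'

# json.loads only checks syntax, so this succeeds without complaint.
as_dict = json.loads(payload)

try:
    Order.model_validate_json(payload)
    failed_fields = []
except ValidationError as exc:
    # Each error dict carries a "loc" tuple naming the offending field.
    failed_fields = sorted(err["loc"][0] for err in exc.errors())

print(as_dict)
print(failed_fields)
```

Note that Pydantic reports every failing field in one pass instead of stopping at the first problem, which is handy when you want to feed the full list back to the model.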


The Workflow: From Chaos to Clean Objects

My typical pipeline involves three stages: defining the schema, validating the output, and recovering gracefully from errors.

1. Defining the Schema

Instead of hoping the LLM follows your prompt, you define a strictly typed class. This serves as your “source of truth.”

2. Validating the Output

We use model_validate_json() to attempt the parse. This is far more robust than manual dictionary checking because it handles type coercion (e.g., turning the string “10” into the integer 10) automatically.
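
A small sketch of that coercion, assuming Pydantic v2 and a hypothetical Item model. By default Pydantic validates in its lenient ("lax") mode, which is what turns the string into an integer; passing strict=True switches that behavior off if you would rather reject the mismatch.

```python
from pydantic import BaseModel, ValidationError


class Item(BaseModel):  # hypothetical model, for illustration only
    quantity: int


# Default (lax) mode: the numeric string is coerced to an int.
item = Item.model_validate_json('{"quantity": "10"}')
print(repr(item.quantity))

# Strict mode: the same payload is rejected instead of coerced.
try:
    Item.model_validate_json('{"quantity": "10"}', strict=True)
    strict_accepted = True
except ValidationError:
    strict_accepted = False

print(strict_accepted)
```

Which mode fits depends on how much you trust the model's typing: lax mode forgives harmless quoting mistakes, strict mode surfaces them.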

3. Graceful Recovery

When a ValidationError occurs, I don’t just let the app die. I catch it, log exactly which field failed, and trigger a retry or a fallback.
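
Here is one way that recovery step might look. It is a sketch, not a fixed recipe: the logger name, summarize_errors, and parse_or_fallback are hypothetical helpers, and the model has the same shape as the UserAction class in the code section of this post.

```python
import logging
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError

logger = logging.getLogger("llm_validation")  # hypothetical logger name


class UserAction(BaseModel):
    action_type: str
    confidence_score: float = Field(ge=0, le=1)
    tags: List[str]


def summarize_errors(exc: ValidationError) -> List[str]:
    """Turn a ValidationError into one readable line per failing field."""
    lines = []
    for err in exc.errors():
        # "loc" is a tuple path to the field; it is empty for malformed JSON.
        field = ".".join(str(part) for part in err["loc"]) or "<root>"
        lines.append(f"{field}: {err['msg']} (type={err['type']})")
    return lines


def parse_or_fallback(
    json_str: str, fallback: Optional[UserAction] = None
) -> Optional[UserAction]:
    """Validate the JSON string; log each failing field and return the fallback."""
    try:
        return UserAction.model_validate_json(json_str)
    except ValidationError as exc:
        for line in summarize_errors(exc):
            logger.warning("LLM output rejected: %s", line)
        return fallback


# Score out of range and tags given as a plain string instead of a list.
bad = '{"action_type": "LOGIN_ATTEMPT", "confidence_score": 1.7, "tags": "security"}'
result = parse_or_fallback(bad)
```

The caller decides what the fallback is: None to signal a retry, or a safe default object if the feature can degrade gracefully.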


The Code: Cleaning the Mess

Here is a pattern I use to handle those “chatty” LLM responses that include markdown blocks and extra text.

Python

import re
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError

# 1. Define your structure
class UserAction(BaseModel):
    action_type: str = Field(description="The primary action performed")
    confidence_score: float = Field(ge=0, le=1)
    tags: List[str]

def clean_and_parse_llm_json(raw_response: str) -> Optional[UserAction]:
    # My "in-the-trenches" tip: use a regex to skip the "Sure! Here's the JSON" prose.
    # The pattern is greedy: it grabs everything from the first "{" to the last "}",
    # so it assumes a single JSON object and no stray braces in the surrounding text.
    json_match = re.search(r"\{.*\}", raw_response, re.DOTALL)

    if not json_match:
        raise ValueError("No JSON found in response")

    json_str = json_match.group(0)

    try:
        # 2. Validation and coercion (the string "0.95" becomes the float 0.95)
        validated_data = UserAction.model_validate_json(json_str)
        return validated_data
    except ValidationError as e:
        # Malformed JSON also surfaces here as a ValidationError
        print(f"Validation failed: {e.json()}")
        # This is where I'd trigger retry logic
        return None

# Example of a "messy" LLM response
messy_output = """
I have analyzed the logs. 
{
    "action_type": "LOGIN_ATTEMPT",
    "confidence_score": "0.95", 
    "tags": ["security", "auth"]
}
Note: The user was on a mobile device.
"""

parsed_data = clean_and_parse_llm_json(messy_output)
if parsed_data:
    print(f"Clean Action: {parsed_data.action_type}") # LOGIN_ATTEMPT
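
The regex above spans from the first opening brace to the last closing brace, so a stray brace in the surrounding prose will break it. As a possible alternative, here is a sketch that leans on the standard library's json.JSONDecoder.raw_decode to pull out the first substring that actually parses as a JSON object. The function name extract_first_json_object is my own, not part of any library.

```python
import json
from typing import Optional


def extract_first_json_object(text: str) -> Optional[str]:
    """Return the first substring of text that parses as a JSON object, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # returns the index where that value ends, ignoring what follows.
            _, end = decoder.raw_decode(text, start)
            return text[start:end]
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    return None


chatty = 'Sure! {not json} Here you go: {"a": {"b": 1}} and a stray } brace.'
extracted = extract_first_json_object(chatty)
print(extracted)
```

The extracted string can then be handed to model_validate_json() exactly as before. The trade-off is that a truncated object (a missing closing brace) yields None here, so that case still needs the retry path.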

EEAT Insight: The “Retry” Pattern and Instructor

If you’re still manually writing regex and Pydantic boilerplate, I have a tip from the field: Check out the Instructor library. I’ve started using it almost exclusively because it patches the OpenAI/Anthropic client to return Pydantic objects directly.

More importantly, you need retry logic. When Pydantic raises a ValidationError, don’t give up. I often send the error message back to the LLM:

“Hey, you sent me invalid JSON. Here is the error: [Pydantic Error]. Please fix it.”

In my experience, the model usually fixes it on the second try. Cap the number of attempts so a stubborn failure cannot loop forever, and your production loops become noticeably more robust.
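
A minimal sketch of that feedback loop, under a few assumptions: call_llm is a hypothetical stand-in for whatever client call you use (it takes a prompt string and returns the raw reply), the reply is already bare JSON, and the wording of the correction prompt is just one option. The fake_llm function below exists only so the example runs without a network call.

```python
from typing import Callable, List, Optional

from pydantic import BaseModel, Field, ValidationError


class UserAction(BaseModel):
    action_type: str
    confidence_score: float = Field(ge=0, le=1)
    tags: List[str]


def parse_with_retry(
    prompt: str,
    call_llm: Callable[[str], str],  # hypothetical: prompt in, raw reply out
    max_attempts: int = 2,
) -> Optional[UserAction]:
    """Ask the model, validate the reply, and feed errors back on failure."""
    current_prompt = prompt
    for _ in range(max_attempts):
        raw = call_llm(current_prompt)
        try:
            return UserAction.model_validate_json(raw)
        except ValidationError as exc:
            # Send the previous reply and the exact error back to the model.
            current_prompt = (
                f"{prompt}\n\nYour previous reply was:\n{raw}\n\n"
                f"It failed validation with this error:\n{exc}\n\n"
                "Please reply with corrected JSON only."
            )
    return None  # all attempts failed; the caller picks the fallback


# Stand-in for a real model: the first reply is out of range, the second is fixed.
replies = iter([
    '{"action_type": "LOGIN_ATTEMPT", "confidence_score": 95, "tags": []}',
    '{"action_type": "LOGIN_ATTEMPT", "confidence_score": 0.95, "tags": []}',
])
seen_prompts: List[str] = []


def fake_llm(prompt: str) -> str:
    seen_prompts.append(prompt)
    return next(replies)


action = parse_with_retry("Classify this log line as JSON.", fake_llm)
print(action)
```

Keeping max_attempts small bounds both latency and token cost; if the second attempt also fails, it is usually a sign the prompt or schema needs work rather than another retry.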


Why This Matters for Your App

Using Pydantic isn’t just about clean code; it’s a security and stability feature. By enforcing a schema at the edge of your AI logic, you prevent downstream crashes. Your business logic can assume the data has the right shape, types, and value ranges because Pydantic already verified them (whether the values are factually right is still on the model). It’s the difference between a “prototype” and a “product.”


JSON Sanitization Checklist

Use this before you deploy your next LLM-powered feature:

  • [ ] Is Pydantic defining the schema? (Avoid raw dicts).
  • [ ] Are you using Field constraints? (e.g., ge=0 for numbers).
  • [ ] Do you have a regex extractor? (To strip prose before/after).
  • [ ] Is there a try/except block for ValidationError?
  • [ ] Did you implement at least one “retry” on failure?