Lesson 5: Pydantic for AI-Native Development

Introduction: The AI Trust Problem

AI is powerful, but it's probabilistic, not deterministic. When you ask Claude Code or another LLM to generate JSON, you get a response that looks right but might have subtle issues: a string where you expected an integer, a missing field, an unexpected extra field. These errors don't stay harmless; they crash your production system or quietly corrupt your data.

Here's the harsh reality: Never trust AI output without validation.

This is where Pydantic becomes your safety net. While Chapters 1-4 showed you how to define data structures with Pydantic, this lesson shows you why validation is critical in AI systems and how to build the iterative loop that makes AI-native development reliable: describe your intent → generate output → validate it → if it fails, improve your prompt and try again.

This lesson teaches you to think like an AI-native engineer: validation isn't optional error handling; it's the core of how you work with unpredictable AI systems.


Section 1: Validating LLM Outputs

When you ask Claude Code to generate structured data (a recipe, a user profile, configuration), it returns JSON as text. Your job is to parse that text and validate it against your Pydantic model.

The Validation Workflow

Let's say you want Claude Code to generate a recipe:

from pydantic import BaseModel, ValidationError

class Recipe(BaseModel):
    name: str
    ingredients: list[str]
    steps: list[str]
    prep_time_minutes: int  # Must be an integer (minutes)

You ask Claude Code: "Generate a recipe for chocolate chip cookies as JSON." Claude responds with something like:

{
  "name": "Chocolate Chip Cookies",
  "ingredients": ["2 cups flour", "1 cup sugar", "2 eggs", "chocolate chips"],
  "steps": ["Mix ingredients", "Bake at 350F for 12 minutes"],
  "prep_time_minutes": 30
}

Now comes the validation:

llm_response = '''
{
  "name": "Chocolate Chip Cookies",
  "ingredients": ["2 cups flour", "1 cup sugar", "2 eggs", "chocolate chips"],
  "steps": ["Mix ingredients", "Bake at 350F for 12 minutes"],
  "prep_time_minutes": 30
}
'''

try:
    recipe: Recipe = Recipe.model_validate_json(llm_response)
    print(f"✓ Success! Recipe validated: {recipe.name}")
    print(f"  Prep time: {recipe.prep_time_minutes} minutes")
except ValidationError as e:
    print("✗ Validation failed:")
    for error in e.errors():
        print(f"  Field: {error['loc'][0]}")
        print(f"  Error: {error['msg']}")

Key method: model_validate_json() parses JSON directly from a string and validates it in one step. This is faster and cleaner than parsing with json.loads() then calling Recipe(**data).
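
For comparison, here is a minimal sketch of the two-step version the text mentions; it ends in the same validated object, but splits parsing and validation into separate calls:

import json

# Two-step alternative: parse first, then validate.
data = json.loads(llm_response)        # may raise json.JSONDecodeError on malformed JSON
recipe = Recipe.model_validate(data)   # may raise ValidationError on schema problems

# Recipe.model_validate_json(llm_response) does both in a single call.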

✨ Teaching Tip

Use model_validate_json() whenever you have JSON strings from AI or APIs. It combines parsing and validation, and Pydantic's error messages tell you exactly what went wrong.


Handling Validation Errors

Let's see what happens when Claude generates something invalid:

# LLMs sometimes generate this (a string where an integer is expected)
bad_response = '''
{
  "name": "Cookies",
  "ingredients": ["flour", "sugar"],
  "steps": ["Mix", "Bake"],
  "prep_time_minutes": "30 minutes"
}
'''
# Wrong! "30 minutes" is a string, but the model expects an integer.

try:
    recipe = Recipe.model_validate_json(bad_response)
except ValidationError as e:
    print("Validation Error Details:")
    for error in e.errors():
        location: str = str(error['loc'][0])
        message: str = error['msg']
        print(f"  {location}: {message}")

Output:

Validation Error Details:
prep_time_minutes: Input should be a valid integer [type=int_parsing, input_value='30 minutes', input_type=str]

The error tells you exactly what's wrong: Pydantic expected an integer but got a string. This is actionable feedback—you can now improve your prompt to guide the LLM.

💬 AI Colearning Prompt

Ask your AI: "I'm validating LLM outputs with Pydantic. When validation fails with 'Input should be a valid integer', what does that mean? Show me an example of how to improve the prompt to fix it."

This shows you that error messages are learning tools. They tell you what the LLM got wrong so you can ask for it differently next time.


Section 2: Iterative Refinement Pattern

Here's where AI-native development gets powerful: when validation fails, you don't give up—you iterate.

First Attempt: Vague Prompt

def generate_recipe_attempt_1() -> Recipe | None:
    """First try: vague prompt"""
    prompt: str = "Generate a recipe for chocolate cookies as JSON."

    # In practice, you'd call Claude Code here:
    # llm_response = claude_code.generate(prompt)

    # For demo, simulating what a vague prompt might produce:
    llm_response = '''
    {
      "name": "Chocolate Cookies",
      "ingredients": ["flour", "sugar", "chocolate"],
      "steps": ["Mix", "Bake"],
      "prep_time_minutes": "25 minutes"
    }
    '''

    try:
        return Recipe.model_validate_json(llm_response)
    except ValidationError as e:
        print("❌ First attempt failed:")
        for error in e.errors():
            print(f"  {error['loc'][0]}: {error['msg']}")
        return None

# Result: prep_time_minutes validation fails
generate_recipe_attempt_1()

Why it failed: The prompt didn't specify the format for prep_time_minutes. Claude generated a human-readable string instead of a number.

Second Attempt: Improved Prompt

def generate_recipe_attempt_2() -> Recipe | None:
    """Second try: explicit format requirements"""
    prompt: str = """
    Generate a recipe for chocolate cookies as JSON.

    CRITICAL: prep_time_minutes MUST be an integer (whole number of minutes),
    NOT a string. Example: 30 (not "30 minutes").

    JSON format:
    {
      "name": "Recipe Name",
      "ingredients": ["ingredient1", "ingredient2"],
      "steps": ["step1", "step2"],
      "prep_time_minutes": <integer>
    }
    """

    # Simulating improved response
    llm_response = '''
    {
      "name": "Chocolate Cookies",
      "ingredients": ["2 cups flour", "1 cup sugar", "1 cup butter", "chocolate chips"],
      "steps": ["Cream butter and sugar", "Add eggs", "Mix in flour", "Add chocolate chips", "Bake at 350F"],
      "prep_time_minutes": 25
    }
    '''

    try:
        recipe: Recipe = Recipe.model_validate_json(llm_response)
        print(f"✅ Success! {recipe.name}")
        print(f"  Prep time: {recipe.prep_time_minutes} minutes")
        return recipe
    except ValidationError as e:
        print("❌ Still failing:")
        for error in e.errors():
            print(f"  {error['loc'][0]}: {error['msg']}")
        return None

# Result: ✓ Validation passes!
generate_recipe_attempt_2()

Why this works: By explicitly stating "MUST be an integer" and showing an example (30 not "30 minutes"), you guide the LLM to format the data correctly.

🎓 Instructor Commentary

In AI-native development, validation failures are teaching moments. Each error tells you how to improve your prompt. This cycle—fail, analyze, improve, retry—is how professional AI systems work. You're not "debugging bad code"; you're iteratively refining your specification (prompt) until the AI understands what you want.

🚀 CoLearning Challenge

Ask your AI:

"I need to generate a User profile with fields: username (str), email (str), age (int), is_premium (bool). Generate a sample profile as JSON, then validate it with Pydantic. If validation fails, show me the error and how you'd improve the prompt to fix it."

Expected Outcome: You'll see the full cycle in action—generating, validating, analyzing errors, and iterating on prompts.


Section 3: Error Pattern Analysis

After validating AI outputs for a while, you notice patterns. The same types of errors keep appearing. Understanding these patterns helps you write prompts that prevent failures.

Common LLM Mistakes

Pattern 1: Wrong Data Types

LLM generates: "prep_time_minutes": "30"  (string)
You expect: "prep_time_minutes": 30 (integer)

Prevention: Explicit examples in your prompt
"prep_time_minutes must be an integer. Example: 30 (not '30' or '30 minutes')"

Pattern 2: Missing Fields

LLM generates: {"name": "Cookies", "ingredients": [...]}  (missing "steps")
You expect: All fields required

Prevention: List required fields and show complete example
"All fields required: name, ingredients, steps, prep_time_minutes"

Pattern 3: Unexpected Extra Fields

LLM generates: {"name": "...", "ingredients": [...], "difficulty": "easy", ...}
You expect: Only the fields in your model

Prevention: Use Pydantic's ConfigDict to reject extra fields
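
A minimal sketch of that configuration (StrictRecipe is just an illustrative name):

from pydantic import BaseModel, ConfigDict

class StrictRecipe(BaseModel):
    model_config = ConfigDict(extra="forbid")  # unknown keys now fail validation

    name: str
    ingredients: list[str]
    steps: list[str]
    prep_time_minutes: int

# A response containing an unexpected "difficulty" key now raises ValidationError
# (an "Extra inputs are not permitted" style error) instead of being silently
# ignored, which is Pydantic's default behavior for extra fields.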

Using Field Examples to Guide LLMs

Pydantic's Field() with examples parameter is a powerful hint system:

from pydantic import BaseModel, Field

class Recipe(BaseModel):
    name: str = Field(..., description="Recipe name")
    ingredients: list[str] = Field(
        ...,
        description="List of ingredients"
    )
    steps: list[str] = Field(
        ...,
        description="Cooking steps"
    )
    prep_time_minutes: int = Field(
        ...,
        description="Preparation time in minutes (integer only)",
        examples=[15, 30, 45, 60]  # Show examples!
    )

    model_config = {
        "json_schema_extra": {
            "example": {
                "name": "Chocolate Chip Cookies",
                "ingredients": ["2 cups flour", "1 cup sugar"],
                "steps": ["Mix", "Bake"],
                "prep_time_minutes": 30
            }
        }
    }

When you show this model to an LLM, it sees the examples and is more likely to generate correct data.
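
One practical way to show the model to an LLM is to paste its JSON Schema into your prompt; a minimal sketch:

import json

# The schema includes the field descriptions and examples defined above,
# which is exactly the guidance you want the LLM to see.
schema: str = json.dumps(Recipe.model_json_schema(), indent=2)

prompt: str = f"""
Generate a chocolate chip cookie recipe as JSON.
It must validate against this JSON Schema:

{schema}

Return ONLY the JSON object, no extra text.
"""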

💬 AI Colearning Prompt

Ask your AI: "What are the top 3 validation errors you see when validating LLM-generated outputs? For each error, show me how to prevent it with better prompt engineering or Pydantic Field definitions with examples."

This conversation clarifies how AI systems fail and how to anticipate failures.


Section 4: FastAPI Integration (Overview)

While this chapter doesn't teach FastAPI in depth (that's for the agent framework chapters), you should understand how Pydantic validation happens automatically in FastAPI.

The Pattern

When you build a web API with FastAPI, you define request models as Pydantic classes:

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class RecipeInput(BaseModel):
    name: str
    ingredients: list[str]
    prep_time_minutes: int

@app.post("/recipes/")
async def create_recipe(recipe: RecipeInput) -> dict[str, str]:
    """Create a recipe. FastAPI automatically validates the input."""
    # If validation fails, FastAPI returns a 422 error before your code runs
    # If validation passes, recipe is a valid RecipeInput instance
    return {"message": f"Recipe '{recipe.name}' created!"}

Magic: FastAPI validates the request body against RecipeInput automatically. If someone sends invalid JSON, FastAPI rejects it with a clear error message before your code ever runs.
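
The rejection arrives as a 422 response whose body lists each failing field. The exact shape varies slightly between FastAPI and Pydantic versions, but for a request with a string where an integer belongs it looks roughly like this:

{
  "detail": [
    {
      "type": "int_parsing",
      "loc": ["body", "prep_time_minutes"],
      "msg": "Input should be a valid integer, unable to parse string as an integer",
      "input": "30 minutes"
    }
  ]
}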

You don't write validation code—Pydantic does it for you.

Request Validation

When a user sends a POST request to /recipes/:

{
  "name": "Cookies",
  "ingredients": ["flour", "sugar"],
  "prep_time_minutes": "30 minutes"
}

FastAPI:

  1. Receives the JSON
  2. Validates it against RecipeInput model
  3. If invalid → returns 422 error with helpful message
  4. If valid → deserializes to Python object, calls your function

Response Validation works the same way for outputs. You define a response model:

class RecipeOutput(BaseModel):
    id: int
    name: str
    prep_time_minutes: int

@app.get("/recipes/{id}")
async def get_recipe(id: int) -> RecipeOutput:
    """FastAPI validates that your response matches RecipeOutput"""
    # If you return invalid data, FastAPI catches it
    return RecipeOutput(id=1, name="Cookies", prep_time_minutes=30)

🎓 Instructor Commentary

FastAPI + Pydantic = automatic API contracts. You describe your data model; FastAPI validates every request and response. This is how professional Python APIs work—validation is built in, not bolted on. You'll use this pattern extensively when building agent systems in Parts 5 and beyond.


Section 5: Production Patterns

In production, validation failures are expected. LLMs make mistakes. Networks fail. Users send bad data. Your job is to design systems that handle these failures gracefully.

Pattern 1: Try-Except with Logging

import logging
from typing import TypeVar

from pydantic import BaseModel, ValidationError

logger = logging.getLogger(__name__)
T = TypeVar("T", bound=BaseModel)

def validate_llm_output(json_string: str, model: type[T]) -> T | None:
    """Validate LLM output with logging"""
    try:
        return model.model_validate_json(json_string)
    except ValidationError as e:
        logger.error(f"Validation failed for {model.__name__}")
        for error in e.errors():
            logger.error(f"  Field '{error['loc'][0]}': {error['msg']}")
        return None

Always log validation failures. These logs are gold for understanding what's going wrong with your prompts.
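
Usage is a one-liner. Here's a small sketch, where the raw string stands in for whatever your LLM actually returned:

raw_output: str = '{"name": "Cookies", "ingredients": ["flour"], "steps": ["Mix"], "prep_time_minutes": "soon"}'

recipe: Recipe | None = validate_llm_output(raw_output, Recipe)
if recipe is None:
    # The failure details are already in the logs; decide whether to retry or escalate.
    print("Could not validate the LLM output")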

Pattern 2: Retry with Prompt Improvement

def generate_and_validate_with_retry(
    prompt: str,
    model: type[T],
    max_attempts: int = 3
) -> T | None:
    """Generate AI output with automatic retry and prompt improvement"""

    for attempt in range(max_attempts):
        print(f"Attempt {attempt + 1}/{max_attempts}")

        # In practice, call your AI here; call_claude_code is a placeholder
        # for your own LLM client call.
        llm_response: str = call_claude_code(prompt)

        try:
            result: T = model.model_validate_json(llm_response)
            print(f"✓ Success on attempt {attempt + 1}")
            return result
        except ValidationError as e:
            print(f"✗ Failed: {e.error_count()} errors")

            if attempt < max_attempts - 1:
                # Improve prompt based on errors
                improved_prompt: str = improve_prompt_from_errors(prompt, e)
                prompt = improved_prompt
            else:
                print("✗ Max attempts reached")
                return None

    return None

def improve_prompt_from_errors(original: str, error: ValidationError) -> str:
    """Generate improved prompt based on validation errors"""
    error_details: str = "\n".join([
        f"- {e['loc'][0]}: {e['msg']}"
        for e in error.errors()
    ])

    improved: str = f"""
{original}

IMPORTANT: Fix these validation errors from the previous attempt:
{error_details}

Make sure to return ONLY valid JSON matching the schema exactly.
"""

    return improved

This pattern automatically iterates on your prompt until validation succeeds or you hit the retry limit.
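
A sketch of how you might call it, reusing the Recipe model from Section 1 (call_claude_code inside the helper is still a placeholder for your real LLM call):

recipe: Recipe | None = generate_and_validate_with_retry(
    prompt="Generate a recipe for chocolate cookies as JSON.",
    model=Recipe,
    max_attempts=3,
)

if recipe is not None:
    print(f"Validated recipe: {recipe.name} ({recipe.prep_time_minutes} min)")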

Pattern 3: Fallback to Human Intervention

When AI can't generate valid data after N retries, escalate:

def generate_with_fallback(
    prompt: str,
    model: type[T],
    max_attempts: int = 3
) -> T | None:
    """Try AI generation, fallback to human if all attempts fail"""

    result: T | None = generate_and_validate_with_retry(
        prompt,
        model,
        max_attempts
    )

    if result is None:
        logger.warning(f"AI generation failed for {model.__name__}. Escalating to human.")
        # In production: send alert, queue for manual review, etc.
        return None

    return result

✨ Teaching Tip

In production, validation failures are EXPECTED. Design for them, don't just crash. Always:

  1. Log failures for debugging
  2. Retry with better prompts when possible
  3. Escalate to humans when AI can't fix it
  4. Monitor failure rates to catch systemic problems early

Common Mistakes

Mistake 1: Using AI output without validation

# DON'T DO THIS
recipe_json = call_claude_code("Generate a recipe")
recipe = Recipe(**json.loads(recipe_json)) # Crashes if invalid!

Fix: Always use model_validate_json() with try-except.
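
The safe counterpart, sketched with the same hypothetical call_claude_code helper:

# DO THIS instead
recipe_json = call_claude_code("Generate a recipe")
try:
    recipe = Recipe.model_validate_json(recipe_json)
except ValidationError as e:
    print(f"Validation failed: {e.error_count()} errors")  # log, retry, or escalate
    recipe = None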

Mistake 2: Not giving LLM format examples

# WEAK PROMPT
"Generate a recipe."

# STRONG PROMPT
"Generate a recipe as JSON with exact format:
{
'name': 'string',
'prep_time_minutes': integer (e.g., 30, not '30 minutes')
}"

Mistake 3: Giving up after first failure

AI often succeeds on second or third try with improved prompts. Don't assume failure is permanent.

Mistake 4: Overcomplicating prompts

Start simple. Add detail only when validation fails:

# ITERATION 1 (simple)
"Generate a recipe as JSON."

# ITERATION 2 (add format spec if needed)
"Generate a recipe. prep_time_minutes must be an integer."

# ITERATION 3 (add examples if still failing)
"Generate a recipe. Examples: prep_time_minutes: 30 (not '30 minutes')"

Try With AI

Use your AI companion tool (Claude Code, Gemini CLI, or ChatGPT) to deepen your understanding of AI-native validation patterns.

Prompt 1: Understanding the Validation Loop

Ask your AI:

Explain the "Generate → Validate → Iterate" loop in AI-native development. Why is validation essential rather than optional? What happens if you skip it?

Expected Outcome: Clear explanation of why AI is probabilistic (not deterministic), why validation is critical before using AI output in production, and how iteration improves prompt quality. The answer should emphasize that validation failures are learning opportunities, not bugs.


Prompt 2: Building End-to-End Validation

Tell your AI:

Create a Pydantic model called BlogPost with these fields:

  • title (str)
  • author (str)
  • content (str, at least 100 characters)
  • tags (list[str], 1-5 tags)
  • published_date (str in YYYY-MM-DD format)

Then:

  1. Ask me to generate a sample blog post as JSON
  2. Write code that validates it using Pydantic
  3. Show what happens if the validation fails (e.g., wrong date format)
  4. Explain how to improve the prompt to fix the error

Expected Outcome: Complete BlogPost model with validation, sample AI-generated JSON, working validation code with error handling, and explanation of how to improve prompts based on ValidationError messages. You should see the full cycle: generate → validate → error → fix → retry.


Prompt 3: Analyzing and Fixing Validation Failures

Ask your AI:

You're building a system to validate LLM-generated Recipe models. The model requires prep_time_minutes as an integer. You keep getting this error:

ValidationError: prep_time_minutes: Input should be a valid integer [type=int_parsing, input_value='30 minutes', input_type=str]

Analyze this error. Why is the LLM generating a string instead of an integer? Show me three ways you'd improve the prompt to prevent this. Implement one of them.

Expected Outcome: Analysis of why the LLM misunderstood the requirement (lacking explicit format guidance), three progressively better prompt versions, and implementation of the strongest prompt. You should understand that LLMs respond to clear, example-driven specifications.


Prompt 4: Production-Grade Retry Logic

Tell your AI:

Design a validation strategy for an AI agent that generates structured data 1,000 times per day in production. How would you handle validation failures? What metrics would you track?

Then implement a RetryValidator class that:

  1. Attempts validation up to 3 times
  2. Improves the prompt after each failure
  3. Logs all failures with detailed error information
  4. Returns the valid result or None if all attempts fail

Show an example of how you'd use it in production code.

Expected Outcome: Production-ready retry logic that improves the prompt on each attempt, with structured logging and clear error handling. You'll see how to:

  • Track success/failure rates (metrics)
  • Escalate after max retries
  • Learn from error patterns
  • Build reliability into AI systems
  • Design for expected failures (not treating them as exceptional)

This demonstrates that professional AI-native development treats validation failures as normal and designs for them proactively.