
Introduction to Pydantic and Data Validation

The Validation Problem

Figure: Pydantic validation flow, from raw input data through the BaseModel validation layer (type checking and constraints) to either validated output or a ValidationError.

Imagine you're building an AI agent that accepts user data. A user registers with their name, email, and age. But what happens if someone submits:

{
"name": "Alice",
"email": "not-an-email",
"age": "twenty-five"
}

Your code might crash, silently store invalid data, or worse, pass bad data to your AI, which then generates incorrect responses. The problem: Python's type hints only document what SHOULD be there; they don't enforce what IS actually there at runtime.

This is where Pydantic comes in. Pydantic is a library that validates data at runtime: it checks that your data actually matches your requirements before your code uses it. Type hints say "this SHOULD be an int"; Pydantic makes it "this MUST be an int (or a value that converts cleanly to one), or validation fails."
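To make the difference concrete, here is a minimal sketch that feeds the bad age from the registration example above into two classes with identical type hints. The names PlainUser and User are illustrative, not part of any library.

```python
from dataclasses import dataclass

from pydantic import BaseModel, ValidationError


@dataclass
class PlainUser:
    # Type hints here are documentation only; nothing checks them at runtime.
    name: str
    age: int


class User(BaseModel):
    # The same hints on a BaseModel are validated when an instance is created.
    name: str
    age: int


# The plain dataclass happily stores a string in the "int" field.
plain = PlainUser(name="Alice", age="twenty-five")
print(type(plain.age).__name__)

# The Pydantic model refuses the same input.
rejected = False
try:
    User(name="Alice", age="twenty-five")
except ValidationError:
    rejected = True
print(rejected)
```

The dataclass keeps the string without complaint, while the BaseModel version raises ValidationError before the bad value can reach the rest of your code.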

Why This Matters for AI-Native Development

When Claude Code generates JSON for you, you need to validate that it's correct BEFORE using it. When you build APIs with FastAPI, Pydantic validates the request data you declare with models. When you load configuration files, Pydantic ensures they're valid. In production systems, validation is not optional. It's your safety net.


Section 1: Your First Pydantic Model

Installing Pydantic

Like any Python library, Pydantic needs to be installed first. You've already learned this pattern in Chapter 17 with uv:

uv add pydantic

This installs the latest release, which is Pydantic V2 (the modern version). Pydantic V1 is no longer actively developed, so use V2 for new code.

Creating Your First Model: A Book

Let's start simple. Imagine you're building a library application that stores books. Each book has:

  • title (text, required)
  • author (text, required)
  • year (whole number, required, between 1000 and 2100)
  • price (decimal number, required, must be >= 0)
  • isbn (text, optional)

With Pydantic, you describe this structure in code:

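A minimal sketch of how the Book model might look, using the field names from the list above. It assumes Field(ge=..., le=...) for the numeric ranges and a default of None for the optional isbn.

```python
from typing import Optional

from pydantic import BaseModel, Field


class Book(BaseModel):
    title: str                            # required text
    author: str                           # required text
    year: int = Field(ge=1000, le=2100)   # whole number between 1000 and 2100
    price: float = Field(ge=0)            # decimal number, must be >= 0
    isbn: Optional[str] = None            # optional; defaults to None
```

A field without a default is required. Giving isbn a default of None is what makes it optional.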

That's it. You've created a Pydantic model. Now let's use it.

Validation Happens Automatically

Creating a valid book works exactly as you'd expect:

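As a sketch, assuming a Book model shaped like the field list in this section (the price is an illustrative value):

```python
from typing import Optional

from pydantic import BaseModel, Field


class Book(BaseModel):
    title: str
    author: str
    year: int = Field(ge=1000, le=2100)
    price: float = Field(ge=0)
    isbn: Optional[str] = None


# All required fields are present and within their constraints.
book = Book(title="Dune", author="Frank Herbert", year=1965, price=9.99)
print(book.title)
print(book.model_dump())  # the model as a plain dict

# In Pydantic's default (lax) mode, a numeric string is converted to int.
coerced = Book(title="Dune", author="Frank Herbert", year="1965", price=9.99)
print(type(coerced.year).__name__)
```

The second instance shows type coercion: the string "1965" is a clean representation of an integer, so Pydantic converts it instead of rejecting it.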

But try passing invalid data:

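A sketch of what such a call might look like, again assuming the Book model from this section. The year is a string that cannot be parsed as a number, and the price is negative.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class Book(BaseModel):
    title: str
    author: str
    year: int = Field(ge=1000, le=2100)
    price: float = Field(ge=0)
    isbn: Optional[str] = None


error = None
try:
    # Two problems at once: an unparseable year and a negative price.
    Book(title="Broken", author="Nobody", year="not a year", price=-10.0)
except ValidationError as exc:
    error = exc  # keep a reference; `exc` is cleared when the except block ends
    print(exc)
```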

Output (showing what validation catches):

2 validation errors for Book
year
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='not a year', input_type=str]
price
  Input should be greater than or equal to 0 [type=greater_than_equal, input_value=-10.0, input_type=float]

Pydantic caught BOTH errors at once. This is powerful: you don't have to fix one error only to discover another. You see everything that's wrong.

💬 AI Colearning Prompt

"What happens when you pass a string to an int field in Pydantic? Explain the validation error and what type coercion means."


Section 2: Understanding Validation Errors

Reading ValidationError Messages

Pydantic's error messages are designed to help you. Let's break down what you're seeing:

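One way to inspect an error programmatically is ValidationError.errors(), which returns a list of dicts. The sketch below assumes a trimmed-down Book model and prints the three keys discussed in this section.

```python
from pydantic import BaseModel, Field, ValidationError


class Book(BaseModel):
    title: str
    year: int = Field(ge=1000, le=2100)
    price: float = Field(ge=0)


details = []
try:
    Book(title="Broken", year="not a year", price=-10.0)
except ValidationError as exc:
    details = exc.errors()  # one dict per failed check

for err in details:
    # loc is a tuple path to the field, msg is human-readable, type is a stable code
    print(err["loc"], "|", err["msg"], "|", err["type"])
```

Because type is a short machine-readable code, it is usually a better thing to branch on in code than the human-readable msg.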

This gives you:

  • loc (location): Which field has the problem?
  • msg (message): What's wrong in plain English?
  • type (type of error): Was it a type mismatch? A constraint violation? A format issue?

🎓 Expert Insight

In AI-native development, type hints document intent but Pydantic enforces it. When AI agents generate JSON or APIs send data, runtime validation catches mismatches before they corrupt your system. This isn't defensive programming—it's professional practice.

Multiple Errors at Once

One of Pydantic's superpowers is reporting ALL validation problems simultaneously. This saves debugging time:

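A small sketch with three unrelated problems in one input: a missing title, a year above the allowed range, and a negative price. The Book model here is a trimmed-down assumption based on Section 1.

```python
from pydantic import BaseModel, Field, ValidationError


class Book(BaseModel):
    title: str
    year: int = Field(ge=1000, le=2100)
    price: float = Field(ge=0)


problems = []
try:
    # title is missing, year is too large, price is negative
    Book(year=3000, price=-1.0)
except ValidationError as exc:
    problems = exc.errors()

print(f"{len(problems)} problems found")
for p in problems:
    print(p["loc"][0], "->", p["type"])
```

All three problems arrive in a single ValidationError, so one round of fixes can address everything.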


Section 3: Nested Models

Real Data Is Complex

So far we've created flat models with simple fields. But real data is hierarchical. A Book might have an Author, and an Author has multiple attributes:

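A minimal sketch of the two models. The Author fields (name and an optional bio) are assumptions chosen for illustration.

```python
from typing import Optional

from pydantic import BaseModel, Field


class Author(BaseModel):
    name: str                   # required
    bio: Optional[str] = None   # optional


class Book(BaseModel):
    title: str
    authors: list[Author]       # every item must validate as an Author
    year: int = Field(ge=1000, le=2100)
```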

Notice authors: list[Author]—this is a list of Author models. Pydantic validates each Author in the list.

Using Nested Models

Creating a book with authors:

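As a sketch, nested values can be passed either as Author instances or as plain dicts, which Pydantic converts for you. The Author and Book definitions below are the same illustrative assumptions as before.

```python
from typing import Optional

from pydantic import BaseModel, Field


class Author(BaseModel):
    name: str
    bio: Optional[str] = None


class Book(BaseModel):
    title: str
    authors: list[Author]
    year: int = Field(ge=1000, le=2100)


book = Book(
    title="Good Omens",
    authors=[
        Author(name="Terry Pratchett"),  # an Author instance
        {"name": "Neil Gaiman"},         # a dict, validated into an Author
    ],
    year=1990,
)

for author in book.authors:
    print(type(author).__name__, author.name)
```

The dict form is what you typically get from parsed JSON, which is why nested validation fits API and AI-generated data so well.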

Validation happens at all levels. If an Author's name is missing, Pydantic catches it:

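A sketch of a nested failure, using the same illustrative Author and Book models. The first author dict has a bio but no name.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class Author(BaseModel):
    name: str
    bio: Optional[str] = None


class Book(BaseModel):
    title: str
    authors: list[Author]
    year: int = Field(ge=1000, le=2100)


nested_errors = []
try:
    Book(title="Untitled", authors=[{"bio": "No name given"}], year=2020)
except ValidationError as exc:
    nested_errors = exc.errors()

for err in nested_errors:
    # loc is a path: field name, then list index, then the nested field name
    print(err["loc"], err["type"])
```

The loc tuple works like a path into the data, so you can tell exactly which author in the list is missing a name.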

🤝 Practice Exercise

Ask your AI: "Create an Author model with name and bio fields. Then create a Book model that contains a single author field (not a list—just one Author). Generate code that creates a Book with a nested Author and demonstrates the validation error when author data is missing."

Expected Outcome: You'll see working nested model structure and understand how Pydantic validates nested fields, catching missing required fields at any level of nesting.


Section 4: Common Mistakes

Mistake 1: Forgetting BaseModel

Pydantic models must inherit from BaseModel:

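A sketch of the difference. PlainBook is a hypothetical class that has the annotations but not the BaseModel parent.

```python
from pydantic import BaseModel


class PlainBook:
    # Missing BaseModel: these annotations are only hints, with no validation
    # and no generated constructor.
    title: str
    year: int


class Book(BaseModel):
    title: str
    year: int


plain_error = None
try:
    PlainBook(title="Dune", year=1965)
except TypeError as exc:
    # A plain class has no __init__ that accepts these keyword arguments.
    plain_error = str(exc)
print(plain_error)

# With BaseModel, the same call works and the data is validated.
book = Book(title="Dune", year=1965)
print(book)
```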

Mistake 2: Not Handling ValidationError

If you don't catch ValidationError, the exception propagates and your program stops with a traceback:

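One possible pattern is to wrap validation in a small helper that reports the problems and returns None instead of crashing. The helper name parse_book is hypothetical, and Book is a trimmed-down assumption.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class Book(BaseModel):
    title: str
    year: int = Field(ge=1000, le=2100)


def parse_book(data: dict) -> Optional[Book]:
    """Return a validated Book, or None if the data is invalid."""
    try:
        return Book.model_validate(data)
    except ValidationError as exc:
        for err in exc.errors():
            print(f"{err['loc']}: {err['msg']}")
        return None


good = parse_book({"title": "Dune", "year": 1965})
bad = parse_book({"title": "Dune", "year": "soon"})
print(good)
print(bad)
```

Whether to return None, re-raise, or send an error response depends on your application. The point is to decide deliberately where invalid data gets handled.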

Mistake 3: Mixing Up Type Hints

Type hints must be precise. A bare list accepts items of any type, while list[str] tells Pydantic to validate every element as a string:

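A sketch comparing the two hints. LooseTags and TypedTags are hypothetical model names used only for this comparison.

```python
from pydantic import BaseModel, ValidationError


class LooseTags(BaseModel):
    tags: list        # any list passes; the items are not checked


class TypedTags(BaseModel):
    tags: list[str]   # each item must be a string


# Mixed content is accepted because the items are never inspected.
loose = LooseTags(tags=["fiction", 42, None])
print(loose.tags)

item_errors = []
try:
    TypedTags(tags=["fiction", 42])
except ValidationError as exc:
    item_errors = exc.errors()

for err in item_errors:
    print(err["loc"], err["type"])
```

Note that Pydantic V2 does not convert the integer 42 into the string "42", so the second item fails instead of being silently coerced.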


Try With AI

Apply Pydantic data validation through AI collaboration that builds type-safe application skills.

🔍 Explore Validation Pain:

"Compare manual validation for user registration (username 3-20 chars, email with @, age 13-120) versus Pydantic BaseModel with Field() constraints. Show why runtime validation matters beyond type hints."

🎯 Practice Field Constraints:

"Build a User model with Pydantic validating: username (pattern r'^[a-z0-9_]+$'), email (@field_validator for domain check), age (ge=13, le=120), optional bio (max 200 chars). Handle ValidationError."

🧪 Test Edge Cases:

"Test Pydantic model with: '25' (string as int), 'test@localhost' (no domain dot), 120.5 (float as int), 201-char bio. Show how Pydantic coerces types and where custom validators are needed."

🚀 Apply Production Patterns:

"Create a complete user validation system with Pydantic showing: all errors at once (not first-fail), clear error messages, type coercion (str → int), custom validators, and explain when to use Field() vs @field_validator."