
Capstone: Robust CSV Parser

You've learned the foundations of exception handling across four lessons. Now it's time to put everything together in a realistic project. In this capstone, you'll build a CSV file parser that reads user data, validates each record, and handles multiple error scenarios gracefully. This project integrates all the exception handling concepts you've learned—try/except/else/finally blocks, custom exceptions, strategic error recovery, and testing your error handling.

By the end of this lesson, you'll have a working program that demonstrates professional-quality error handling. You'll understand how to think defensively about code, anticipate problems, and build systems that don't crash when things go wrong.

Project Specification

Your mission is to build a Python program that reads a CSV file and validates user records. Here's what your parser must do:

Input File Format (CSV with three columns):

name,age,email
Alice,28,[email protected]
Bob,thirty-five,[email protected]
Charlie,45,charlie@invalid

Validation Rules:

  1. Name: Must be non-empty string
  2. Age: Must be an integer between 0 and 150 (inclusive)
  3. Email: Must contain '@' symbol

Error Handling Requirements:

  1. FileNotFoundError: File doesn't exist → Tell the user which path was tried
  2. ValueError: Malformed data in a row → Log the error, skip the row, continue processing
  3. PermissionError: Can't read file → Tell the user it is a permissions issue
  4. General errors: Unexpected issues → Report what happened

Output: Report summary showing:

  • Total rows processed
  • Rows successfully validated
  • Rows skipped due to errors (with reasons)
  • Log entries for debugging

Success Criteria:

  • Parser never crashes, even with bad data
  • Every error scenario produces helpful feedback
  • Summary report shows what happened
  • User can understand what went wrong and potentially fix it

Planning Your Error Handling Strategy

Before diving into code, let's think strategically about where errors could occur and how to handle them.

Where Errors Happen:

  • Opening the file: FileNotFoundError, PermissionError
  • Reading each line: Malformed CSV structure (rare, but possible)
  • Validating age field: ValueError when int() fails on non-numeric string
  • Validating email: No error from validation logic itself (just checking for '@')

Error Handling Strategy by Scenario:

Error             | Root Cause             | Strategy        | Action
FileNotFoundError | File path incorrect    | Report and exit | Tell user where to find file
PermissionError   | User lacks read rights | Report and exit | Tell user to check permissions
ValueError (age)  | Non-numeric age value  | Skip row        | Log the bad value, continue
Invalid email     | No '@' symbol          | Skip row        | Log which email was invalid, continue

Notice the pattern: fatal errors (file access) stop the program early; data validation errors skip just that row and continue.

💬 AI Colearning Prompt

"Walk me through what errors could happen when reading a CSV file. Where would try/except blocks go in my parser?"

Think about this: errors in opening the file are different from errors in validating individual rows. One stops everything; the other should let you continue.

Implementation Strategy

We'll build the parser step-by-step, starting simple and adding error handling for each scenario. The key is testing each error case as you go.

Step 1: Validation Functions

First, let's create functions that validate each field and raise exceptions when needed:
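A minimal sketch of such validators follows. The function names come from the prompts later in this lesson, and the message wording is modeled on the expected output in the testing section. The whitespace stripping and exact messages are assumptions, not necessarily what the lesson's own listing does.

```python
def validate_name(name: str) -> str:
    """Return the stripped name, or raise ValueError if it is empty."""
    cleaned = name.strip()
    if not cleaned:
        raise ValueError("Name must be a non-empty string")
    return cleaned


def validate_age(age_str: str) -> int:
    """Convert age_str to an int in the range 0-150, or raise ValueError."""
    try:
        age = int(age_str)
    except ValueError:
        # Re-raise with a message that names the offending value.
        raise ValueError(f"Invalid age: {age_str}") from None
    if not 0 <= age <= 150:
        raise ValueError(f"Age must be 0-150, got {age}")
    return age


def validate_email(email: str) -> str:
    """Return the stripped email, or raise ValueError if it has no '@'."""
    cleaned = email.strip()
    if "@" not in cleaned:
        raise ValueError(f"Email must contain '@', got: {cleaned}")
    return cleaned
```

Each function returns the cleaned value on success, so a caller can write `age = validate_age(row["age"])` and rely on an exception for every failure.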


🎓 Expert Insight

Notice that validation functions raise exceptions rather than returning success/failure flags. This is idiomatic Python: a function either returns the validated data or raises an exception, so callers never have to inspect a status flag.

Step 2: CSV Parser with Error Handling

Now let's build the parser that reads the file and validates each row:
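One way to sketch this step is shown below. The result shape (`valid`, `invalid`, `total`) follows the prompt in the Try With AI section. `validate_row` is a hypothetical, condensed stand-in for the Step 1 validators, included only so the block runs on its own. Using `csv.DictReader` is also an assumption about how the file is read.

```python
import csv


def validate_row(row: dict) -> dict:
    """Condensed stand-in for the Step 1 validators; raises ValueError."""
    name = (row.get("name") or "").strip()
    if not name:
        raise ValueError("Name must be a non-empty string")
    age_str = (row.get("age") or "").strip()
    try:
        age = int(age_str)
    except ValueError:
        raise ValueError(f"Invalid age: {age_str}") from None
    if not 0 <= age <= 150:
        raise ValueError(f"Age must be 0-150, got {age}")
    email = (row.get("email") or "").strip()
    if "@" not in email:
        raise ValueError(f"Email must contain '@', got: {email}")
    return {"name": name, "age": age, "email": email}


def parse_csv_file(filename: str) -> dict:
    """Parse filename and return valid records, row errors, and a total.

    FileNotFoundError and PermissionError are deliberately not caught
    here, so the caller can treat them as fatal.
    """
    results = {"valid": [], "invalid": [], "total": 0}
    with open(filename, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # start=2 because row 1 of the file is the header line.
        for row_num, row in enumerate(reader, start=2):
            results["total"] += 1
            try:
                record = validate_row(row)
            except ValueError as err:
                # Recoverable: record the reason and move on to the next row.
                results["invalid"].append(f"Row {row_num}: {err}")
            else:
                results["valid"].append(record)
    return results
```

The `try` sits inside the loop, so one bad row never prevents later rows from being read, while file-access errors propagate out of the function untouched.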


🚀 CoLearning Challenge

Ask your AI Co-Teacher:

"Review my CSV parser. What error scenarios am I NOT handling? What edge cases might break my code?"

Expected Outcome: You'll discover edge cases like empty files, lines with trailing spaces, or unusual CSV formatting. This is how professionals build robust systems—they think about what could go wrong.

Step 3: Main Program with Complete Error Handling

Here's the complete program that ties everything together:
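The sketch below shows one possible top level. To keep the block runnable on its own, `main` takes the parser as a `parse` argument, which is an assumption of this sketch; in a full program you would pass your own `parse_csv_file`. The report layout and messages are modeled on the sample output in this lesson, and the wording of the unexpected-error message is invented for illustration.

```python
SEPARATOR = "=" * 50


def print_summary(results: dict) -> None:
    """Print a report modeled on the lesson's sample output."""
    print(SEPARATOR)
    print("CSV PARSING SUMMARY")
    print(SEPARATOR)
    print(f"Total rows processed: {results['total']}")
    print(f"Valid records: {len(results['valid'])}")
    print(f"Invalid records: {len(results['invalid'])}")
    if results["valid"]:
        print("\nValid Records:")
        for rec in results["valid"]:
            print(f"- {rec['name']} ({rec['age']}) {rec['email']}")
    if results["invalid"]:
        print("\nErrors Encountered:")
        for message in results["invalid"]:
            print(f"- {message}")
    print()
    print(SEPARATOR)


def main(filename: str, parse) -> int:
    """Run parse(filename) and report the outcome; return an exit code."""
    try:
        results = parse(filename)
    except FileNotFoundError:
        print(f"ERROR: File not found: {filename}. Check the file path and try again.")
        print("Stopping. Please check the file path and try again.")
        return 1
    except PermissionError:
        print(f"ERROR: Permission denied reading {filename}. Check file permissions.")
        print("Stopping. Please check file permissions and try again.")
        return 1
    except Exception as err:
        # Last resort: report unexpected failures instead of a raw traceback.
        print(f"ERROR: Unexpected problem: {type(err).__name__}: {err}")
        return 1
    else:
        # Runs only when parsing raised nothing.
        print_summary(results)
        return 0
```

The specific handlers come before the general `except Exception`, because Python uses the first matching clause. A script could end with `sys.exit(main("users.csv", parse_csv_file))` after importing `sys`.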


Output (with valid data):

==================================================
CSV PARSING SUMMARY
==================================================
Total rows processed: 3
Valid records: 3
Invalid records: 0

Valid Records:
- Alice (28) [email protected]
- Bob (35) [email protected]
- Charlie (45) [email protected]

==================================================

✨ Teaching Tip

Use Claude Code to test your parser with different inputs. Ask: "Create sample CSV files with valid data, malformed data, permission errors, and missing files. Test my parser against each."

Notice the error-handling hierarchy:

  • Top level (main()) catches fatal errors (file access)
  • Middle level (parse_csv_file()) processes data and logs row errors
  • Bottom level (validation functions) validate individual fields

Testing Your Parser

Professional programmers test error handling as rigorously as happy-path scenarios. Let's build test data for each error case.

Test Case 1: Valid Data

Create a file called users.csv:

name,age,email
Alice,28,[email protected]
Bob,35,[email protected]
Charlie,45,[email protected]

Expected output:

Total rows processed: 3
Valid records: 3
Invalid records: 0

Test Case 2: Missing File

Try running with a non-existent file:
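If you want to see the missing-file path in isolation, a small stand-alone check like the one below can help. `try_open` is a hypothetical helper for this sketch, not part of the lesson's parser; with your own program you would call its main function with the same filename.

```python
def try_open(filename: str) -> bool:
    """Try to open filename; report a missing file instead of crashing."""
    try:
        with open(filename, encoding="utf-8") as f:
            f.readline()
    except FileNotFoundError:
        print(f"ERROR: File not found: {filename}. Check the file path and try again.")
        print("Stopping. Please check the file path and try again.")
        return False
    return True


# Assumes no file with this name exists in the current directory.
try_open("nonexistent.csv")
```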


Expected error:

ERROR: File not found: nonexistent.csv. Check the file path and try again.
Stopping. Please check the file path and try again.

Test Case 3: Malformed Data

Create users_bad_data.csv:

name,age,email
Alice,twenty-eight,[email protected]
Bob,,[email protected]
Charlie,150,charlie-at-example.com
Dave,35,[email protected]

Expected output:

Total rows processed: 4
Valid records: 1
Invalid records: 3
Errors Encountered:
- Row 2: Invalid age: twenty-eight
- Row 3: Age must be 0-150, got
- Row 4: Email must contain '@', got: charlie-at-example.com

Test Case 4: File Permissions (Advanced)

On Linux/Mac, you can test permissions:

# Create a file
echo "name,age,email" > restricted.csv

# Remove read permissions
chmod 000 restricted.csv

# Run your parser (will get PermissionError)
# Restore permissions when done
chmod 644 restricted.csv

Expected error:

ERROR: Permission denied reading restricted.csv. Check file permissions.
Stopping. Please check file permissions and try again.

Debugging Common Issues

When your parser doesn't work as expected, use this checklist:

Issue: Parser crashes with uncaught exception

  • Check: Did you add try/except around the file open?
  • Fix: Wrap file operations in try/except for FileNotFoundError and PermissionError

Issue: Valid records missing even though validation logic looks correct

  • Check: Are you raising exceptions in validation functions when rules fail?
  • Fix: Use raise ValueError() to signal validation failures

Issue: Error messages aren't helpful

  • Check: Are you including context (row number, field name, actual value)?
  • Fix: Build error messages like f"Row {row_num}: Invalid age: {age_str}"

Issue: Parser stops on first error instead of continuing

  • Check: Where are the try/except blocks? Is row validation inside the loop?
  • Fix: Wrap individual row processing in try/except; only let fatal errors escape
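The last issue is easiest to see side by side. The sketch below uses two hypothetical functions that differ only in where the `try` block sits. They work on a plain list of age strings rather than a file, to keep the example short.

```python
def parse_ages_stop_early(values):
    """try wraps the whole loop: the first bad value ends processing."""
    ages = []
    try:
        for value in values:
            ages.append(int(value))
    except ValueError:
        print(f"Stopped at bad value: {value!r}")
    return ages


def parse_ages_keep_going(values):
    """try sits inside the loop: a bad value skips only that item."""
    ages, errors = [], []
    for position, value in enumerate(values, start=1):
        try:
            ages.append(int(value))
        except ValueError:
            errors.append(f"Item {position}: Invalid age: {value}")
    return ages, errors


sample = ["28", "thirty-five", "45"]
print(parse_ages_stop_early(sample))  # [28]
print(parse_ages_keep_going(sample))  # ([28, 45], ['Item 2: Invalid age: thirty-five'])
```

In the first version the exception leaves the loop before the handler runs, so "45" is never reached. In the second, the handler runs inside the iteration and the loop carries on.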

Extending Your Parser

Once your basic parser works, here are ways to extend it:

Extension 1: Support Multiple File Formats

  • Read JSON files instead of (or in addition to) CSV
  • Detect format by file extension
  • Create separate validation for each format
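As a starting point for this extension, format detection can be sketched as a lookup from file extension to loader function. All names here (`load_csv`, `load_json`, `LOADERS`, `load_records`) are hypothetical, and the sketch assumes a JSON file holds a list of record objects.

```python
import csv
import json
from pathlib import Path


def load_csv(path: Path) -> list:
    """Read a CSV file with a header row into a list of dicts."""
    with path.open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def load_json(path: Path) -> list:
    """Read a JSON file that is assumed to contain a list of records."""
    with path.open(encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError("JSON file must contain a list of records")
    return data


# Maps a lowercase file extension to the function that reads it.
LOADERS = {".csv": load_csv, ".json": load_json}


def load_records(filename: str) -> list:
    """Pick a loader by extension; raise ValueError for unknown formats."""
    path = Path(filename)
    loader = LOADERS.get(path.suffix.lower())
    if loader is None:
        raise ValueError(f"Unsupported file format: {path.suffix or '(none)'}")
    return loader(path)
```

CSV values arrive as strings while JSON values may already be numbers, so the validation step has to accept both. `json.JSONDecodeError` is a subclass of `ValueError`, so a malformed JSON file can be caught by the same kind of handler as other data errors.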

Extension 2: Implement Retry Logic

  • If file is temporarily locked, retry 3 times before giving up
  • Add exponential backoff between retries
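A rough sketch of the retry idea follows. It assumes a temporarily locked file shows up as `PermissionError`, which depends on the platform. The `opener` and `sleep` parameters are hypothetical hooks added so the function can be exercised without a real locked file.

```python
import time


def open_with_retry(filename, attempts=3, base_delay=0.5,
                    opener=open, sleep=time.sleep):
    """Open filename, retrying on PermissionError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return opener(filename)
        except PermissionError:
            if attempt == attempts:
                raise  # Out of retries: let the caller treat it as fatal.
            # Delay doubles each time: base_delay, then 2x, then 4x, ...
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed; retrying in {delay:.1f}s")
            sleep(delay)
```

`FileNotFoundError` is not retried, since a missing file is unlikely to appear on its own. Only the final failure is re-raised, so the surrounding code can keep its existing `except PermissionError` handler.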

Extension 3: Advanced Error Recovery

  • For invalid age, suggest valid range in error message
  • For invalid email, suggest what valid email looks like
  • Let user correct errors and re-validate
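The first two bullets could be approached with a custom exception that carries a hint alongside the problem. The `ValidationError` class and its fields are hypothetical names for this sketch, and the hint wording is only an example. Interactive correction (the third bullet) is not covered here.

```python
class ValidationError(ValueError):
    """ValueError that also carries the field name and a hint for fixing it."""

    def __init__(self, field, value, problem, hint):
        super().__init__(f"{problem}: {value!r}. {hint}")
        self.field = field
        self.value = value
        self.hint = hint


def validate_age(age_str):
    """Return the age as an int, or raise ValidationError with a hint."""
    try:
        age = int(age_str)
    except ValueError:
        raise ValidationError("age", age_str, "Invalid age",
                              "Enter a whole number from 0 to 150, such as 28.") from None
    if not 0 <= age <= 150:
        raise ValidationError("age", age_str, "Age out of range",
                              "The valid range is 0 to 150.")
    return age


def validate_email(email):
    """Return the email, or raise ValidationError showing a valid shape."""
    if "@" not in email:
        raise ValidationError("email", email, "Invalid email",
                              "A valid address looks like name@example.com.")
    return email
```

Because `ValidationError` subclasses `ValueError`, any existing `except ValueError` handler in the row loop still catches it. Code that wants the extra detail can read `err.field` and `err.hint`.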

Extension 4: Logging and Audit Trail

  • Write all validation errors to a log file
  • Include timestamp for each error
  • Create audit trail of what was processed
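Python's standard `logging` module covers most of this extension. The sketch below is one possible setup: the logger name `csv_parser`, the function names, and the log format are assumptions, and `results` is assumed to be a dict with `valid`, `invalid`, and `total` keys as in the Try With AI prompt.

```python
import logging


def build_error_logger(log_path="parser.log"):
    """Return a logger that appends timestamped lines to log_path."""
    logger = logging.getLogger("csv_parser")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # Avoid adding duplicate handlers on repeat calls.
        handler = logging.FileHandler(log_path, encoding="utf-8")
        # %(asctime)s puts a timestamp at the start of every entry.
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def log_results(logger, filename, results):
    """Write one audit line for the file, then one line per row error."""
    logger.info("Processed %s: %d rows, %d valid, %d invalid",
                filename, results["total"],
                len(results["valid"]), len(results["invalid"]))
    for message in results["invalid"]:
        logger.warning("%s", message)
```

The INFO line records what was processed, and the WARNING lines record each skipped row, so the file serves as both error log and audit trail.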

Challenge Exercise: Pick one extension above and implement it. Use Claude Code to help design the additional exception handling needed.


Try With AI

Build a production-grade CSV parser integrating all Chapter 26 exception handling concepts.

🔍 Explore Error Architecture:

"Design CSV parser error handling for reading name, age, email columns with validation (name non-empty, age 0-150, email has @). Categorize errors as fatal (FileNotFoundError, PermissionError: stop program) vs recoverable (ValueError: skip row). Explain why file errors are caught outside the loop and validation errors inside it."

🎯 Practice Fatal vs Recoverable:

"Implement CSV parser architecture. Handle FileNotFoundError (show filename, exit), PermissionError (retry 3 times with delay), and ValueError (skip row, log error with row number). Show outer try/except for file operations and inner try/except for row validation."

🧪 Test Test-Driven Error Handling:

"Write pytest tests BEFORE implementing parser: (1) Valid data expects 2 records/0 errors, (2) Mixed data expects 1 valid/2 errors with row numbers, (3) Missing file expects FileNotFoundError. Show assertions checking result dict structure and error message content."

🚀 Apply Production Features:

"Build complete parser with validation functions (validate_name, validate_age, validate_email raising ValueError), parse_csv_file returning {'valid': [...], 'invalid': [...], 'total': int}, and main function with summary report. Extend with: retry logic, error logging to parser.log, graceful degradation for all-invalid data, and reflection integrating all Chapter 26 concepts."



Safety and Ethics Note

When working with file operations and user data:

  • Never expose file paths or system errors to end users in production (log them, but show friendly messages)
  • Validate input data carefully—malformed CSV files can cause issues if not handled properly
  • Consider data privacy—if CSV contains sensitive information, handle it carefully and don't expose details in error messages

Next Steps

You've now completed all five lessons on exception handling. You've learned to:

  1. Understand what exceptions are and how try/except prevents crashes
  2. Use multiple except blocks, else, and finally for sophisticated error handling
  3. Write functions that raise custom exceptions for domain-specific errors
  4. Apply error handling strategies (retry, fallback, graceful degradation, logging)
  5. Build a realistic project integrating all these concepts

From here, Chapter 27 (IO & File Handling) builds directly on this foundation—exception handling is the primary tool for safe file operations. You're ready for that next step.