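Getting Started with Pydantic: Data Validation in Python
Learn Pydantic, a Python library that uses type hints for data validation and parsing. See how to define models, validate data, handle nested structures, and integrate with web frameworks such as FastAPI.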
1 Welcome to Pydantic!
Ever found yourself wading through incoming data—maybe from an API, a database, or some user input—and thinking, "Is this even the right shape?" Pydantic steps in to tidy up that particular mess. It's a Python library that, frankly, makes data validation and parsing almost pleasurable.
Think of it as a meticulous gatekeeper for your data. You tell Pydantic what your data should look like using standard Python type hints, and it rigorously checks everything arriving at the gate. If something doesn’t match, it throws a polite, yet firm, error. This isn't just about catching mistakes early; it’s about making your code more robust, readable, and generally less prone to unexpected tantrums down the line.
- Type Safety: Enforces expected data types, reducing runtime errors.
- Automatic Validation: Converts raw data into validated objects effortlessly.
- Serialization: Easily convert models back to dictionaries or JSON.
- Great for APIs: A cornerstone for frameworks like FastAPI, handling request and response models.
Ready to make your data behave? The next step is getting Pydantic installed.
2 Getting Pydantic Installed
Before Pydantic can start its good work, you need to bring it into your project. If you've been working with FastAPI, chances are Pydantic is already lurking somewhere in your dependencies, as FastAPI relies on it quite heavily under the hood. However, for a standalone project, a simple pip install does the trick.
First, make sure you're inside your project's virtual environment. If you need a refresher on setting one up, head back a step in your usual FastAPI tutorials—it's always a good habit. Once your environment is active, run this command:
pip install pydantic
Once that's done, Pydantic is ready to go. You won't see a parade, but trust me, it's there, waiting patiently to bring order to your Python objects.
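If you'd like some reassurance that the install worked, a quick check along these lines should do. It is only a sketch: the variable name installed_version is just for illustration, and the exact version string depends on what pip resolved in your environment.

```python
# Quick sanity check that Pydantic is importable in the active environment.
import pydantic

installed_version = pydantic.VERSION  # a version string such as "2.x.y"
print(f"Pydantic version: {installed_version}")
```

The examples in this tutorial use the Pydantic V2 API (model_dump and friends), so a version starting with 2 is what you want to see.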
3 Your First Model - It's Like a Class, But Better
At its core, a Pydantic model looks much like a Python data class with a validation layer added. You define a class that inherits from pydantic.BaseModel, then use standard Python type hints for each attribute. Pydantic takes those hints and turns them into validation rules. By default it validates in lax mode, converting compatible input where it safely can (the string "30" becomes the integer 30, for example) and rejecting everything else.
Let's create a simple User model. This model will expect a name (a string) and an age (an integer). If you give it something that can't be interpreted as those types, Pydantic will politely (or not so politely, depending on your perspective) tell you it's a no-go.
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int

# Now you have a blueprint for User data.
See? It looks just like a regular Python class. The magic truly happens when you try to create an instance of it.
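One detail worth seeing in code before moving on: by default Pydantic validates in lax mode, so a numeric string passed for an int field is converted rather than rejected. The sketch below assumes Pydantic V2 and uses the strict=True argument of model_validate to switch that conversion off for a single call. The variable names are illustrative only.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int


# Default (lax) mode: the numeric string "30" is converted to the int 30.
coerced = User(name="Alice", age="30")
print(type(coerced.age).__name__, coerced.age)

# Strict mode for a single validation call: no string-to-int conversion.
try:
    User.model_validate({"name": "Alice", "age": "30"}, strict=True)
    strict_rejected = False
except ValidationError:
    strict_rejected = True

print(f"Strict mode rejected the string: {strict_rejected}")
```

Whether lax or strict is the better fit depends on where the data comes from. Lax mode is convenient for form or query-string input, where everything arrives as text.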
4 Validation in Action - Catching the Nasty Bits
Defining a model is only half the fun. The real show begins when you feed it some data. Pydantic will gobble it up, check it against your defined types, and either hand you back a pristine object or, well, spit it out with a ValidationError.
Let's see our User model handle both good and bad data. We'll start with a perfectly valid user, then try to sneak in an invalid age—say, a string instead of an integer. Pydantic won't be fooled.
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int


# This will work perfectly
valid_user = User(name="Alice", age=30)
print(f"Valid User: {valid_user}")
# Output: Valid User: name='Alice' age=30

print("\n--- Trying invalid data ---")
try:
    invalid_user = User(name="Bob", age="twenty")
except ValidationError as e:
    print(f"Caught an error! \n{e}")
    # Output will show a ValidationError for the age field, with a message like
    # "Input should be a valid integer, unable to parse string as an integer"
Notice how Pydantic not only detects the type mismatch but also gives you a rather helpful error message. This feedback loop is golden for debugging and building robust applications.
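The printed message is meant for humans, but the same information is available in structured form through the exception's errors() method. Here is a minimal sketch, assuming Pydantic V2, of how you might inspect it. The name error_list is just for illustration.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int


try:
    User(name="Bob", age="twenty")
except ValidationError as e:
    error_list = e.errors()  # a list of dicts, one per failed field
    for err in error_list:
        # "loc" is a tuple path to the field, "type" a machine-readable code,
        # and "msg" the human-readable message.
        print(err["loc"], err["type"], err["msg"])
```

This structured form is handy when you want to map errors back onto form fields or log them in a consistent shape.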
5 Type Hints - Python's Gift to Pydantic
The entire Pydantic validation system hinges on Python's type hints. These aren't just for making your IDE happy; Pydantic actively uses them to enforce schemas. Beyond basic types like str and int, Python's typing module offers a treasure trove of more complex hints.
One incredibly useful hint is Optional. When a field could be None, Optional (from the typing module) is your friend. It signals to Pydantic that None is also a valid input. Note that in Pydantic V2, Optional[str] on its own only makes the field nullable; the field is still required unless you also give it a default such as = None.
from pydantic import BaseModel
from typing import Optional  # Don't forget this import!


class Product(BaseModel):
    name: str
    price: float
    # A string or None. The "= None" default is what allows the field to be
    # omitted; in Pydantic V2, Optional[str] alone would still be required.
    description: Optional[str] = None


# Valid products
product1 = Product(name="Laptop", price=1200.50, description="Powerful computing on the go.")
product2 = Product(name="Mouse", price=25.00, description=None)  # Explicitly None is fine
product3 = Product(name="Keyboard", price=75.99)  # Omitting it falls back to the None default

print(f"Product 1: {product1}")
print(f"Product 2: {product2}")
print(f"Product 3: {product3}")
Optional[str] is shorthand for Union[str, None] (also written str | None in Python 3.10+), making your intentions crystal clear. It's a small detail, but stating up front which fields may be None saves you from surprise None-related errors later.
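The difference between "may be None" and "may be left out" is easy to miss, so here is a small sketch of it under Pydantic V2. The nickname field is hypothetical and exists only to contrast the two cases.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Product(BaseModel):
    name: str
    nickname: Optional[str]            # nullable, but still required
    description: Optional[str] = None  # nullable and safe to omit


# Passing None explicitly satisfies the required-but-nullable field.
ok = Product(name="Mouse", nickname=None)

# Omitting it does not: Pydantic V2 reports a missing field.
try:
    Product(name="Mouse")
    missing_type = None
except ValidationError as e:
    missing_type = e.errors()[0]["type"]

print(ok)
print(f"Error type when nickname is omitted: {missing_type}")
```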
6 Default Values & Optional Fields - Making Things Flexible
Sometimes a field isn't strictly necessary, or maybe it should just have a sensible default if no value is provided. Pydantic plays nicely with Python's default argument values, letting you set them right in your model definition. This makes your models robust to missing data while keeping things predictable.
We'll update our Product model to include a default quantity and an optional tag list, which defaults to an empty list. This means you don't always have to provide these values, which is nice for less verbose data input.
from pydantic import BaseModel, ValidationError
from typing import Optional, List  # Added List for type hinting lists


class Product(BaseModel):
    name: str
    price: float
    description: Optional[str] = None  # Defaulting description to None
    quantity: int = 1  # Default quantity to 1 if not provided
    tags: List[str] = []  # Default to an empty list of strings


# Create products, omitting some fields
product_default_quantity = Product(name="Headphones", price=99.99)
product_with_tags = Product(name="Smartwatch", price=250.00, tags=["wearable", "tech"])

print(f"Product with default quantity: {product_default_quantity}")
print(f"Product with tags: {product_with_tags}")

# What happens if we provide a non-list for tags?
try:
    bad_tags_product = Product(name="Mug", price=10.0, tags="ceramic")
except ValidationError as e:
    print(f"\nCaught an error with bad tags: \n{e}")
In plain Python classes, a mutable default such as [] or {} is shared between instances, which is a classic source of bugs. Pydantic copies the default for each new instance, so `tags: List[str] = []` is safe here. If you prefer to be explicit, `Field(default_factory=list)` does the same job. Pydantic also validates the items inside the list, proving it's more than just a surface-level check.
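For completeness, this is roughly what the explicit default_factory variant looks like. It is a minimal sketch with a trimmed-down Product model, showing that each instance gets its own list.

```python
from typing import List

from pydantic import BaseModel, Field


class Product(BaseModel):
    name: str
    tags: List[str] = Field(default_factory=list)  # new list per instance


first = Product(name="Mug")
second = Product(name="Plate")

first.tags.append("ceramic")  # mutate only the first product's list

print(first.tags)
print(second.tags)
```

A factory is also the natural choice when the default has to be computed at creation time, such as a timestamp or a generated ID.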
7 Nested Models - When Things Get Complicated
Real-world data rarely comes in flat, simple structures. Often, you'll have objects containing other objects—a user might have an address, which itself has a street, city, and zip code. Pydantic handles these nested structures beautifully by allowing you to embed one BaseModel within another.
Let's refine our User model. We'll introduce an Address model first, then include an instance of that Address model as a field in our User. Pydantic will automatically validate the nested model when you instantiate the parent.
from pydantic import BaseModel, ValidationError


class Address(BaseModel):
    street: str
    city: str
    zip_code: str


class User(BaseModel):
    name: str
    age: int
    address: Address  # This field expects an Address model!


# A valid user with a nested address
user_with_address = User(
    name="Charlie",
    age=45,
    address={
        "street": "123 Main St",
        "city": "Anytown",
        "zip_code": "12345"
    }
)
print(f"User with address: {user_with_address}")

# What happens if the nested address is invalid?
try:
    invalid_address_user = User(
        name="Diana",
        age=28,
        address={
            "street": "456 Oak Ave",
            "city": "Otherville",
            "zip_code": 98765  # Zip code as an int instead of str!
        }
    )
except ValidationError as e:
    print(f"\nCaught an error for invalid nested address: \n{e}")
Pydantic gracefully handles the recursion, ensuring that even deeply nested data adheres to its specified types. It's a fantastic way to build complex, self-describing data structures.
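Two things are worth seeing in a short sketch: the nested dict really does become an Address instance, and validation errors report the full path to the offending field. The example below assumes Pydantic V2 and reuses the models from this step. The missing zip_code is a deliberately broken input.

```python
from pydantic import BaseModel, ValidationError


class Address(BaseModel):
    street: str
    city: str
    zip_code: str


class User(BaseModel):
    name: str
    age: int
    address: Address


user = User(
    name="Charlie",
    age=45,
    address={"street": "123 Main St", "city": "Anytown", "zip_code": "12345"},
)
# The dict was converted into a real Address instance.
print(type(user.address).__name__, user.address.city)

try:
    # zip_code is deliberately left out of the nested address.
    User(name="Diana", age=28, address={"street": "456 Oak Ave", "city": "Otherville"})
    error_path = None
except ValidationError as e:
    # "loc" walks from the outer field down to the nested one.
    error_path = e.errors()[0]["loc"]

print(f"Error location: {error_path}")
```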
8 Lists and Dictionaries - Handling Collections
Beyond single nested models, data often comes in collections—lists of items or dictionaries mapping keys to values. Pydantic, using Python's typing module, extends its validation prowess to these common data structures too.
Let's give our User model some friends (a list of other User models, perhaps?) and a list of hobbies. We'll also add a dictionary for arbitrary metadata, just to show how Pydantic tackles these flexible types.
from pydantic import BaseModel, ValidationError
from typing import List, Dict  # Need these for lists and dicts


# Re-using our User model for nesting
class User(BaseModel):
    name: str
    age: int
    # No Address for simplicity in this example, but it could be here!
    hobbies: List[str] = []
    friends: List['User'] = []  # Forward reference for self-referencing models
    metadata: Dict[str, str] = {}  # A dictionary of string keys to string values


# Create some users
friend1 = User(name="Eve", age=25, hobbies=["reading"])
friend2 = User(name="Frank", age=27, hobbies=["gaming", "coding"])

# A main user with a list of friends and hobbies
main_user = User(
    name="Grace",
    age=32,
    hobbies=["hiking", "photography", "cooking"],
    friends=[friend1, friend2],
    metadata={"status": "active", "level": "senior"}
)
print(f"Main User with friends and hobbies:\n{main_user.model_dump_json(indent=2)}")

# What if a friend's data doesn't fit the User model?
try:
    bad_friends_user = User(
        name="Harry",
        age=40,
        friends=[{"name": "Invalid Friend", "age": "not-an-int"}]  # Invalid friend data
    )
except ValidationError as e:
    print(f"\nCaught an error with bad friend data: \n{e}")
Pydantic dives deep into collections, validating each item against its specified type. For self-referencing models (like a User having a list of User friends), you can use a forward reference, a string literal of the model name, like 'User', which Pydantic resolves later. It's quite clever how it handles these recursive dances.
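To make the per-item validation concrete, here is a small sketch using a hypothetical Inventory model (not part of the tutorial's running example). It assumes Pydantic V2 in its default lax mode, where numeric strings convert to int but an int is not accepted for a str field.

```python
from typing import Dict, List

from pydantic import BaseModel, ValidationError


class Inventory(BaseModel):  # hypothetical model for illustration
    counts: Dict[str, int] = {}
    labels: List[str] = []


# Dictionary values are validated (and, in lax mode, converted) per key.
stock = Inventory(counts={"apples": "3", "pears": 5})
print(stock.counts)

try:
    Inventory(labels=["fresh", 42])  # the second item is not a string
    bad_location = None
except ValidationError as e:
    # For list items, "loc" includes the index of the offending element.
    bad_location = e.errors()[0]["loc"]

print(f"Invalid item location: {bad_location}")
```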
9 Pydantic and FastAPI - A Match Made in Heaven
If you've been working with FastAPI, you've likely encountered Pydantic without even realizing it. FastAPI uses Pydantic BaseModels extensively for defining the structure of incoming request bodies, query parameters, and outgoing response models. This tight integration is one of FastAPI's superpowers.
When you define a route in FastAPI and use a Pydantic model as a parameter type, FastAPI automatically does a few things:
- It parses the incoming JSON request body into your Pydantic model.
- It validates the data against your model's type hints and constraints.
- If validation fails, it automatically returns a clear error response (a 422 Unprocessable Entity, typically).
- It generates API documentation (OpenAPI/Swagger UI) based on your Pydantic models.
You declare your data structure once with Pydantic, and FastAPI leverages that declaration for validation, serialization, and documentation—a truly efficient workflow. Here's a tiny glimpse, though setting up FastAPI is another tutorial entirely:
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class Item(BaseModel):  # A Pydantic model for our item data
    name: str
    description: str | None = None  # Python 3.10+ syntax for Optional
    price: float
    tax: float | None = None


@app.post("/items/")
async def create_item(item: Item):  # FastAPI automatically expects an 'Item' Pydantic model
    return item.model_dump()  # Converts the Pydantic model back to a dict
This tight coupling removes a mountain of boilerplate code you'd otherwise write for validation and data transformation. It’s a bit of a magic trick, isn't it?
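To demystify the trick a little, the parse-validate-respond cycle can be approximated with Pydantic alone. The sketch below is not FastAPI's actual implementation. The handle_create_item function and the shape of its return value are hypothetical stand-ins, intended only to show the idea of turning a ValidationError into a 422-style response.

```python
from typing import Optional, Tuple

from pydantic import BaseModel, ValidationError


class Item(BaseModel):
    name: str
    description: Optional[str] = None
    price: float
    tax: Optional[float] = None


def handle_create_item(raw_body: str) -> Tuple[int, dict]:
    """Hypothetical stand-in for a framework's request handling."""
    try:
        item = Item.model_validate_json(raw_body)  # parse and validate in one step
    except ValidationError as e:
        # Mirror the idea of a 422 response carrying the error details.
        return 422, {"detail": e.errors()}
    return 200, item.model_dump()


ok_status, ok_body = handle_create_item('{"name": "Pen", "price": 1.5}')
bad_status, bad_body = handle_create_item('{"name": "Pen", "price": "cheap"}')

print(ok_status, ok_body)
print(bad_status, bad_body["detail"][0]["loc"])
```

In a real FastAPI app you write none of this yourself, which is precisely the boilerplate the framework removes.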
10 Exporting Your Data - `model_dump` and `model_dump_json`
Once you've got your data happily validated inside a Pydantic model, you often need to send it back out into the world. Maybe you're returning a JSON response from an API, saving to a database, or just debugging. Pydantic makes this conversion back to plain Python dictionaries or JSON strings straightforward.
For Pydantic V2 (the current stable version), you'll use model_dump() to get a Python dictionary representation and model_dump_json() for a JSON string. These methods handle all the nesting and type conversions automatically, ensuring your exported data looks just as good as it did when it came in.
from pydantic import BaseModel
from typing import List


class Tag(BaseModel):
    name: str
    color: str


class Article(BaseModel):
    title: str
    content: str
    tags: List[Tag] = []
    is_published: bool = False


# Create an article
article = Article(
    title="Pydantic Power",
    content="A deep dive into data validation with Pydantic.",
    tags=[Tag(name="python", color="blue"), Tag(name="fastapi", color="green")],
    is_published=True
)

# Convert to a Python dictionary
article_dict = article.model_dump()
print(f"Article as a dictionary:\n{article_dict}")
# Output: {'title': 'Pydantic Power', 'content': 'A deep dive into data validation with Pydantic.', 'tags': [{'name': 'python', 'color': 'blue'}, {'name': 'fastapi', 'color': 'green'}], 'is_published': True}

# Convert to a JSON string
article_json = article.model_dump_json(indent=2)  # indent for pretty printing
print(f"\nArticle as a JSON string:\n{article_json}")
These methods are your go-to for making Pydantic objects play nicely with external systems. They ensure consistent output, simplifying your serialization logic dramatically.
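Both methods accept options for trimming the output, and the JSON form can be fed straight back into a model. The following sketch assumes Pydantic V2 and uses a cut-down Article with a hypothetical summary field, added here only to have something that is None.

```python
from typing import List, Optional

from pydantic import BaseModel


class Tag(BaseModel):
    name: str
    color: str


class Article(BaseModel):
    title: str
    summary: Optional[str] = None  # hypothetical extra field
    tags: List[Tag] = []


article = Article(title="Pydantic Power", tags=[Tag(name="python", color="blue")])

# Drop fields whose value is None.
compact = article.model_dump(exclude_none=True)

# Keep only selected top-level fields.
title_only = article.model_dump(include={"title"})

# Round trip: JSON string out, validated model back in.
restored = Article.model_validate_json(article.model_dump_json())

print(compact)
print(title_only)
print(restored == article)
```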
11 What's Next? - Beyond the Basics
You’ve now had a proper introduction to Pydantic, seen its core powers in action, and even caught a glimpse of its best friend, FastAPI. What we've covered barely scratches the surface, but it's more than enough to get you started building robust, type-safe data models.
From here, you can dive into some of Pydantic's more advanced features:
- Custom Validators: Use @field_validator to create your own validation logic for specific fields, ensuring data meets even the most arcane business rules.
- Pydantic Settings: Manage application settings and environment variables with ease, leveraging Pydantic's validation for configuration.
- More Complex Types: Explore Enums, UUIDs, datetime objects, and even generic models for truly flexible data structures.
- Field with Extra Constraints: Add more granular validation rules like minimum/maximum lengths for strings, or greater-than/less-than for numbers.
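As a small taste of two items from that list, here is a sketch combining Field constraints with a custom @field_validator under Pydantic V2. The Signup model and its rules are hypothetical, invented purely for illustration.

```python
from pydantic import BaseModel, Field, ValidationError, field_validator


class Signup(BaseModel):  # hypothetical model for illustration
    username: str = Field(min_length=3, max_length=20)
    age: int = Field(ge=18)

    @field_validator("username")
    @classmethod
    def username_must_be_alphanumeric(cls, value: str) -> str:
        if not value.isalnum():
            raise ValueError("username must be alphanumeric")
        return value.lower()  # validators may also normalize the value


good = Signup(username="Alice99", age=30)
print(good)

try:
    Signup(username="a!", age=12)  # too short, and under the age limit
    failed_fields = []
except ValidationError as e:
    failed_fields = sorted(err["loc"][0] for err in e.errors())

print(f"Fields that failed: {failed_fields}")
```

Declarative constraints such as min_length and ge cover the common cases. A custom validator is worth reaching for when the rule can't be expressed that way.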
You’ve got a firm handle on Pydantic's foundational concepts. Next, you can either immediately start refactoring an existing project to use Pydantic models for better data hygiene, or leap directly into building a FastAPI application, where Pydantic truly shines. Go on, give it a whirl!
Explore Pydantic's official documentation for a deeper dive.