Configuration is a crucial aspect of any software project. It can come from many sources, such as environment variables, configuration files, and command-line arguments. For file-based configuration in Python, YAML and TOML (or INI) are popular choices. I prefer YAML, though it is not without flaws; some of them, such as the lack of type safety, can be addressed by Pydantic.
Pydantic is a data validation library for Python. It is built on top of Python type hints and provides runtime validation of data. Pydantic is widely used for data validation in APIs, but it can also be used for configuration management. Pydantic has a settings management library called pydantic-settings that makes it easy to load configurations from multiple sources.
In this post, we’ll go through some of my favorite ways to manage configurations using Pydantic and pydantic-settings. We’ll start with a simple example of loading configurations from a YAML file and then move on to loading configurations from multiple sources.
pip install "pydantic>=2" pydantic-settings pyyaml
I’m using pydantic-settings 2.1.0 and pydantic 2.3.0 in the rest of the post.
Let’s start with a simple YAML configuration file.
# config.yaml
host: localhost
port: 5432
username: user
password: password
We can define a Pydantic model to represent this configuration. For now we use only pydantic; pydantic-settings comes in with the more complex examples.
from pydantic import BaseModel

class DatabaseConfig(BaseModel):
    host: str
    port: int
    username: str
    password: str
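We can then parse the YAML configuration file with PyYAML and validate the result with the Pydantic model.
import yaml
from pydantic import ValidationError

with open("config.yaml", "r") as file:
    raw = yaml.safe_load(file) or {}  # an empty file parses to None

try:
    config = DatabaseConfig(**raw)
except ValidationError as e:
    # Stop here: `config` would be undefined if validation failed.
    raise SystemExit(f"Invalid configuration file: {e.json()}")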
We can now access the configuration values using the model attributes.
print(config.host)
print(config.port)
print(config.username)
print(config.password)
The types have been validated. You can define default values, constraints, and more using Pydantic. You can refer to the Pydantic documentation for more information.
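As a minimal sketch of the defaults and constraints mentioned above, the model below gives host and port default values and restricts port to the valid TCP range. The specific bounds, the min_length checks, and the load_database_config helper are assumptions chosen for illustration, not part of the original example.

```python
from pydantic import BaseModel, Field, ValidationError


class DatabaseConfig(BaseModel):
    host: str = "localhost"                          # used when the key is absent
    port: int = Field(default=5432, ge=1, le=65535)  # must be a valid TCP port
    username: str = Field(min_length=1)              # required, non-empty
    password: str = Field(min_length=1)              # required, non-empty


def load_database_config(raw: dict) -> DatabaseConfig:
    """Validate a parsed mapping, e.g. the result of yaml.safe_load."""
    return DatabaseConfig(**raw)


# Missing host and port fall back to their defaults.
config = load_database_config({"username": "user", "password": "password"})
print(config.host, config.port)

# An out-of-range port is rejected at load time.
try:
    load_database_config({"port": 70000, "username": "user", "password": "password"})
except ValidationError as e:
    print(e.error_count(), "validation error")
```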
Now let’s see how we can use pydantic-settings to load configurations from multiple sources.
Environment Variables Source
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    host: str = "localhost"
    port: int = 5432
    username: str = "user"
    password: str = "password"

    model_config = SettingsConfigDict(env_prefix="APP_")

settings = Settings()
Now, the configurations can be loaded from environment variables with the APP_ prefix. For example, APP_HOST, APP_PORT, APP_USERNAME, and APP_PASSWORD.
export APP_HOST=example.com
export APP_PORT=8080
export APP_USERNAME=admin
export APP_PASSWORD=secret
However, this Settings class only reads environment variables and falls back to its defaults; it does not read the YAML file. To load from YAML as well, we need to add another source.
YAML File Source
from abc import abstractmethod
from typing import Any, Dict, Tuple, Type

import yaml
from pydantic.fields import FieldInfo
from pydantic_settings import BaseSettings, PydanticBaseSettingsSource


class BaseFileConfigSettingsSource(PydanticBaseSettingsSource):
    def __init__(self, settings_cls: Type[BaseSettings], path: str):
        super().__init__(settings_cls)
        self._data = self.load_file(path)

    @abstractmethod
    def load_file(self, path: str) -> Dict[str, Any]:
        pass

    def get_field_value(
        self, field: FieldInfo, field_name: str
    ) -> Tuple[Any, str, bool]:
        return self._data.get(field_name), field_name, False

    def __call__(self) -> Dict[str, Any]:
        # Return only the keys present in the file. Filling in field defaults
        # here would shadow every lower-priority source.
        settings: Dict[str, Any] = {}
        for field_name, field in self.settings_cls.model_fields.items():
            if field_name in self._data:
                value, _, _ = self.get_field_value(field, field_name)
                settings[field_name] = value
        return settings


class YamlConfigSettingsSource(BaseFileConfigSettingsSource):
    def load_file(self, path: str) -> Dict[str, Any]:
        with open(path, "r") as f:
            return yaml.safe_load(f) or {}  # an empty file parses to None


class Settings(BaseSettings):
    ...  # fields and model_config as defined earlier

    @classmethod
    def settings_customise_sources(
        cls,
        settings_cls: Type[BaseSettings],
        init_settings: PydanticBaseSettingsSource,
        env_settings: PydanticBaseSettingsSource,
        dotenv_settings: PydanticBaseSettingsSource,
        file_secret_settings: PydanticBaseSettingsSource,
    ) -> Tuple[PydanticBaseSettingsSource, ...]:
        # Sources listed first take precedence over those listed later.
        return (
            env_settings,
            YamlConfigSettingsSource(settings_cls, "config.yaml"),
            init_settings,
        )
Now the configurations can be loaded from both environment variables and the YAML file. pydantic-settings gives priority to sources that appear earlier in the returned tuple, so environment variables take precedence over the YAML file and you can override the file’s values with them. This is generally what I use in my projects. The same approach extends to TOML or other file formats, as long as you can parse them into a dictionary.
Command-line Arguments Source
Another source of configurations can be command-line arguments. I achieved this by subclassing EnvSettingsSource and replacing its environment mapping with values parsed from sys.argv.
import sys
from typing import Dict, Tuple, Type

from pydantic_settings import BaseSettings, PydanticBaseSettingsSource
from pydantic_settings.sources import EnvSettingsSource


class CliArgsSource(EnvSettingsSource):
    def __init__(self, settings_cls: Type[BaseSettings], prefix: str = "config_"):
        super().__init__(settings_cls, env_prefix=prefix)
        self._prefix = prefix
        # Replace the variables read from os.environ with the parsed arguments.
        self.env_vars = self._load_args()

    def _load_args(self) -> Dict[str, str]:
        args = sys.argv[1:]
        env_vars = {}
        for i in range(len(args)):
            if args[i].startswith(f"--{self._prefix}"):
                if "=" in args[i]:
                    # Split on the first "=" only, so values may contain "=".
                    key, value = args[i].split("=", 1)
                    key = key[2:].strip()
                    env_vars[key] = value.strip()
                elif i + 1 < len(args) and not args[i + 1].startswith("--"):
                    key = args[i][2:].strip()
                    env_vars[key] = args[i + 1].strip()
        return env_vars


class Settings(BaseSettings):
    ...  # fields and model_config as defined earlier

    @classmethod
    def settings_customise_sources(
        cls,
        settings_cls: Type[BaseSettings],
        init_settings: PydanticBaseSettingsSource,
        env_settings: PydanticBaseSettingsSource,
        dotenv_settings: PydanticBaseSettingsSource,
        file_secret_settings: PydanticBaseSettingsSource,
    ) -> Tuple[PydanticBaseSettingsSource, ...]:
        # Listed first so command-line arguments win over everything else.
        return (
            CliArgsSource(settings_cls, "app_"),
            env_settings,
            YamlConfigSettingsSource(settings_cls, "config.yaml"),
            init_settings,
        )
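Now you can pass command-line arguments like --app_host=example.com. Sources earlier in the returned tuple take precedence, so CliArgsSource has to be listed first for command-line arguments to override both the YAML file and environment variables. This way, one Settings class remains the single source of truth for all your configurations, whichever source a value comes from.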
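The argument parsing is easier to reason about in isolation. The sketch below pulls the same logic into a standalone function, with a hypothetical name parse_prefixed_args, that takes the argument list explicitly instead of reading sys.argv, so it can be tried out or unit-tested without running a real command line.

```python
from typing import Dict, List


def parse_prefixed_args(args: List[str], prefix: str = "app_") -> Dict[str, str]:
    """Collect `--<prefix>key=value` and `--<prefix>key value` pairs."""
    parsed: Dict[str, str] = {}
    for i, arg in enumerate(args):
        if not arg.startswith(f"--{prefix}"):
            continue  # ignore arguments that do not carry the prefix
        if "=" in arg:
            # Split on the first "=" only, so values may contain "=".
            key, value = arg[2:].split("=", 1)
            parsed[key.strip()] = value.strip()
        elif i + 1 < len(args) and not args[i + 1].startswith("--"):
            # Space-separated form: the value is the next argument.
            parsed[arg[2:].strip()] = args[i + 1].strip()
    return parsed


example = parse_prefixed_args(
    ["--app_host=example.com", "--app_port", "8080", "--verbose", "--app_password=a=b"]
)
print(example)
```

The keys keep the prefix (app_host, app_port), which is what lets the EnvSettingsSource machinery treat them like environment variable names.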
Nested Configurations
Pydantic also supports nested models, which can be useful for complex configurations. Let’s say you have a nested configuration like this:
# config.yaml
database:
  host: localhost
  port: 5432
  username: user
  password: password
app:
  name: myapp
  version: "1.0"  # quoted: an unquoted 1.0 is parsed as a float, which Pydantic v2 does not coerce to str
You can define nested models like this:
from typing import Tuple, Type

from pydantic import BaseModel
from pydantic_settings import (
    BaseSettings,
    PydanticBaseSettingsSource,
    SettingsConfigDict,
)

# CliArgsSource and YamlConfigSettingsSource are the classes defined earlier.


class DatabaseConfig(BaseModel):
    host: str
    port: int
    username: str
    password: str


class AppConfig(BaseModel):
    name: str
    version: str


class Settings(BaseSettings):
    database: DatabaseConfig
    app: AppConfig

    model_config = SettingsConfigDict(env_prefix="APP_", env_nested_delimiter="__")

    @classmethod
    def settings_customise_sources(
        cls,
        settings_cls: Type[BaseSettings],
        init_settings: PydanticBaseSettingsSource,
        env_settings: PydanticBaseSettingsSource,
        dotenv_settings: PydanticBaseSettingsSource,
        file_secret_settings: PydanticBaseSettingsSource,
    ) -> Tuple[PydanticBaseSettingsSource, ...]:
        # Sources listed first take precedence over those listed later.
        return (
            CliArgsSource(settings_cls, "app_"),
            env_settings,
            YamlConfigSettingsSource(settings_cls, "config.yaml"),
            init_settings,
        )
You can pass the nested configurations as environment variables with the separator __. For example,
export APP_DATABASE__HOST=example.com
export APP_DATABASE__PORT=8080
export APP_DATABASE__USERNAME=admin
export APP_DATABASE__PASSWORD=secret
export APP_APP__NAME=myapp
export APP_APP__VERSION=1.0
And since we are using EnvSettingsSource as the base class for CliArgsSource, you can pass nested configurations as command-line arguments like
python app.py --app_database__host=example.com --app_database__port=8080 --app_database__username=admin --app_database__password=secret --app_app__name=myapp --app_app__version=1.0
Conclusion
Pydantic and pydantic-settings provide a powerful way to manage configurations in Python. You can load configurations from multiple sources like environment variables, YAML files, and command-line arguments. You can also define nested configurations and customize the sources to suit your needs. This makes it easy to manage configurations in a consistent and type-safe way. I hope this post helps you keep your configurations sane in your Python projects.