On December 29th, 2022, zanie opened this issue to suggest we (prefecthq/prefect) migrate from the Python packaging setup of old (e.g. setup.py, setup.cfg) to the now-standard pyproject.toml paradigm for defining project and build configuration.
Beyond being modern best practice, it's just a lot more convenient to have everything (your pytest, ruff, and other tool config) in the same place. And in case you don't trust my opinion, here's another reason: uv's top-level API (sync, lock, etc.) requires a pyproject.toml 🙂.
Here we are, over 2 years later 😅 but we've finally come around to it. prefect is a large codebase that historically spread its linting, packaging, and test configuration across setup.py, setup.cfg, versioneer.py, and a handful of requirements-*.txt files. Now, all of that is replaced by a single pyproject.toml file.
Let's walk through each major aspect of our package configuration and how we migrated it, focusing on the benefits we've seen.
Previously, we had a setup.py that handled our package metadata and dependencies by reading from multiple requirements files:
```python
from pathlib import Path

import versioneer
from setuptools import find_packages, setup


def read_requirements(file: str) -> list[str]:
    requirements: list[str] = []
    if Path(file).exists():
        requirements = open(file).read().strip().split("\n")
    return requirements


client_requires = read_requirements("requirements-client.txt")
install_requires = read_requirements("requirements.txt")[1:] + client_requires
dev_requires = read_requirements("requirements-dev.txt")
otel_requires = read_requirements("requirements-otel.txt")

setup(
    name="prefect",
    description="Workflow orchestration and management.",
    packages=find_packages(where="src"),
    package_dir={"": "src"},
    python_requires=">=3.9",
    install_requires=install_requires,
    extras_require={
        "dev": dev_requires,
        "otel": otel_requires,
        "aws": "prefect-aws>=0.5.0",
        # ... many more extras
    },
)
```
This approach had several drawbacks: package metadata lived in executable code rather than declarative configuration, dependencies were scattered across multiple requirements files, and quirks like the `[1:]` slice above made the setup fragile and hard to inspect statically.
Now with hatch (a modern Python build tool), all of this lives directly in pyproject.toml:
```toml
[project]
name = "prefect"
description = "Workflow orchestration and management."
requires-python = ">=3.9"
dependencies = [
    "aiosqlite>=0.17.0,<1.0.0",
    "alembic>=1.7.5,<2.0.0",
    # ... more dependencies
]

[project.optional-dependencies]
aws = ["prefect-aws"]
# ... many more extras

[dependency-groups]
dev = ["..."]  # all of our dev dependencies
```
Note that dev lives in the [dependency-groups] table and not in [project.optional-dependencies]. Dependency groups are a recent Python packaging standard (PEP 735) for grouping dependencies in a way that won't be exposed in published project metadata. This dev group also receives a little bit of special treatment from uv (which we'll see later on when we run our tests).
This consolidation creates a single source of truth for our dependencies, making it immediately clear what's required for each domain of the project. It also enables us to use modern tools like uv that can automatically manage our environment based on this configuration.
Prefect has a core library and multiple integration packages (AWS, GCP, Kubernetes, etc.) that live in the same repository. With our new setup, we've configured each integration package with its own pyproject.toml, while using uv's source references to link them together during development.
In our main pyproject.toml, we define paths to all integration packages:
```toml
[tool.uv.sources]
prefect-aws = { path = "src/integrations/prefect-aws" }
prefect-azure = { path = "src/integrations/prefect-azure" }
prefect-gcp = { path = "src/integrations/prefect-gcp" }
# ... other integrations
```
And in each integration package's pyproject.toml, we reference the main Prefect package:
```toml
# In src/integrations/prefect-aws/pyproject.toml
[tool.uv.sources]
prefect = { path = "../../../" }
```
This approach means uv resolves each integration against the local checkout rather than PyPI, so a single development environment contains editable installs of core prefect and every integration, and changes in one are immediately visible to the others.
Once you have the prefect repo cloned, install dependencies for all integrations by running:
```shell
uv sync --all-extras
```
Under the old setup, integrations depended on prefect from PyPI, which made local development with editable integrations especially painful: you had to install from an integration's root, then return to the project root and install prefect in editable mode to pick up changes from core. Beyond the tedium, this often broke editors' understanding of the resulting virtual environment in annoying ways.
Version management used to be handled by a customized versioneer.py with more configuration in setup.cfg:
```ini
[versioneer]
VCS = git
style = pep440
versionfile_source = src/prefect/_version.py
versionfile_build = prefect/_version.py
version_regex = ^(\d+\.\d+\.\d+(?:[a-zA-Z0-9]+(?:\.[a-zA-Z0-9]+)*)?)$
```
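For illustration, that version_regex only accepts bare release-style tags (optionally with an alphanumeric suffix); a quick sketch of its behavior, with hypothetical tag names:

```python
import re

# The version_regex from the setup.cfg above, reproduced verbatim.
TAG_RE = re.compile(r"^(\d+\.\d+\.\d+(?:[a-zA-Z0-9]+(?:\.[a-zA-Z0-9]+)*)?)$")

# Three-component versions pass, with or without a suffix like "rc1"...
accepted = [t for t in ("2.14.3", "3.0.0rc1") if TAG_RE.match(t)]

# ...while "v"-prefixed tags and two-component versions are rejected.
rejected = [t for t in ("v2.14.3", "2.14") if not TAG_RE.match(t)]
```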
When evaluating modern alternatives, we initially looked at hatch-vcs, but found it didn't offer the same level of customization we needed to maintain continuity with our existing versioning scheme. Specifically, we needed to control which tags are matched and exactly how distance and dirty states are rendered into version strings, so published versions would keep the same shape as before.
After exploring several options, we settled on versioningit, which provides the flexibility we needed while integrating nicely with hatch:
```toml
[tool.hatch.version]
source = "versioningit"

[tool.versioningit.vcs]
match = ["[0-9]*.[0-9]*.[0-9]*", "[0-9]*.[0-9]*.[0-9]*.dev[0-9]*"]
default-tag = "0.0.0"

[tool.versioningit.format]
distance = "{base_version}+{distance}.{vcs}{rev}"
dirty = "{base_version}+{distance}.{vcs}{rev}.dirty"
distance-dirty = "{base_version}+{distance}.{vcs}{rev}.dirty"
```
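To make those format strings concrete, here's a sketch of how the distance template renders. The field values are hypothetical (a checkout 5 commits past a 3.1.2 tag); versioningit itself performs this substitution at build time:

```python
# Hypothetical template fields for a commit 5 commits past the 3.1.2 tag.
# "g" is the VCS prefix git uses; "1a2b3c4" stands in for a short commit hash.
fields = {"base_version": "3.1.2", "distance": 5, "vcs": "g", "rev": "1a2b3c4"}

# The two format strings from [tool.versioningit.format] above.
distance_fmt = "{base_version}+{distance}.{vcs}{rev}"
dirty_fmt = "{base_version}+{distance}.{vcs}{rev}.dirty"

version = distance_fmt.format(**fields)        # "3.1.2+5.g1a2b3c4"
dirty_version = dirty_fmt.format(**fields)     # "3.1.2+5.g1a2b3c4.dirty"
```

Note the `+` separator: everything after it is a PEP 440 local version segment, so these dev builds still sort and install correctly while clearly identifying the exact commit they came from.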
One particularly nice feature of versioningit is the ability to write version information to a file during build time using a custom script. This allowed us to maintain backward compatibility with code that relied on our previous version information format.
<details>
<summary>custom script</summary>
```python
import textwrap
from datetime import datetime, timezone
from pathlib import Path
from subprocess import CalledProcessError, check_output
from typing import Any


def write_build_info(
    project_dir: str | Path, template_fields: dict[str, Any], params: dict[str, Any]
) -> None:
    """
    Write the build info to the project directory.
    """
    path = Path(project_dir) / params.get("path", "src/prefect/_version.py")

    try:
        git_hash = check_output(["git", "rev-parse", "HEAD"]).decode().strip()
    except CalledProcessError:
        git_hash = "unknown"

    build_dt_str = template_fields.get(
        "build_date", datetime.now(timezone.utc).isoformat()
    )
    version = template_fields.get("version", "unknown")
    dirty = "dirty" in version

    build_info = textwrap.dedent(
        f"""\
        # Generated by versioningit
        __version__ = "{version}"
        __build_date__ = "{build_dt_str}"
        __git_commit__ = "{git_hash}"
        __dirty__ = {dirty}
        """
    )

    with open(path, "w") as f:
        f.write(build_info)
```
</details>
```toml
[tool.versioningit.write]
method = { module = "write_build_info", value = "write_build_info", module-dir = "tools" }
path = "src/prefect/_build_info.py"
```
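To see what the generated file ends up looking like, here's a simplified, self-contained sketch of the write step (it mirrors the shape of the script above but skips the git lookup, and the version string is a hypothetical example):

```python
import tempfile
import textwrap
from datetime import datetime, timezone
from pathlib import Path


def write_build_info_sketch(path: Path, version: str) -> None:
    """Write a file shaped like the generated _build_info.py (minus the git hash)."""
    build_date = datetime.now(timezone.utc).isoformat()
    dirty = "dirty" in version
    path.write_text(
        textwrap.dedent(
            f"""\
            # Generated by versioningit
            __version__ = "{version}"
            __build_date__ = "{build_date}"
            __dirty__ = {dirty}
            """
        )
    )


# Exercise the sketch in a temporary directory and read the module back.
target = Path(tempfile.mkdtemp()) / "_build_info.py"
write_build_info_sketch(target, "3.1.2+5.g1a2b3c4.dirty")
ns: dict = {}
exec(target.read_text(), ns)
```

At import time, application code can then read `__version__` and friends from this module without ever touching git, which matters for installed wheels where no repository exists.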
Our old build configuration was split between setup.py and setup.cfg. Now it's all handled by hatch:
```toml
[build-system]
requires = ["hatchling", "versioningit"]
build-backend = "hatchling.build"

[tool.hatch.build]
artifacts = ["src/prefect/_build_info.py", "src/prefect/server/ui"]

[tool.hatch.build.targets.sdist]
include = ["/src/prefect", "/README.md", "/LICENSE", "/pyproject.toml"]
```
This consolidation has significantly simplified our build process. We no longer need to maintain separate files for different aspects of the build, and the declarative nature of TOML makes it much easier to understand and modify the configuration.
One piece of nuance here is our inclusion of src/prefect/server/ui in the sdist. This is a directory that is .gitignore'd, but is generated at UI build time. We include it in the sdist so that after installing prefect from PyPI users can run the dashboard with prefect server start.
Previously, tool configurations were also scattered across multiple files. In our case, mypy, pytest, ruff, and codespell each had configuration living outside pyproject.toml.
Now (similar to the dependency declarations) they're all in one place:
```toml
[tool.mypy]
plugins = ["pydantic.mypy"]
ignore_missing_imports = true
follow_imports = "skip"
python_version = "3.9"

[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "-rfEs --mypy-only-local-stub"
norecursedirs = ["*.egg-info"]
python_files = ["test_*.py", "bench_*.py"]
python_functions = ["test_*", "bench_*"]
markers = [
    "service(arg): a service integration test. For example 'docker'",
    "clear_db: marker to clear the database after test completion",
]

[tool.ruff]
# ...

[tool.codespell]
# ...
```
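As an aside, the `markers` entries above register custom marks so pytest doesn't warn about them. A hypothetical test module (not from the Prefect codebase) might use them like this:

```python
import pytest


# The marker names match those registered in [tool.pytest.ini_options].
@pytest.mark.service("docker")
@pytest.mark.clear_db
def test_docker_service_roundtrip():
    assert True


# pytest records decorator marks on the function's `pytestmark` attribute,
# which is how `--exclude-service docker` style filtering can find them.
marker_names = {mark.name for mark in test_docker_service_roundtrip.pytestmark}
```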
This consolidation makes it much easier to find and modify tooling configuration.
One of the most pleasant improvements we've seen from this migration is in our CI/CD process.
Previously, some or all of our CI pipelines had to run several separate install commands, mixing extras with ad-hoc requirements files, and keep those invocations consistent across many workflow files.
Looking at our GitHub Actions workflows now, we've dramatically simplified dependency installation across all our test jobs. For example, the core of our python-tests.yaml is now just:
```yaml
jobs:
  run-tests:
    steps:
      - name: Set up uv and Python ${{ matrix.python-version }}
        uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true
          python-version: ${{ matrix.python-version }}
          cache-dependency-glob: "pyproject.toml"

      - name: Run tests
        run: |
          uv run pytest ${{ matrix.test-type.modules }} \
            --numprocesses auto \
            --maxprocesses 6 \
            --dist worksteal \
            --disable-docker-image-builds \
            --exclude-service kubernetes \
            --exclude-service docker \
            --durations 26
```
All by itself, uv run inspects the project dependencies, installs the dev group by default (pass --no-dev to skip it), and then runs pytest with our flags and the pytest config from pyproject.toml.
We use slight variations of uv run and uv sync for jobs with different requirements.
Suffice it to say that uv just makes everything easier. Perhaps most significantly, our workflow files are now much cleaner, which makes them easier to read and maintain.
Compare the before:
```yaml
- name: Install dependencies
  run: |
    uv pip install ".[dev]"
    uv pip install -r requirements-otel.txt
    uv pip install -r requirements-markdown-tests.txt
```
To the after:
```yaml
- name: Install dependencies
  run: uv sync --group markdown-docs --extra otel
```
💡 Using the top-level uv API allows us to more concisely and consistently install dependencies needed for different CI jobs.
This migration has delivered several concrete improvements: one file as the single source of truth for dependencies, build, and tool configuration; painless editable installs across core and the integrations; and dramatically simpler, cleaner CI workflows.
For teams considering a similar migration, we recommend taking it one piece at a time, in roughly the order covered here: dependencies, the build backend, versioning, tool configuration, and finally CI.
By embracing modern packaging standards and tools, we've not only simplified our configuration but also improved the development experience for our team and contributors.
Have any questions or believe there’s a mistake in this post? Get a hold of us on GitHub!
This has been a library-focused blog post, but check out this great YouTube video by Hynek Schlawack where he explains his app-focused approach to structuring projects with pyproject.toml and uv.