Introducing the dbt Orchestrator: Taking the wheel of your dbt DAG
The dbt Orchestrator is currently in open beta.
Right now, in a warehouse you're paying for, a perfectly healthy dbt model is rebuilding itself. Nothing about it changed. Nobody asked.
Beginning today, the Prefect dbt Orchestrator executes your dbt graph model by model, with state, retries, and cache-awareness attached to every node.
Stop paying for the same SQL twice
Run dbt twice with the same code and it builds the staging layer twice: dbt sees a command and starts from zero, with no memory of what it already built.
By executing nodes individually, we can add state-aware caching. We hash the SQL, the config, and the dependencies for every node. If the hash matches a previous run, Prefect skips the work. For teams at scale, this stops the compounding costs of redundant warehouse compute. One of our customers estimated this would shave 30% off their annual Snowflake bill by refusing to repeat work that is already finished.
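The caching idea above can be sketched in a few lines: derive a cache key from a node's SQL, its config, and the keys of its dependencies, and skip execution when the key was seen before. This is an illustrative sketch, not Prefect's internals — `node_cache_key` and `should_skip` are hypothetical names.

```python
import hashlib
import json

def node_cache_key(sql: str, config: dict, dep_keys: list[str]) -> str:
    # Hash the SQL, the config, and the dependency keys together, so a
    # change anywhere upstream produces a different key for this node.
    payload = json.dumps(
        {"sql": sql, "config": config, "deps": sorted(dep_keys)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

def should_skip(key: str, previous_keys: set[str]) -> bool:
    # If the key matches a previous run, the node's output is already
    # up to date and the warehouse query can be skipped.
    return key in previous_keys
```

Because the key folds in dependency keys, a change to an upstream model invalidates every node downstream of it, while untouched branches keep matching and stay skipped.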
Parallelism without the "Pod Tax"
Parallelizing dbt usually involves a frustrating trade-off: run everything in one process and lose all visibility, or spin up a new Kubernetes pod for every single model.
Call it the Pod Tax: you spend more time waiting for containers to start than running SQL. If a model takes 10 seconds to run but the pod takes 60 seconds to spin up, roughly 85% of your pipeline's wall time is infrastructure overhead.
We use native process pools to parallelize execution within shared environments. You get the speed of concurrency without the scheduling latency or the container churn of pod-per-task architectures.
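The shape of that approach can be sketched with the standard library: run each topological level of the model DAG through a shared process pool, so independent models execute concurrently with no per-model container startup. This is a simplified illustration, not Prefect's implementation — `run_model` and `run_levels` are hypothetical stand-ins.

```python
from concurrent.futures import ProcessPoolExecutor

def run_model(name: str) -> str:
    # Placeholder for executing one dbt node's compiled SQL.
    return f"built {name}"

def run_levels(levels: list[list[str]]) -> list[str]:
    # Each inner list is one topological level: models with no
    # unfinished dependencies, safe to run in parallel.
    results = []
    with ProcessPoolExecutor(max_workers=4) as pool:
        for level in levels:
            # map() fans the level out across worker processes and
            # preserves input order in the results.
            results.extend(pool.map(run_model, level))
    return results
```

The pool is created once and reused across every level, which is the point: the startup cost is paid once per run, not once per model.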
Durable recovery
When every dbt node is a real Prefect task, execution becomes durable. You get orchestration primitives a standard CLI run lacks:
- Retry the node: If a staging table flakes, Prefect retries only that node. The rest of the build keeps moving.
- Smart skipping: If a model fails, Prefect marks only its downstream dependents as skipped. Independent branches of the DAG continue to execute.
- Targeted logs: The Prefect UI surfaces the compiled SQL and logs for the specific model that failed.
from prefect import flow
from prefect_dbt.core.settings import PrefectDbtSettings
from prefect_dbt.core._orchestrator import (
    PrefectDbtOrchestrator,
    ExecutionMode,
)


@flow
def run_dbt_build():
    settings = PrefectDbtSettings(project_dir="./my_dbt_project")
    orchestrator = PrefectDbtOrchestrator(
        settings=settings,
        execution_mode=ExecutionMode.PER_NODE,
        retries=3,
        retry_delay_seconds=60,
    )
    return orchestrator.run_build(select="tag:daily")

Simplify your stack
Most teams end up managing two different systems: Prefect for ingestion and a separate managed service or custom setup just for dbt. That means managing two schedules, two sets of alerts, and two different retry behaviors for a single data pipeline.
Node-level execution means your dbt models use the same concurrency limits, retries, and failure alerts as the rest of your stack. You can manage the entire pipeline in one place, without a second scheduler filling the gaps.
Getting started
The dbt Orchestrator is available today in open beta. To get started, review the documentation and install the correct prefect-dbt package.
If you have suggestions or requests for what you'd like to see, check out our GitHub discussions. Our Community Slack is also a great place to connect with others and get help with your workflows.