
The Prefect Task Library
The Prefect community is a welcoming place for developers of all experience levels, and our task library is an excellent way to get involved with contributing to an open-source project.

Josh Meek
November 8, 2020
The Prefect task library is a constantly growing collection of pre-defined tasks that provide off-the-shelf functionality for working with a wide range of tools, from shell script execution to Kubernetes Job management to sending tweets. The majority of the task library is community supported, which opens the door for users who want to contribute new tasks or expand the functionality of existing ones. Tasks in the task library are typically created with a specific goal in mind, such as creating a Kubernetes Job with CreateNamespacedJob or invoking an AWS Lambda function with LambdaInvoke.
Above is a table showcasing some of the tasks that users have found useful and contributed to the task library for interfacing with various tools and services. For a full list of tasks in the library and more information on how to use them, visit the API reference documentation for the prefect.tasks module.
Contributing to Prefect’s task library is a safe way for new users to learn the ins and outs of contributing to an open source project, as well as a great way to assist in open source development! Developers of all skill levels are welcome to contribute to Prefect’s task library, and we are more than happy to guide users through the process. The Prefect community is designed to give developers opportunities to collaborate as they discuss, implement, and maintain the growing list of Prefect integrations. Whether it’s in the Prefect Community Slack or directly on the GitHub repo, all community discussions happen in the open, visible to all.
There are a few key reasons why users contribute tasks to the task library:
- Gain experience contributing to an open source project
- Increase adoption for libraries, tools, and frameworks by making an easy route for users of Prefect to interact with them
- Allow tasks to evolve with Prefect: as paradigms and abstractions change in Prefect, the tasks in the open source library will change with them
- Open up collaboration to thousands of other developers who could use your task (they might fix bugs in the task you weren’t aware of!)
Not to mention that we occasionally send Prefect swag to some of our open source contributors!
Task Library in Action
Just like any other Prefect task, tasks in the task library can be used by importing, initializing and adding them to your flow. The lifecycle of a task can be confusing for users who are not used to deferred computation, so for the sake of clarity let’s review the steps involved in getting a task into a Prefect flow and running it:
- Define: The first and most important step in a task’s lifecycle is defining what it does! This is also the step that is most critical to contributors of a new task, and the one we will focus on for the rest of this post.
- Initialize: Users of your task will first need to initialize, or instantiate, the task definition into a Task instance. This is a common place to specify static configuration of your task, such as the task name and default values. Note that all information provided at initialization must be known prior to running your flow.
- Bind: Prefect tasks are most interesting when considered in relation to other tasks, and these relationships are managed and tracked by a Prefect flow. There are two ways to bind a task to a flow: by “calling” the task (see examples below), or by using Prefect’s imperative API and explicitly adding the task to your flow object with its associated dependencies.
- Run: The goal of all of this is ultimately to run the task within the context of a flow. This is always handled for you when you call flow.run or when a flow run is triggered via a Prefect backend, taking all triggers and state handler logic into account.
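To make the four steps above concrete, here is a minimal, standard-library-only sketch of deferred computation. It is a toy model under stated assumptions, not Prefect's implementation: ToyTask, ToyFlow, GetValue and AddOne are hypothetical names, and the toy flow simply runs tasks in the order they were bound, so upstream tasks must be bound first.

```python
# Toy model of deferred computation. This is NOT Prefect's implementation;
# ToyTask, ToyFlow, GetValue and AddOne are hypothetical names.


class ToyTask:
    # Step 1 (define): subclasses describe behaviour by overriding run().
    def __init__(self, name):
        # Step 2 (initialize): static configuration, known before any run.
        self.name = name

    def run(self, **inputs):
        raise NotImplementedError


class ToyFlow:
    def __init__(self):
        self.order = []     # tasks in the order they were bound
        self.upstream = {}  # task -> {keyword: upstream task}

    def bind(self, task, **upstream):
        # Step 3 (bind): only record the dependency; nothing executes yet.
        self.order.append(task)
        self.upstream[task] = upstream
        return task

    def run(self):
        # Step 4 (run): execute tasks, feeding upstream results downstream.
        results = {}
        for task in self.order:
            inputs = {
                key: results[dep] for key, dep in self.upstream[task].items()
            }
            results[task] = task.run(**inputs)
        return results


class GetValue(ToyTask):
    def run(self):
        return 100


class AddOne(ToyTask):
    def run(self, x):
        return x + 1


get_value = GetValue(name="get_value")
add_one = AddOne(name="add_one")

flow = ToyFlow()
flow.bind(get_value)
flow.bind(add_one, x=get_value)  # binding records an edge, not a value

results = flow.run()
print(results[add_one])  # 101
```

Because binding only records that add_one depends on get_value, the value 100 does not exist until run is called. That is why anything supplied at initialization has to be known ahead of time.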
The popular @task decorator handles the first two steps (Define and Initialize) simultaneously: the function you decorate defines your task’s runtime logic, and all keywords passed into the decorator are used when initializing the task!

    from prefect import task, Flow
    from prefect.tasks.shell import ShellTask

    # Initialize a task from the task library
    ls_task = ShellTask(command="ls", return_all=True)

    # Define and initialize a custom task in one step
    @task
    def show_output(std_out):
        print(std_out)

    # Bind both tasks to a flow by calling them
    with Flow("list_files") as flow:
        ls = ls_task()
        show_output(ls)
Most keyword arguments for tasks imported from the task library can either be set at initialization for reuse purposes or optionally set and overwritten when defining the flow.

    from prefect import task, Flow
    from prefect.tasks.shell import ShellTask

    # Will only return the listed files
    ls_task = ShellTask(command="ls", return_all=True)

    @task
    def show_output(std_out):
        print(std_out)

    with Flow("count_files") as flow:
        ls = ls_task()
        show_output(ls)

        # Override command to count listed files
        ls_count = ls_task(command="ls | wc -l")
        show_output(ls_count)
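One way to picture the override behaviour shown above is a small decorator that fills in any run argument left as None from the attribute of the same name. The sketch below is an independent, standard-library-only illustration, not Prefect code: fallback_to_attrs and EchoTask are hypothetical names, and for simplicity the wrapper accepts keyword arguments only. Prefect's own utilities include a decorator in this spirit (defaults_from_attrs in prefect.utilities.tasks), but nothing here should be read as its implementation.

```python
import functools


def fallback_to_attrs(*names):
    """Hypothetical helper: fill run() keywords left as None from attributes."""

    def decorator(run):
        @functools.wraps(run)
        def wrapper(self, **kwargs):
            for name in names:
                # A value passed at call time wins; otherwise use the
                # default that was stored on the instance at initialization.
                if kwargs.get(name) is None:
                    kwargs[name] = getattr(self, name)
            return run(self, **kwargs)

        return wrapper

    return decorator


class EchoTask:
    """Hypothetical stand-in for a library task; it only formats a string."""

    def __init__(self, command=None):
        self.command = command  # default fixed at initialization

    @fallback_to_attrs("command")
    def run(self, command=None):
        return f"would run: {command}"


ls_task = EchoTask(command="ls")
print(ls_task.run())                      # would run: ls
print(ls_task.run(command="ls | wc -l"))  # would run: ls | wc -l
```

Checking for None rather than truthiness means falsy but valid overrides, such as 0 or an empty string, are still respected.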
Tasks in Prefect take a subclass approach that allows users to provide a configurable task “template”, meaning that default values can be set at initialization and optionally overridden at runtime. Take the following task as an example:
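
    from prefect import Task, Flow

    class MyTask(Task):
        def __init__(self, val=None, **kwargs):
            self.val = val  # default stored at initialization
            super().__init__(**kwargs)

        def run(self, val=None):
            # Prefer the value passed at call time; fall back to the default
            print(val if val is not None else self.val)

    my_task = MyTask(val=42)

    with Flow("task-with-default") as flow:
        t1 = my_task()         # uses the default, 42
        t2 = my_task(val=100)  # overrides the default with 100
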
An instance of MyTask is initialized before the definition of the flow, and within the flow context that task is copied twice to create two tasks. The first task uses the default value of 42 and the second task overrides the value with 100. This pattern was chosen in order to avoid having to re-initialize the task every time it is needed in the flow. The snippet above is effectively equivalent to the following:

    from prefect import Task, Flow

    class MyTask(Task):
        def __init__(self, val=None, **kwargs):
            self.val = val  # default stored at initialization
            super().__init__(**kwargs)

        def run(self, val=None):
            # Prefer the value passed at call time; fall back to the default
            print(val if val is not None else self.val)

    with Flow("task-with-default") as flow:
        t1 = MyTask(val=42)()
        t2 = MyTask()(val=100)

Notice above that MyTask is instantiated and called two times inside the definition of the flow. The first set of parentheses initializes the task and the second passes run information to the task. Sometimes users will attempt to pass values from upstream tasks to a downstream task’s initialization function instead of the call to run. That is not possible because the results of upstream tasks are not available until those tasks actually run; they must instead be passed in the call to run:

    @task
    def get_value_1():
        return 100