Antoine Krajnc

Reputation: 1323

Why use CustomOperator over PythonOperator in Apache Airflow?

While using Apache Airflow, I can't work out why someone would create a CustomOperator instead of using a PythonOperator. Wouldn't running a Python function inside a PythonOperator lead to the same results as a CustomOperator?

If anyone knows the different use cases and best practices, that would be great!

Thanks a lot for your help

Upvotes: 12

Views: 1587

Answers (1)

Victor Kofia

Reputation: 431

Although the two operators are similar, they sit at different abstraction levels, and depending on your use case, one may be a better fit than the other.

Code defined in a CustomOperator can easily be reused by multiple DAGs. If you have a lot of DAGs that need to perform the same task, it may make more sense to expose this code to them via a CustomOperator.
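As a rough illustration of the reuse argument, here is a minimal sketch of a custom operator. It assumes Airflow 2.x, where `BaseOperator` can be imported from `airflow.models`; `GreetOperator` and its `name` parameter are hypothetical names made up for this example. The fallback stub is only there so the snippet can be run without Airflow installed.

```python
try:
    # Assumes Airflow 2.x; the import path may differ in other versions.
    from airflow.models import BaseOperator
except ImportError:
    # Stand-in stub so the sketch runs without Airflow installed.
    class BaseOperator:
        def __init__(self, task_id, **kwargs):
            self.task_id = task_id


class GreetOperator(BaseOperator):
    """Hypothetical reusable operator that builds a greeting for `name`."""

    def __init__(self, name, **kwargs):
        # task_id and other standard arguments are passed on to BaseOperator.
        super().__init__(**kwargs)
        self.name = name

    def execute(self, context):
        # Airflow calls execute() when the task runs.
        message = f"Hello, {self.name}!"
        print(message)
        return message
```

Any DAG file can then import the class and instantiate it, for example `GreetOperator(task_id="greet", name="Airflow")`, so the logic lives in one place instead of being copied into each DAG.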

PythonOperator is very general and is a better fit for one-off, DAG-specific tasks.
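For comparison, a one-off task with PythonOperator might look like the sketch below. It assumes Airflow 2.4 or later (for the `schedule` argument and the `airflow.operators.python` import path); the `greet` function, the DAG id, and the task id are hypothetical. The DAG is only built when Airflow is importable, so the plain function can still be run on its own.

```python
from datetime import datetime


def greet(name):
    """Plain Python function; nothing in it is Airflow-specific."""
    message = f"Hello, {name}!"
    print(message)
    return message


try:
    # Assumes Airflow 2.4+; import paths may differ in other versions.
    from airflow import DAG
    from airflow.operators.python import PythonOperator
except ImportError:
    DAG = None  # Airflow not installed; skip the DAG definition below.

if DAG is not None:
    with DAG(
        dag_id="one_off_example",  # hypothetical DAG id
        start_date=datetime(2024, 1, 1),
        schedule=None,  # trigger manually
        catchup=False,
    ) as dag:
        greet_task = PythonOperator(
            task_id="greet",
            python_callable=greet,
            op_kwargs={"name": "Airflow"},
        )
```

Here the function is defined next to the DAG that uses it, which is convenient for a single pipeline but means other DAGs would have to import or duplicate it.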

At the end of the day, the default operators provided in Airflow are just tools. Which default operator you end up using, or whether it makes sense to create your own custom operator, depends on several factors:

  1. The type of task you are trying to accomplish.
  2. Code organization requirements necessitated by policy or the number of people maintaining the pipeline.
  3. Simple personal taste.

Upvotes: 12
