Note
Apache Airflow job is powered by Apache Airflow.
A Python package lets you organize related Python modules into a single directory hierarchy. A package is typically represented as a directory that contains a special file called `__init__.py`. Inside a package directory, you can have multiple Python module files (.py files) that define functions, classes, and variables. With Apache Airflow Jobs, you can develop your own private packages to add custom Apache Airflow operators, hooks, sensors, plugins, and more.
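To make the directory layout concrete, here is a minimal sketch that builds a tiny package in a temporary folder and imports a module from it. The names `demo_pkg`, `greetings`, and `greet` are hypothetical and chosen only for illustration; they are not part of Apache Airflow or this tutorial's package.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create a tiny package named demo_pkg (hypothetical name) under root."""
    pkg_dir = root / "demo_pkg"
    pkg_dir.mkdir()
    # The __init__.py file marks the directory as a regular package.
    (pkg_dir / "__init__.py").write_text("")
    # A module inside the package that defines a single function.
    (pkg_dir / "greetings.py").write_text(
        "def greet(name):\n"
        "    return f'Hello {name}'\n"
    )


def import_and_greet(name: str) -> str:
    """Build the package in a temporary folder, import it, and call greet()."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)  # make the temporary folder importable
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("demo_pkg.greetings")
            return module.greet(name)
        finally:
            # Undo the path change and drop the cached modules.
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg.greetings", None)
            sys.modules.pop("demo_pkg", None)


if __name__ == "__main__":
    print(import_and_greet("foo_bar"))
```

The same `package.module` import pattern is what a DAG file relies on when it imports an operator from a private package.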
In this tutorial, you'll build a simple custom operator as a Python package, add it as a requirement in your Apache Airflow job, and import your private package as a module in your DAG file.
Develop a custom operator and test it with an Apache Airflow DAG
1. Create a file called `sample_operator.py` and turn it into a private package. If you need help, check out this guide: Creating a package in python

```python
from airflow.models.baseoperator import BaseOperator


class SampleOperator(BaseOperator):
    def __init__(self, name: str, **kwargs) -> None:
        super().__init__(**kwargs)
        self.name = name

    def execute(self, context):
        # The return value is pushed to XCom by default.
        message = f"Hello {self.name}"
        return message
```

2. Next, create an Apache Airflow DAG file called `sample_dag.py` to test the operator you made in the first step.

```python
from datetime import datetime

from airflow import DAG

# Import from private package
from airflow_operator.sample_operator import SampleOperator

with DAG(
    "test-custom-package",
    tags=["example"],
    description="A simple tutorial DAG",
    schedule_interval=None,
    start_date=datetime(2021, 1, 1),
) as dag:
    task = SampleOperator(task_id="sample-task", name="foo_bar")

    task
```

3. Set up a GitHub repository with your `sample_dag.py` file in the `Dags` folder, along with your private package file. You can use formats like `.zip`, `.whl`, or `.tar.gz`. Put the file in either the `Dags` or `Plugins` folder, whichever fits best. Connect your Git repository to your Apache Airflow job, or try the ready-made example at Install-Private-Package.
Add your package as a requirement
Add the package under Airflow requirements using the format /opt/airflow/git/<repoName>/<pathToPrivatePackage>
For example, if your private package sits at /dags/test/private.whl in your GitHub repo, just add /opt/airflow/git/<repoName>/dags/test/private.whl to your Airflow environment.
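The path format above can be sketched as a small helper that joins the fixed `/opt/airflow/git` prefix, the repository name, and the package's path inside the repository. The function name `requirement_path` and the repository name `my-repo` are hypothetical and used only for illustration; the helper simply builds the string you would enter under Airflow requirements.

```python
import posixpath

# Prefix taken from the format described above.
AIRFLOW_GIT_ROOT = "/opt/airflow/git"


def requirement_path(repo_name: str, path_in_repo: str) -> str:
    """Build the requirement entry for a private package stored in a Git repo."""
    # Strip the leading slash so the repo-relative path is appended to the
    # prefix instead of replacing it.
    relative = path_in_repo.lstrip("/")
    return posixpath.join(AIRFLOW_GIT_ROOT, repo_name, relative)


if __name__ == "__main__":
    # "my-repo" is a placeholder for <repoName>.
    print(requirement_path("my-repo", "/dags/test/private.whl"))
```

With the placeholder repository name, the helper produces `/opt/airflow/git/my-repo/dags/test/private.whl`, which matches the example above once `<repoName>` is filled in.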