Airflow DAGs and PYTHONPATH

sebastian · Jun 6, 2018 · Viewed 7.5k times

I have some DAGs that can't seem to locate Python modules. In the Airflow UI, I see many variations of this message:

Broken DAG: [/home/airflow/source/airflow/dags/test.py] No module named 'paramiko'

Inside a DAG file I can modify Python's sys.path directly, and that seems to mitigate my issue:

import sys
sys.path.append('/home/airflow/.local/lib/python2.7/site-packages')

That doesn't feel right, though, having to set the path directly in my code. I've tried exporting PYTHONPATH in the Airflow user account's .bashrc, but it doesn't seem to be read when the DAG jobs are executed. What's the correct way to go about this?
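One way to narrow this down is to look at the environment from inside the process that parses the DAG, rather than from a login shell. Services started by systemd do not run an interactive shell, so .bashrc is never sourced for them, which would explain the export having no effect. The following is only a diagnostic sketch, assuming Python 3 and the standard library; `describe_import_env` is a hypothetical helper name, not something Airflow provides.

```python
import importlib.util
import os
import sys


def describe_import_env(module_name):
    """Collect the facts that decide whether a top-level module can be imported."""
    # find_spec returns None when a top-level module is not on sys.path
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,                 # interpreter actually running
        "pythonpath": os.environ.get("PYTHONPATH"),   # None if the variable is unset
        "sys_path": list(sys.path),                   # directories searched for imports
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),      # file the module would load from
    }


if __name__ == "__main__":
    for key, value in describe_import_env("paramiko").items():
        print(key, "=", value)
```

Calling the helper temporarily from a DAG file and logging the result shows which interpreter is in use and whether PYTHONPATH reached the process at all.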

Thanks.

----- update -----

Thanks for the responses.

Below are my systemd unit files.

::::::::::::::
airflow-scheduler-airflow2.service
::::::::::::::
[Unit]
Description=Airflow scheduler daemon

[Service]
EnvironmentFile=/usr/local/airflow/instances/airflow2/etc/envars
User=airflow2
Group=airflow2
Type=simple
ExecStart=/usr/local/airflow/instances/airflow2/venv/bin/airflow scheduler
Restart=always
RestartSec=5s

[Install]
WantedBy=multi-user.target
::::::::::::::
airflow-webserver-airflow2.service
::::::::::::::
[Unit]
Description=Airflow webserver daemon

[Service]
EnvironmentFile=/usr/local/airflow/instances/airflow2/etc/envars
User=airflow2
Group=airflow2
Type=simple
ExecStart=/usr/local/airflow/instances/airflow2/venv/bin/airflow webserver
Restart=always
RestartSec=5s

[Install]
WantedBy=multi-user.target

These are the contents of the EnvironmentFile used above:

more /usr/local/airflow/instances/airflow2/etc/envars
PATH=/usr/local/airflow/instances/airflow2/venv/bin:/usr/local/bin:/usr/bin:/bin
AIRFLOW_HOME=/usr/local/airflow/instances/airflow2/home
AIRFLOW_CONFIG=/usr/local/airflow/instances/airflow2/etc/airflow.cfg

Answer

Andrey · Jun 4, 2019

I had a similar issue:

  1. Python wasn't loaded from the virtualenv when running Airflow (fixing this resolved Airflow dependencies not being picked up from the virtualenv)
  2. Submodules under the dags path weren't loaded because of a different base path (fixing this resolved importing my own modules under the dags folder)

I added the following lines to the environment file for the systemd service (/usr/local/airflow/instances/airflow2/etc/envars in your case):

source /home/ubuntu/venv/airflow/bin/activate
PYTHONPATH=/home/ubuntu/venv/airflow/dags:$PYTHONPATH
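A caveat worth checking before copying these lines: as far as I understand systemd's EnvironmentFile format, the file holds plain KEY=VALUE assignments, lines without an "=" are ignored, and "$" is not expanded the way a shell would expand it. Under that assumption, the `source` line would have no effect and `$PYTHONPATH` would end up as a literal path entry. The sketch below is a small checker for such a file; `check_env_file_lines` is a hypothetical helper written for illustration, and it only approximates the format rules described here.

```python
def check_env_file_lines(lines):
    """Flag lines that a plain KEY=VALUE environment file would not treat as intended."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank lines and comments are fine
        if "=" not in line:
            # shell commands such as `source ...` are not assignments
            problems.append((number, "not an assignment", line))
            continue
        value = line.partition("=")[2]
        if "$" in value:
            # assumed behaviour: the text after '$' stays literal
            problems.append((number, "'$' is kept literally, not expanded", line))
    return problems


if __name__ == "__main__":
    sample = [
        "source /home/ubuntu/venv/airflow/bin/activate",
        "PYTHONPATH=/home/ubuntu/venv/airflow/dags:$PYTHONPATH",
    ]
    for number, reason, text in check_env_file_lines(sample):
        print(number, reason, text)
```

Run against the two lines above, the checker flags both. If the assumption holds, writing the full value out (PYTHONPATH set to the dags directory with no `$PYTHONPATH` suffix) avoids the problem. Activation is also likely unnecessary when ExecStart already points at the virtualenv's own airflow script, as it does in the question, because scripts installed into a virtualenv normally start with a shebang for that virtualenv's interpreter.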