Airflow scheduler is slow to schedule subsequent tasks

Prasann · Nov 23, 2017 · Viewed 13.3k times

When I run a DAG in Airflow 1.8.0, there is a long delay between the completion of a predecessor task and the moment the successor task is picked up for execution (the delay is usually greater than the execution times of the individual tasks). The same happens with the Sequential, Local and Celery executors. Is there a way to reduce this overhead, for example with parameters in airflow.cfg that speed up DAG execution? A Gantt chart has been added for reference.

Answer

Marcos Bernardelli · Jan 30, 2018

As Nick said, Airflow is not a real-time tool. Tasks are scheduled and executed ASAP, but the next Task will never run immediately after the last one.

When you have more than ~100 DAGs with ~3 Tasks each, or DAGs with many Tasks (~100 or more), you have to consider 3 things:

  1. Increase the number of threads that the DagFileProcessorManager will use to load and execute the Dags (airflow.cfg):

[scheduler]

max_threads = 2

The max_threads setting controls how many DAGs are picked up and executed/terminated.

Increasing this value may reduce the time between Tasks.
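To check which value a given airflow.cfg actually sets, something like the following minimal sketch could work. The sample config text and the helper name read_max_threads are made up for illustration, and the fallback of 2 is simply the value shown above, not a verified default.

```python
import configparser

# Hypothetical airflow.cfg excerpt; only the [scheduler] section from the answer is shown.
SAMPLE_CFG = """
[scheduler]
max_threads = 2
"""


def read_max_threads(cfg_text, default=2):
    """Return max_threads from airflow.cfg-style text, or `default` if it is absent."""
    # interpolation=None so that '%' characters elsewhere in the file are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return parser.getint("scheduler", "max_threads", fallback=default)


current = read_max_threads(SAMPLE_CFG)
print("max_threads =", current)
```

For a real installation you would pass the contents of your own airflow.cfg instead of SAMPLE_CFG.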

  2. Monitor your Airflow Database to see if it has any bottlenecks. The Airflow database is used to manage and execute processes:

Recently we were suffering from the same problem. The time between Tasks was ~10-15 minutes, and we were using PostgreSQL on AWS.

The instance was barely using its resources (~20 IOPS, 20% of the memory and ~10% of CPU), yet Airflow was very slow.

After looking at the database performance using PgHero, we discovered that even a query using an index on a small table was taking more than one second.

So we increased the Database size, and Airflow is now running as fast as a rocket. :)
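If you do not have a tool like PgHero at hand, a rough check is to time a simple query yourself. The sketch below is only an illustration: it uses an in-memory SQLite table as a stand-in (not Airflow's real schema), the helper name time_query is made up, and the one-second threshold comes from the observation above. Against PostgreSQL you would pass a connection from your own driver, and the parameter placeholder style would differ (for example %s instead of ?).

```python
import sqlite3
import time


def time_query(conn, sql, params=()):
    """Run one query on a DB-API connection and return (rows, elapsed_seconds)."""
    cursor = conn.cursor()
    start = time.perf_counter()
    cursor.execute(sql, params)
    rows = cursor.fetchall()
    elapsed = time.perf_counter() - start
    cursor.close()
    return rows, elapsed


# Stand-in database: an in-memory SQLite table, NOT the Airflow metadata schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_state (id INTEGER PRIMARY KEY, state TEXT)")
conn.executemany(
    "INSERT INTO task_state (state) VALUES (?)",
    [("queued",), ("running",), ("success",)],
)

# Primary-key lookup on a tiny table, similar in spirit to the slow query described above.
rows, elapsed = time_query(conn, "SELECT id, state FROM task_state WHERE id = ?", (2,))

SLOW_THRESHOLD_SECONDS = 1.0  # assumption: "more than one second" counts as slow
print(f"{len(rows)} row(s) in {elapsed:.6f}s, slow={elapsed > SLOW_THRESHOLD_SECONDS}")
```

If an indexed lookup on a small table comes anywhere near the threshold, the database itself is a likely bottleneck.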

  3. To get the time Airflow is spending loading DAGs, run the command:

airflow list_dags -r

DagBag parsing time: 7.9497220000000075

If the DagBag parsing time is higher than ~5 minutes, it could be an issue.
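To keep an eye on this number over time, you could pull it out of the command output with a few lines of Python. This is a minimal sketch: the helper name parse_dagbag_time is made up, the sample line is the output shown above, and it assumes the reported value is in seconds.

```python
import re


def parse_dagbag_time(report_text):
    """Return the 'DagBag parsing time' value as a float, or None if the line is missing."""
    match = re.search(r"DagBag parsing time:\s*([0-9]+(?:\.[0-9]+)?)", report_text)
    return float(match.group(1)) if match else None


# Sample line copied from the output shown above.
sample_output = "DagBag parsing time: 7.9497220000000075"
seconds = parse_dagbag_time(sample_output)

# ~5 minutes, the rough limit mentioned above (assuming the value is in seconds).
THRESHOLD_SECONDS = 5 * 60
is_slow = seconds is not None and seconds > THRESHOLD_SECONDS
print("parsing time:", seconds, "slow:", is_slow)
```

In practice you would feed it the captured output of the command instead of sample_output.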

All of this helped us to run Airflow faster. I really advise you to upgrade to version 1.9, as many performance issues were fixed in that version.

BTW, we are using the Airflow master in production, with LocalExecutor and PostgreSQL as the metadata database.