Right now, nodes in my DAG proceeds to the next day's task before the rest of the nodes of that DAG finishes. Is there a way for it to wait for the rest of the DAG to finish before moving unto the next day's DAG cycle?
(I do have depends_on_past as true, but that does not work in this case)
My DAG looks like this:
O
l
V
O -> O -> O -> O -> O
Also, tree view pic of the dag]
Might be a bit late for this answer, but I ran into the same issue and the way I resolved it is I added two extra tasks in each dag. "Previous" at the start and "Complete" at the end. Previous task is external task sensor which monitors previous job. Complete is just a dummy operator. Lets say it runs every 30 minutes so the dag would look like this:
dag = DAG(dag_id='TEST_DAG', default_args=default_args, schedule_interval=timedelta(minutes=30))
PREVIOUS = ExternalTaskSensor(
task_id='Previous_Run',
external_dag_id='TEST_DAG',
external_task_id='All_Tasks_Completed',
allowed_states=['success'],
execution_delta=timedelta(minutes=30),
dag=DAG
)
T1 = BashOperator(
task_id='TASK_01',
bash_command='echo "Hello World from Task 1"',
dag=dag
)
COMPLETE = DummyOperator(
task_id='All_Tasks_Completed',
dag=DAG
)
PREVIOUS >> T1 >> COMPLETE
So the next dag, even tho it will come into the queue, it will not let tasks run until PREVIOUS is completed.