Data/Platform/Airflow Runbook


Airflow is our workflow management system for telemetry batch jobs. General usage is covered in the main Airflow project docs. We're currently running Airflow version 1.7.1.3 (the latest version available on PyPI). This document describes the process for resolving issues when things go sideways.



A DAG is running that I don't want to run

If you accidentally start DAG runs for dates that have already been processed or that you're not interested in, the best course is often to mark the task(s) as `Success` from the web UI. To do this, click on the root task and, in the resulting modal dialog, click "Downstream" and then "Mark Success". This turns those task runs green.

Click "Downstream" and then "Mark Success" in the task modal dialog

Marking tasks as `Success` does not stop clusters that are already running, however, so find those clusters on EMR and terminate them.
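
As a rough sketch of that cleanup step, the Python below lists active EMR clusters through a boto3-style client and terminates those whose name contains a given substring. `list_clusters` and `terminate_job_flows` are real boto3 EMR client methods; the helper name, the `dry_run` flag, and the assumption that cluster names contain the DAG or task name are illustrative and should be checked against how our clusters are actually named.

```python
# Sketch only: the helper name and the name-matching convention are assumptions.
ACTIVE_STATES = ["STARTING", "BOOTSTRAPPING", "RUNNING", "WAITING"]


def terminate_matching_clusters(emr_client, name_substring, dry_run=True):
    """Return IDs of active clusters whose name contains name_substring.

    When dry_run is False, also ask EMR to terminate those clusters.
    """
    matching_ids = []
    kwargs = {"ClusterStates": ACTIVE_STATES}
    while True:
        page = emr_client.list_clusters(**kwargs)
        for cluster in page.get("Clusters", []):
            if name_substring in cluster["Name"]:
                matching_ids.append(cluster["Id"])
        marker = page.get("Marker")
        if not marker:
            break
        kwargs["Marker"] = marker  # fetch the next page of results
    if matching_ids and not dry_run:
        emr_client.terminate_job_flows(JobFlowIds=matching_ids)
    return matching_ids
```

With boto3 installed and AWS credentials configured, `emr_client` would be `boto3.client("emr")`. Running with `dry_run=True` first and reading the returned IDs is a cheap guard against terminating someone else's cluster.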

I want to run a backfill

To run a backfill on a whole DAG, the easiest way is to click on the root task, select "Downstream", and click "Clear". Clearing the task instances lets the scheduler run them again.
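
The same clearing step can also be driven from the command line, which is one possible route to re-running a range of days. The sketch below only assembles an `airflow clear` invocation and does not run it. The flags (`-s`, `-e`, `-t`, `-d`) are believed to match the Airflow 1.x CLI and are worth confirming with `airflow clear --help` on our version. The function name, DAG ID, and task regex are hypothetical.

```python
import shlex
from datetime import date


def airflow_clear_command(dag_id, start, end, task_regex=None, downstream=True):
    """Build (but do not run) an `airflow clear` command for a date range."""
    cmd = ["airflow", "clear", "-s", start.isoformat(), "-e", end.isoformat()]
    if task_regex:
        cmd += ["-t", task_regex]  # limit clearing to matching task IDs
    if downstream:
        cmd.append("-d")  # also clear tasks downstream of the matches
    cmd.append(dag_id)
    return cmd


# Hypothetical DAG and task names, for illustration only.
cmd = airflow_clear_command(
    "example_dag", date(2016, 6, 1), date(2016, 6, 7), task_regex="^root_task$"
)
print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the command rather than executing it leaves room to review the date range and task pattern before anything is cleared.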

ToDo: running a backfill on many days

My DAG isn't running on schedule

The most common cause of a DAG not running is that it has `depends_on_past` set to `True` and a past DAG run failed. ToDo: fill in how to fix this.
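
To make the blocking behaviour concrete, here is a deliberately simplified model of `depends_on_past`: a task's run for one date is only eligible once the same task's run for the previous date has succeeded. This is not Airflow's scheduler code; the function name and the state strings are illustrative assumptions. It finds the earliest run that is not successful and the later runs stuck behind it.

```python
from datetime import date


def find_blocker(states):
    """Return (blocking_date, blocked_dates) for one task's runs.

    `states` maps execution date -> state string (or None if not yet run).
    Simplified model: any run that is not "success" blocks all later runs.
    """
    ordered = sorted(states)
    for i, day in enumerate(ordered):
        if states[day] != "success":
            return day, ordered[i + 1:]
    return None, []


# Hypothetical run history for a single task.
history = {
    date(2016, 6, 1): "success",
    date(2016, 6, 2): "failed",
    date(2016, 6, 3): None,
    date(2016, 6, 4): None,
}
blocker, blocked = find_blocker(history)
print(blocker, blocked)
```

In this model the run to investigate is the earliest non-successful one. Clearing it so that it re-runs, or marking it `Success`, as in the sections above, would be the natural way to unblock the later dates, though that should be verified against our deployment.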