Data/Platform/Airflow Runbook

From MozillaWiki

Revision as of 20:09, 19 January 2017

Airflow (https://github.com/mozilla/telemetry-airflow) is our workflow management system for telemetry batch jobs. The main project docs are at https://airflow.incubator.apache.org/. This document describes how to resolve issues when things go sideways.



A DAG is running that I don't want to run

If you accidentally start DAG runs for dates that are already processed or that you're not interested in, the best course is often to mark the task(s) as `Success` from the web UI. To do this, click on the root task and, in the resulting modal dialog, click "Downstream" and then "Mark Success" to turn those task runs green.

Click "Downstream" and then "Mark Success" in the task modal dialog

Marking tasks as successful does not stop clusters that are already running, however, so find those clusters on EMR and terminate them.
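Finding the leftover clusters can be scripted. The following is a minimal sketch, assuming `boto3` is installed, AWS credentials are configured, and the clusters started by a DAG share a recognisable name prefix. The prefix `my_dag_name` and the helper names are hypothetical; check how your clusters are actually named before relying on it.

```python
# EMR cluster states that mean "not yet terminated or terminating".
ACTIVE_STATES = ["STARTING", "BOOTSTRAPPING", "RUNNING", "WAITING"]


def find_active_clusters(emr, name_prefix):
    """Return (cluster_id, name) pairs for active clusters whose name starts with name_prefix.

    `emr` is a boto3 EMR client (or any object with the same list_clusters method).
    """
    matches = []
    kwargs = {"ClusterStates": ACTIVE_STATES}
    while True:
        page = emr.list_clusters(**kwargs)
        for cluster in page.get("Clusters", []):
            if cluster["Name"].startswith(name_prefix):
                matches.append((cluster["Id"], cluster["Name"]))
        # list_clusters is paginated; a Marker means there are more results.
        marker = page.get("Marker")
        if not marker:
            return matches
        kwargs["Marker"] = marker


def terminate_clusters(emr, cluster_ids):
    """Terminate the given clusters; does nothing for an empty list."""
    cluster_ids = list(cluster_ids)
    if cluster_ids:
        emr.terminate_job_flows(JobFlowIds=cluster_ids)


def main():
    import boto3  # imported here so the helpers above work without it

    emr = boto3.client("emr")
    clusters = find_active_clusters(emr, "my_dag_name")  # hypothetical prefix
    for cluster_id, name in clusters:
        print(cluster_id, name)
    # Review the printed list first, then terminate explicitly:
    # terminate_clusters(emr, [cluster_id for cluster_id, _ in clusters])
```

The terminate call is left commented out in `main()` on purpose: listing first and killing second makes it harder to take down a cluster that belongs to a run you do want.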

I want to run a backfill

To run a backfill on a whole DAG, the easiest way is to click on the root task, select "Downstream" and click on "Clear". Clearing resets the state of those task instances, so the scheduler runs them again.
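The same clear can probably be done from a shell on the Airflow host with the `airflow clear` command. The sketch below only builds and runs that command; it assumes an Airflow 1.x CLI where `-t` takes a task regex, `-s`/`-e` take start and end dates, and `-d` includes downstream tasks. The DAG id, task id, and dates are placeholders, and the helper names are hypothetical.

```python
import subprocess


def build_clear_command(dag_id, root_task, start_date, end_date):
    """Build the argv for clearing root_task and everything downstream of it."""
    return [
        "airflow", "clear",
        "-t", "^{}$".format(root_task),  # regex matching only the root task
        "-s", start_date,                # first execution date to clear
        "-e", end_date,                  # last execution date to clear
        "-d",                            # also clear downstream tasks
        dag_id,
    ]


def clear_downstream(dag_id, root_task, start_date, end_date):
    """Run the command; Airflow asks for confirmation before clearing."""
    subprocess.check_call(build_clear_command(dag_id, root_task, start_date, end_date))
```

For example, `build_clear_command("example_dag", "root", "2017-01-01", "2017-01-07")` produces the arguments for clearing one week of runs. Verify the flags with `airflow clear -h` on your installation before using it.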

ToDo: running a backfill on many days

My DAG isn't running on schedule

The most common cause is a DAG that has `depends_on_past` set to `True` and a failure in a past DAG run: each task waits for its own previous run to succeed, so a single failure blocks every later run. ToDo: fill in how to fix this.
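To see which past run is holding things up, look for the earliest run that did not succeed. The sketch below is a simplified model of that check, not Airflow's own logic (Airflow evaluates `depends_on_past` per task instance). The input list of `(execution_date, state)` pairs is hypothetical, for example copied by hand from the web UI.

```python
def first_blocking_run(runs):
    """Return the earliest execution date whose state is not "success", or None.

    `runs` is an iterable of (execution_date, state) pairs, with dates as
    ISO strings (YYYY-MM-DD) so that sorting them sorts chronologically.
    """
    for execution_date, state in sorted(runs):
        if state != "success":
            return execution_date
    return None


runs = [
    ("2017-01-17", "failed"),
    ("2017-01-16", "success"),
    ("2017-01-18", "scheduled"),
]
blocking = first_blocking_run(runs)
```

Here `blocking` is the failed run on 2017-01-17; the run on 2017-01-18 cannot start until that one is dealt with.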