From MozillaWiki


The cloud services data pipeline ingests data for analysis, monitoring, and reporting. It currently processes desktop and device Telemetry data as well as cloud services server logs. The ingestion pipeline is one component of the Fx Data Platform.

Pipeline specs/docs

Data sets and other documentation


V2 Pipeline

Mozilla Services Data Pipeline
Generic Lua sandbox for dynamic data analysis
JSON Schema specifications of pipeline data
Monitoring data quality issues for metrics pipeline
Data collection and processing made easy
HTTP Data Pipeline Ingestion
Data collection and processing made light weight, fast, and more reliable


Slides / notebooks for Telemetry Onboarding
Code for among other things
A dashboard to track the deployment of Firefox Telemetry Experiments
A Scala framework to build derived datasets, aka batch views, of Telemetry data.
Automatic alert system for telemetry histograms
AWS bootstrap scripts for Mozilla's flavoured Spark setup.
Crash Rate Aggregation code
Plugin to create, list, and load GitHub Gists from Jupyter notebooks
Jupyter Notebook extension for Apache Spark integration
Aggregator job for Spark bindings for Mozilla Telemetry
Eventual home of the revamped a.t.m.o (per Bug 1248688)
Scheduling / workflow management for Telemetry jobs
Data analysis relating to Electrolysis / E10s
Utility code to work with Mozilla Telemetry data