Auto-tools/Projects/Signal From Noise/Execution2012

== Execution of Signal from Noise ==
It was initially expected that some back-of-the-envelope analysis of existing Talos numbers, together with similarly rough engineering implementations of statistics that at least appeared less noisy (though would probably be non-rigorous), would satisfy the goals of the Signal from Noise project for the time being, with ongoing effort optimistically invested in analysis of performance data.  The initial SfN effort was scheduled for a single quarter.  This proved rather optimistic. (See also: http://k0s.org/mozilla/blog/20120829151007 .)
In practice, because of the way averaging was split between Talos and graphserver, it was effectively impossible to use the existing system to develop more robust statistics.  It quickly became apparent that we needed a system that preserved the raw measurements, so that we could apply different statistical models to the Talos data and compare their fidelity.  The Talos test harness itself should not be crunching numbers: it should limit its role to taking measurements and reporting them.
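For illustration, the kind of comparison this enables only works when the raw replicate values survive the pipeline. The sketch below is not Talos or Datazilla code; the replicate values, the trimming choice, and the statistics compared are all made up for the example.

<pre>
# Minimal sketch (not Talos/Datazilla code): given the raw replicate values
# for one page, compare how different summary statistics behave. This kind of
# comparison is only possible when the raw measurements are preserved rather
# than pre-averaged by the harness.
import statistics

def trimmed_mean(values, trim=1):
    """Mean after dropping the `trim` highest and lowest replicates."""
    ordered = sorted(values)
    if len(ordered) > 2 * trim:
        ordered = ordered[trim:len(ordered) - trim]
    return statistics.mean(ordered)

# Hypothetical raw replicates for one page load (milliseconds),
# including a single noisy outlier.
replicates = [212.0, 208.5, 210.1, 209.7, 305.4, 211.3]

print("mean        :", statistics.mean(replicates))
print("median      :", statistics.median(replicates))
print("trimmed mean:", trimmed_mean(replicates))
</pre>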
Work was therefore devoted to creating a graphserver replacement that would make it possible to perform regression and improvement detection per push.  This is the Datazilla project: https://wiki.mozilla.org/Auto-tools/Projects/Datazilla . The decision to write a new piece of infrastructure rather than refine graphserver was not taken lightly. While any new software carries unknown (but non-incidental) sinks of time, there was little existing code in graphserver to use as a foundation for the problems we cared about solving.
* Datazilla should be scalable enough to accumulate data per push and to generate a "regression/improvement" analysis for that push in real time (a rough sketch of one such per-push check follows this list).
* The system should also provide a UI atop its own REST interfaces so that an interested developer can start from TBPL and drill into the results of a push. The developer should be able to drill all the way down to the raw replicate values for a page (i.e. each page is loaded some number of times, and you should be able to drill down to that level if you want to).
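As a rough illustration of the first point above, the sketch below shows one simple way a per-push regression/improvement check could work: compare a push's raw replicates against a baseline window using Welch's t-statistic. This is not Datazilla's actual algorithm; the sample values and the threshold are invented for the example.

<pre>
# Illustrative only: a per-push check that compares a push's raw replicates
# against a baseline window using Welch's t-statistic. This is not the
# analysis Datazilla actually implements.
import math
import statistics

def welch_t(baseline, push):
    """Welch's t-statistic for two independent samples."""
    mean_b, mean_p = statistics.mean(baseline), statistics.mean(push)
    var_b, var_p = statistics.variance(baseline), statistics.variance(push)
    se = math.sqrt(var_b / len(baseline) + var_p / len(push))
    return (mean_p - mean_b) / se

# Hypothetical page-load replicates (milliseconds).
baseline = [210.2, 208.9, 211.4, 209.8, 210.6, 209.1]
push = [218.7, 220.1, 217.9, 219.4, 221.0, 218.2]

t = welch_t(baseline, push)
THRESHOLD = 3.0  # chosen for illustration only
if t > THRESHOLD:
    print("possible regression (t = %.2f)" % t)
elif t < -THRESHOLD:
    print("possible improvement (t = %.2f)" % t)
else:
    print("no significant change (t = %.2f)" % t)
</pre>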
== Metrics ==
The Mozilla Metrics team ( https://wiki.mozilla.org/Metrics ) worked as part of Signal from Noise to audit our performance statistical methodology and to help develop better models. Metrics looked at: