Auto-tools/Projects/Signal From Noise/Execution2012: Difference between revisions

→‎Execution of Signal from Noise: more paragraphy and rewording
Line 14:
Initially, the expectation was that back-of-the-envelope analysis of existing Talos numbers, together with equally rough engineering implementations of statistics that at least appeared (though not rigorously) less noisy, would satisfy the goals of the Signal from Noise project for the time being, with the hope that effort would continue to go into analysis of performance data. The initial SfN effort was scheduled for a single quarter. This proved rather optimistic. (See also: http://k0s.org/mozilla/blog/20120829151007 .)


In practice, because averaging was split between Talos and graphserver, it was effectively impossible to use the existing system to develop more robust statistics. It quickly became apparent that we needed a system that preserved the raw measurements, so that we could apply different statistical models to the Talos data and compare their fidelity. The Talos test harness itself should not be crunching numbers: its role is measuring and reporting.
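
The loss of information from early averaging can be illustrated with a minimal sketch. The replicate values and the function name below are hypothetical and are not taken from Talos or graphserver; the point is only that two quite different sets of raw measurements can collapse to the same average, after which no other statistic can be recovered.

```python
from statistics import mean, median

# Hypothetical raw replicate timings (ms) for one test on two builds.
run_a = [100.0, 101.0, 99.0, 100.0, 100.0]
run_b = [90.0, 91.0, 89.0, 90.0, 140.0]  # mostly faster, with one outlier


def summarize_in_harness(values):
    """Stand-in for a harness that averages before uploading (assumed behavior)."""
    return mean(values)


# Once averaged, the two runs are indistinguishable.
avg_a = summarize_in_harness(run_a)
avg_b = summarize_in_harness(run_b)

# With the raw values retained, other statistics remain available.
med_a = median(run_a)
med_b = median(run_b)

print(avg_a, avg_b)  # identical averages
print(med_a, med_b)  # medians differ
```

If only `avg_a` and `avg_b` were uploaded, the difference between the two runs could not be reconstructed later, whichever statistic turned out to be more robust.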
 
Work was devoted to creating a graphserver replacement that would make it possible to detect regressions and improvements per push. This is the Datazilla project: https://wiki.mozilla.org/Auto-tools/Projects/Datazilla . The decision to write a new piece of infrastructure rather than refine graphserver was not taken lightly. Any new software brings unknown (but non-incidental) sinks of time, but there was little existing code in graphserver to use as a foundation for the problems we cared about solving.
As of Q2-2012, in order to make deployment more dynamic and configurable and to enable the installation of Talos dependencies, an effort was undertaken, in parallel with though tied to the Signal from Noise project, to get Talos into production on mozharness: http://escapewindow.com/mozharness/
=== Problems We Aimed to Solve with Datazilla ===
* Preserve and capture raw performance numbers. The Talos test framework is a bad place to do statistics: if any averaging is done before the results are uploaded, the ability to retrieve the original data is lost forever. Instead, Datazilla should take in all raw values from Talos and provide a central platform for regression/improvement detection and statistical study.
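
As a rough sketch of what per-push detection over raw values could look like, the following compares the replicates of two pushes with Welch's t statistic. This is an illustration only: the choice of statistic, the threshold, the store layout, the push and test names, and the assumption that larger values are worse are all made up for the example and are not a description of how Datazilla is implemented.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(baseline, candidate):
    """Welch's t statistic for two samples with possibly unequal variances.

    Requires at least two values per sample and a non-zero combined variance.
    """
    n1, n2 = len(baseline), len(candidate)
    se = sqrt(variance(baseline) / n1 + variance(candidate) / n2)
    return (mean(candidate) - mean(baseline)) / se


# Hypothetical store of raw replicates keyed by (push id, test name).
raw_results = {
    ("push-1", "example_test"): [100.0, 102.0, 98.0, 101.0, 99.0],
    ("push-2", "example_test"): [110.0, 112.0, 108.0, 111.0, 109.0],
}


def compare_pushes(store, test, old_push, new_push, threshold=3.0):
    """Classify a push against its predecessor (threshold is an arbitrary assumption)."""
    t = welch_t(store[(old_push, test)], store[(new_push, test)])
    if t > threshold:
        return "regression"  # assumes larger measurements are worse
    if t < -threshold:
        return "improvement"
    return "no change"


print(compare_pushes(raw_results, "example_test", "push-1", "push-2"))
```

Because the store holds raw replicates rather than averages, a different test or threshold could be swapped in later and re-run over the same data.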