We do filtering on the replicates, mainly because the first few replicates are not a representative sample of the remaining replicates we collect. The one exception would be [https://wiki.mozilla.org/Buildbot/Talos/Tests#Internal_Benchmarks internal benchmarks] (generally suites which measure something other than time). For benchmarks, there is usually a special formula applied to the replicates.
== Subtest Filters ==
We have a variety of [http://hg.mozilla.org/build/talos/file/tip/talos/filter.py filters] defined for Talos. I will explain what each filter is, and you can see the exact settings used for each filter by looking at the individual [https://wiki.mozilla.org/index.php?title=Buildbot/Talos/Tests tests].
=== ignore_first ===
This filter ignores the first 'X' replicates, allowing us to discard warm-up runs.
* input: an array of subtest replicates
* returns: an array of replicates
* source: [http://hg.mozilla.org/build/talos/file/tip/talos/filter.py#l127 filter.py]
* used in most tests with X=1, 2, or 5 (5 is the normal case)
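A minimal sketch of the idea in Python (the real implementation is in filter.py; the guard for short series below is an assumption, not taken from the source):

```python
def ignore_first(replicates, num=5):
    """Drop the first `num` replicates so warm-up runs don't skew the summary.

    Sketch only; see talos/filter.py for the real filter. Returning the
    series unchanged when it is too short is an assumed guard.
    """
    if len(replicates) <= num:
        return replicates
    return replicates[num:]
```

For example, with X=5 a warm-up-heavy series such as <code>[900, 450, 420, 410, 405, 400, 401]</code> is trimmed to its last two, steady-state values.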
=== median ===
This filter takes in an array of replicates and returns the median of the replicates (a single value).
* input: an array of subtest replicates
* returns: a single value
* source: [http://hg.mozilla.org/build/talos/file/tip/talos/filter.py#l58 filter.py]
* used in most tests
=== mean ===
This filter takes in an array of replicates and returns the mean value of the replicates (a single value).
* input: an array of subtest replicates
* returns: a single value
* source: [http://hg.mozilla.org/build/talos/file/tip/talos/filter.py#l50 filter.py]
* used in kraken for subtests
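Both of these are the standard statistics; a sketch of the computations (function names here are illustrative, not necessarily the exact talos code):

```python
def mean(values):
    # arithmetic mean of the replicates
    return sum(values) / float(len(values))

def median(values):
    # middle value of the sorted replicates; average of the two
    # middle values when the count is even
    s = sorted(values)
    mid = len(s) // 2
    if len(s) % 2:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2.0
```

The median is preferred for most tests because a single outlier replicate shifts it far less than it shifts the mean.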
=== dromaeo ===
This filter is specific to dromaeo and respects the structure of the replicates: every 5 consecutive replicates measure a different metric.
* input: an array of dromaeo (DOM|CSS) subtest replicates
* returns: a single number (geometric_mean of the metric summarization)
* source: [http://hg.mozilla.org/build/talos/file/tip/talos/filter.py#l92 filter.py]
* used in dromaeo_dom and dromaeo_css to build a single value for the subtests
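One way to read this: the flat replicate list is split into per-metric chunks of 5, each chunk is summarized (an arithmetic mean per metric is assumed here), and the per-metric summaries are combined with a geometric mean. A hedged sketch:

```python
import math

def geometric_mean(values):
    # exp of the mean of logs == nth root of the product
    return math.exp(sum(math.log(v) for v in values) / len(values))

def dromaeo(replicates, chunksize=5):
    # Group the flat replicate list into per-metric chunks of `chunksize`,
    # summarize each chunk, then combine the per-metric summaries.
    # The per-chunk mean is an assumption based on the description above.
    chunks = [replicates[i:i + chunksize]
              for i in range(0, len(replicates), chunksize)]
    per_metric = [sum(c) / float(len(c)) for c in chunks]
    return geometric_mean(per_metric)
```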
=== v8_subtest ===
This filter takes in an array of replicates and returns the benchmark weighted score for the subtest (a single value).
* input: an array of v8_7 subtest replicates
* returns: a single value representing the benchmark weighted score for the subtest
* source: [http://hg.mozilla.org/build/talos/file/tip/talos/filter.py#l168 filter.py]
* used in v8_7 for the subtests
NOTE: this deviates from the exact definition of v8, as we retain Encrypt and Decrypt as subtests (instead of combining them into Crypto) and keep Earley and Boyer as subtests (instead of combining them into EarleyBoyer). This causes a slight tweak in the final suite score, but it is <1% different.
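The weighting can be sketched as a score relative to a per-subtest reference constant, so a run matching the reference scores 100. Both the formula and the reference constant below are assumptions for illustration; the real per-subtest constants and combination live in filter.py:

```python
import math

def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Placeholder reference constant; the real per-subtest values in
# talos/filter.py differ per benchmark.
REFERENCE = {'Richards': 100.0}

def v8_subtest(replicates, name):
    # Higher is better: a run whose geometric mean equals the
    # reference constant scores exactly 100.
    # This formula is an assumed sketch of "benchmark weighted score".
    return 100.0 * REFERENCE[name] / geometric_mean(replicates)
```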
== Suite Summarization Filters ==
Once we have a single number from each of the subtests, we need to generate a single number for the suite. There are 4 specific calculations used.
=== geometric_mean ===
This is a standard geometric mean of the data:
* inputs: array of subtest summarized data points (one point per subtest)
* returns: a single value representing the geometric mean of all the subtests
* source: [http://hg.mozilla.org/build/talos/file/tip/talos/filter.py#l114 filter.py]
* used for most tests
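Computing it in the log domain is the numerically safer form; a sketch (not necessarily the exact talos code):

```python
import math

def geometric_mean(points):
    # exp(mean(log(x))) equals the nth root of the product, but it
    # avoids overflowing the intermediate product for long series
    return math.exp(sum(math.log(p) for p in points) / len(points))
```

The geometric mean is used because it treats each subtest's relative change equally: doubling one subtest's value scales the suite summary by the same factor regardless of that subtest's magnitude.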
=== v8_metric ===
This is a custom metric which takes the geometric_mean of the subtests and multiplies it by 100.
* inputs: array of v8 subtest summaries
* returns: a single v8 score
* source: [http://hg.mozilla.org/build/talos/file/tip/talos/output.py#l102 output.py]
* used for v8 version 7 only
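Per the description above, the suite score is just the scaled geometric mean of the subtest scores (a sketch; the real code is in output.py):

```python
import math

def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

def v8_metric(subtest_scores):
    # scale so a suite whose subtest scores are all 1.0 yields 100
    return 100.0 * geometric_mean(subtest_scores)
```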
=== Canvasmark_metric ===
This is the metric used to calculate the Canvasmark score from the subtest summarized results. Essentially it is a sum of the subtests.
* inputs: array of Canvasmark subtest results
* returns: a single Canvasmark score
* source: [http://hg.mozilla.org/build/talos/file/tip/talos/output.py#l115 output.py]
* used for Canvasmark only
=== js_metric ===
This is the metric used to calculate the Kraken score from the subtest summarized results. Essentially it is a sum of the subtests.
* inputs: array of Kraken subtest results
* returns: a single Kraken score
* source: [http://hg.mozilla.org/build/talos/file/tip/talos/output.py#l108 output.py]
* used for Kraken only
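Both Canvasmark_metric and js_metric boil down to a plain sum of the per-subtest summaries (a sketch; the real code is in output.py, and the function name here is illustrative):

```python
def sum_metric(subtest_results):
    # Suite score = sum of the per-subtest summarized values.
    return sum(subtest_results)
```

Note the interpretation differs between the two suites: Kraken subtest results are times, so a lower sum is better, while Canvasmark subtest results are scores, so a higher sum is better.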
== Perfherder ==