Auto-tools/Projects/Signal From Noise/StatusNovember2011: Difference between revisions

→‎State of Statistics, November 2011: this isn't actually a sentence
   To determine whether a data point is "good" or "bad", we take 20-30 points of historical data and 5 points of future data, and compare the two sets using a t-test.  See https://wiki.mozilla.org/images/c/c0/Larres-thesis.pdf#page=74 . Regressions are calculated by the analyze_talos.py script, which uses a configuration file based on http://hg.mozilla.org/graphs/file/tip/server/analysis/analysis.cfg.template , and are mailed to the dev-tree-management mailing list.


''(from https://wiki.mozilla.org/Buildbot/Talos#Regressions)''
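The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code in analyze_talos.py: the function names, the choice of Welch's variant of the t-test, the decision to start the "future" window at the point itself, and the threshold of 7.0 are all assumptions made for the example. Only the window sizes (up to 30 historical points, 5 future points) come from the description.

```python
import math
from statistics import mean, variance


def welch_t(before, after):
    """Welch's t statistic for two samples (assumed variant of the t-test)."""
    n1, n2 = len(before), len(after)
    if n1 < 2 or n2 < 2:
        raise ValueError("each window needs at least two points")
    # Squared standard error of the difference between the two means.
    se2 = variance(before) / n1 + variance(after) / n2
    diff = mean(after) - mean(before)
    if se2 == 0:
        # Both windows are constant: any difference is infinitely significant.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(se2)


def classify_point(series, index, back=30, fore=5, threshold=7.0):
    """Label series[index] as "good" or "bad".

    back/fore follow the window sizes in the text; threshold is a
    hypothetical value chosen only for this sketch.
    """
    before = series[max(0, index - back):index]   # historical window
    after = series[index:index + fore]            # future window (assumed to include the point)
    t = welch_t(before, after)
    return ("bad" if abs(t) > threshold else "good"), t
```

A positive t means the values went up after the point; whether that counts as a regression or an improvement depends on whether lower or higher is better for the test in question.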


In practice, regression and improvement detection is noisy and produces many false positives. https://wiki.mozilla.org/images/c/c0/Larres-thesis.pdf#page=74 describes the general methodology used by this script, along with its statistical shortcomings and potentially faulty assumptions.  One notable violation: the t-test assumes normally distributed data, which we know does not hold here (as documented elsewhere in the thesis).
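One rough way to see whether a window of results departs from the symmetric shape a normal distribution would have is to compute its sample skewness. The sketch below is only an illustration of that idea and is not taken from the thesis or from analyze_talos.py; the function name is hypothetical. A value near zero does not prove normality, but a value far from zero suggests the t-test's assumption is doubtful for that window.

```python
from statistics import mean


def sample_skewness(values):
    """Moment-based sample skewness (0 for perfectly symmetric data)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three points")
    m = mean(values)
    m2 = sum((x - m) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no skew
    return m3 / m2 ** 1.5
```

Timing data often has a long right tail (occasional slow runs), which shows up as a positive value here.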