TestEngineering/Performance/Sheriffing: Difference between revisions

TestEngineering/Performance/Sheriffing (view source)

Revision as of 13:13, 6 August 2019

1,337 bytes removed, 6 August 2019

no edit summary

Davehunt

Confirmed users

2,197

edits

@@ Line 5: / Line 5: @@
 = What is an alert =
-As of January 2016, alerts are generated in [https://treeherder.mozilla.org/perf.html#/alerts?status=0&framework=1 Perfherder].  These are generated by programmatically verifying there is a sustained regression over time ([https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Noise_FAQ#Why_do_we_need_12_future_data_points original data point + 12 future data points]).
+As of January 2016, alerts are generated in [https://treeherder.mozilla.org/perf.html#/alerts?status=0&framework=1 Perfherder].  These are generated by programmatically verifying there is a sustained regression over time ([[/Noise_FAQ#Why_do_we_need_12_future_data_points|original data point + 12 future data points]]).
 There is an alert summary outlining the alerts which match the same set of revisions.  For the summary there are a few pieces of information:
 * Title (which is a good bug title if filing one for a regression):
-** [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Tree_FAQ#Branch_names_and_confusion branch]
+** [[/Tree_FAQ#Branch_names_and_confusion|branch]]
 ** % regressed, this is a range of the regressions (not improvements)
-** the [https://wiki.mozilla.org/Performance_sheriffing/Talos/Tests tests] which have regressed
+** the [[TestEngineering/Performance/Talos/Tests|tests]] which have regressed
 ** the platforms we see this regression on
 * date of the suspect revision push
@@ Line 19: / Line 19: @@
 Below the summary will be a list of alerts; each alert will reference:
-* [https://wiki.mozilla.org/Performance_sheriffing/Talos/Tests Test name]
+* [[TestEngineering/Performance/Talos/Tests|Test name]]
 * platform (including build type, such as opt, pgo)
 * old score (median score of the previous 12 commits)
 * new score (median score of the future 12 commits)
-* [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Alert_FAQ#Why_does_Alert_Manager_print_-xx.25 % change / values]
+* [[/Alert_FAQ#Why_does_Alert_Manager_print_-xx.25|% change / values]]
 * bar chart to show severity, green = improvement, red = regression
 * Confidence value (from the t-test code)
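As a rough illustration of how the per-alert fields above relate to each other, the following Python sketch derives an old score, new score, % change and a t-test style confidence value from two windows of 12 data points. It is not Perfherder's actual code: the function name <code>summarize_alert</code>, the <code>lower_is_better</code> flag and the use of a Welch-style t statistic are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, median, variance

WINDOW = 12  # data points considered on each side of the suspect push


def summarize_alert(before, after, lower_is_better=True):
    """Summarize a suspected change between two runs of data points.

    `before` and `after` are lists of scores ordered by push time.
    Hypothetical helper: names and return shape are illustrative only.
    """
    prev = before[-WINDOW:]   # the previous 12 data points
    future = after[:WINDOW]   # the future 12 data points
    if len(prev) < 2 or len(future) < 2:
        raise ValueError("need at least two data points on each side")

    old_score = median(prev)
    new_score = median(future)
    if old_score == 0:
        raise ValueError("old score is zero, % change is undefined")
    pct_change = (new_score - old_score) / old_score * 100.0

    # Welch-style t statistic (assumed; the real t-test code may differ).
    std_err = sqrt(variance(prev) / len(prev) + variance(future) / len(future))
    if std_err > 0:
        t_value = abs(mean(future) - mean(prev)) / std_err
    else:
        t_value = float("inf") if new_score != old_score else 0.0

    # Whether a higher score is worse depends on the test.
    if lower_is_better:
        is_regression = new_score > old_score
    else:
        is_regression = new_score < old_score

    return {
        "old_score": old_score,
        "new_score": new_score,
        "pct_change": pct_change,
        "t_value": t_value,
        "is_regression": is_regression,
    }
```

Using medians for the two scores keeps a single noisy data point from moving the reported value much, while the t statistic grows when the shift is large relative to the noise in both windows.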
@@ Line 33: / Line 33: @@
 * Look at the graph and determine the original branch, date, revision where the alert occurred
 * Look at Treeherder and determine if we have all the data.
-* Retrigger jobs if needed (more [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Noise_FAQ#What_is_Noise noise], more retriggers)
+* Retrigger jobs if needed (more [[/Noise_FAQ#What_is_Noise|noise]], more retriggers)
 * Once you have more data, look at the data in [https://treeherder.mozilla.org/perf.html#/comparechooser compare view] to see if other tests/platforms have changed
 * Add all related alerts you see to the summary with the reassign button
 == Determining the root cause from Perfherder ==
-When viewing a single alert and clicking on the graph link, Perfherder automatically shows multiple branches for the given test/platform.  This helps you determine the root branch.  It is best to [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Perfherder_FAQ#Zooming zoom] in and out to verify where the regression is.
+When viewing a single alert and clicking on the graph link, Perfherder automatically shows multiple branches for the given test/platform.  This helps you determine the root branch.  It is best to [[/Perfherder_FAQ#Zooming|zoom]] in and out to verify where the regression is.
 While this isn't always clear, most of the time it is easy to see another alert on a different branch and mark the current one as a downstream if needed.
@@ Line 72: / Line 72: @@
 == Determining the scope of the regression from Perfherder ==
-Once you have the spot, you can validate the other platforms by [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Perfherder_FAQ#Adding_additional_data_points adding additional data sets] to the graph.  It is best here to zoom out a bit as the regression might be a few revisions off on different platforms due to [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Tree_FAQ#What_is_coalescing coalescing].
+Once you have the spot, you can validate the other platforms by [[/Perfherder_FAQ#Adding_additional_data_points|adding additional data sets]] to the graph.  It is best here to zoom out a bit as the regression might be a few revisions off on different platforms due to [[/Tree_FAQ#What_is_coalescing|coalescing]].
 == Cases to watch out for ==
 There are many reasons for an alert and different scenarios to be aware of:
-* [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Tree_FAQ#What_is_a_backout backout] (usually within 1 week causing a similar regression/improvement)
+* [[/Tree_FAQ#What_is_a_backout|backout]] (usually within 1 week causing a similar regression/improvement)
-* [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Tree_FAQ#What_is_PGO pgo/nonpgo] (some errors are pgo only and might be a side effect of pgo).  We only ship PGO, so these are the most important.
+* [[/Tree_FAQ#What_is_PGO|pgo/nonpgo]] (some errors are pgo only and might be a side effect of pgo).  We only ship PGO, so these are the most important.
 * test/infrastructure change - once in a while we change big things about our tests or infrastructure and it affects our tests (we need bugs to document these)
-* [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Tree_FAQ#What_is_a_merge Merged] - sometimes the root cause looks to be a merge; this is normally a side effect of [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Tree_FAQ#What_is_coalescing Coalescing].
+* [[/Tree_FAQ#What_is_a_merge|Merged]] - sometimes the root cause looks to be a merge; this is normally a side effect of [[/Tree_FAQ#What_is_coalescing|Coalescing]].
-* [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Tree_FAQ#What_is_coalescing Coalesced] - this is when we don't run every job on every platform on every push and sometimes we have a set of changes
+* [[/Tree_FAQ#What_is_coalescing|Coalesced]] - this is when we don't run every job on every platform on every push and sometimes we have a set of changes
 * Regular regression - the normal case where we get an alert and we see it merge from branch to branch
@@ Line 86: / Line 86: @@
 Every release of Firefox we create a tracking bug (e.g. {{bug|1386631}} - Firefox 57) which we use to associate all regressions found during that release.  The reason for this is twofold:
 * We can go to one spot and see what regressions we have for reference on new bugs or to follow up.
-* When we [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Tree_FAQ#What_is_an_uplift uplift] it is important to see which alerts we are expecting
+* When we [[/Tree_FAQ#What_is_an_uplift|uplift]] it is important to see which alerts we are expecting
 These bugs just contain a set of links to other bugs, no conversation is needed.
@@ Line 97: / Line 97: @@
 Here are some things to check/verify when filing a bug:
 * Product/Component - this should be the same as the bug which is the root cause, if >1 bug, file in [https://bugzilla.mozilla.org/enter_bug.cgi?product=Testing&component=Talos Talos]
-* Dependent/Block bugs - For a new bug, add the [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing#Tracking_bugs tracking bug] (for the current version) and root cause bug(s) as blocking this bug
+* Dependent/Block bugs - For a new bug, add the [[#Tracking_bugs|tracking bug]] (for the current version) and root cause bug(s) as blocking this bug
-* CC list - cc patch author(s), reviewer(s) and owner of the tests as documented on the [https://wiki.mozilla.org/Performance_sheriffing/Talos/Tests Talos tests wiki]; if we have >1 bug, we should cc everyone who worked on those bugs so we can all pitch in and answer questions faster
+* CC list - cc patch author(s), reviewer(s) and owner of the tests as documented on the [[TestEngineering/Performance/Talos/Tests|Talos tests wiki]]; if we have >1 bug, we should cc everyone who worked on those bugs so we can all pitch in and answer questions faster
 * Summary of bug should have a check to make sure the revision is accurate
 * The description is auto suggested as well, please verify the revision here
-As a note, the generated description refers the patch author to [https://wiki.mozilla.org/Performance_sheriffing/Talos/RegressionBugsHandling guidelines and expectations] for them about how and when to respond.
+As a note, the generated description refers the patch author to [[TestEngineering/Performance/Talos/RegressionBugsHandling|guidelines and expectations]] for them about how and when to respond.
 Once a bug is filed it is a good idea to do a few things in another comment:
 * provide a link to compare view to show you have done retriggers and believe this is valid
-* needinfo the patch author (if many patch authors, needinfo one of :jmaher, :igoldan or :rwood)
+* needinfo the patch author (if many patch authors, needinfo one of :davehunt, :igoldan or :rwood)
 * mention how confident you are in the regression (more confidence if you have a lot of retriggers and there is only one patch, less confident if you are waiting on backfilling data, retriggers, try runs, etc.)
@@ Line 113: / Line 113: @@
 == Merge Day - Uplifts ==
-Every 6 weeks we do an [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Tree_FAQ#What_is_an_uplift uplift].  These typically result in [https://elvis314.wordpress.com/2014/12/12/tracking-firefox-performance-as-we-uplift-the-volume-of-alerts-we-get/ dozens of alerts] for each uplift.
+Every 6 weeks we do an [[/Tree_FAQ#What_is_an_uplift|uplift]].  These typically result in [https://elvis314.wordpress.com/2014/12/12/tracking-firefox-performance-as-we-uplift-the-volume-of-alerts-we-get/ dozens of alerts] for each uplift.
 The job here is to triage alerts as we usually do, except in this case we have a much larger volume of alerts.  One thing here is we have alerts from the upstream branch.  Take for example when we uplift Mozilla-Central to Mozilla-Beta.  We have a tracking bug for each release, and there is a list of bugs (keep in mind some are resolved as wontfix).  In a perfect world (half the time) we can match up the alerts that are showing up on Mozilla-Beta with the bugs that have already been filed.  The job here is to verify and add bugs to keep track of what is there.
@@ Line 125: / Line 125: @@
 = Additional Resources =
-* [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Alert_FAQ Alert FAQ]
+* [[/Alert_FAQ|Alert FAQ]]
-* [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Noise_FAQ Noise FAQ]
+* [[/Noise_FAQ|Noise FAQ]]
-* [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Perfherder_FAQ Perfherder FAQ]
+* [[/Perfherder_FAQ|Perfherder FAQ]]
-* [https://wiki.mozilla.org/Performance_sheriffing/Talos/Sheriffing/Tree_FAQ Tree FAQ]
+* [[/Tree_FAQ|Tree FAQ]]
-* [https://wiki.mozilla.org/Buildbot/Talos/Sheriffing duplicated & updated from old page]
