(various clean ups and updates) |
(Redirected page to TestEngineering/Performance/Sheriffing) |
| (11 intermediate revisions by 2 users not shown) | |||
| Line 1: | Line 1: | ||
#REDIRECT [[TestEngineering/Performance/Sheriffing]]
= Overview =
The code sheriff team does a great job of finding regressions in unit tests and either getting fixes for them or backing out the offending changes. This keeps our trees green and usable while thousands of checkins land every month!
| Line 37: | Line 39: | ||
* Add all related alerts you see to the summary with the reassign button
== Determining the root cause from Perfherder ==
When viewing a single alert and clicking on the graph link, Perfherder automatically shows multiple branches for the given test/platform. This helps you determine the root branch. It is best to [https://wiki.mozilla.org/Buildbot/Talos/Sheriffing/Perfherder_FAQ#Zooming zoom] in and out to verify where the regression is.
While the root branch isn't always clear, most of the time it is easy to spot another alert on a different branch and mark the current one as downstream if needed.
In rare cases we do not generate an alert on the original branch. When that happens, manually create an alert there, then mark the alert you were first looking at as downstream of the new alert.
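The root/downstream decision can be sketched in a few lines of Python. This is only an illustration, not Perfherder code: the <code>Alert</code> class, its fields, and <code>mark_downstream</code> are hypothetical names, and the rule "the alert with the earliest push is the root" is an assumption that should still be confirmed on the graph. All alerts passed in are assumed to be for the same test/platform.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Alert:
    """Hypothetical stand-in for one alert on one branch."""
    branch: str
    test: str
    platform: str
    push_time: datetime                     # when the regressing push landed
    downstream_of: Optional[Alert] = None   # set when this alert is not the root


def mark_downstream(alerts: list[Alert]) -> Alert:
    """Pick the earliest alert as the root and link every other alert to it."""
    if not alerts:
        raise ValueError("need at least one alert")
    root = min(alerts, key=lambda a: a.push_time)
    for alert in alerts:
        if alert is not root:
            alert.downstream_of = root
    return root
```

If no alert exists on the branch where the change first landed, the manually created alert would simply be added to the list before calling <code>mark_downstream</code>.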
== Determining if we have all the data from Treeherder ==
| Line 58: | Line 60: | ||
* an alert where the range of the regression overlaps with the regular range (a small alert (<5%) or a noisy test)
We need to do some retriggers. I usually find it useful to retrigger 3 times on 5 revisions:
* target revision-2
* target revision-1
| Line 67: | Line 69: | ||
In the case where there is missing data, target revision becomes a range of: [target revision, revisions with missing data]
This is important because we then have enough evidence to show that the regression is sustained through retriggers and over time. If you suspect there are related alerts on other tests/platforms, please retrigger those as well.
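As a rough sketch of the retrigger plan described above, the helper below picks the revisions to retrigger and how often. The function name and parameters are hypothetical. The window of two revisions on each side of the target is an assumption inferred from "5 revisions" and the list starting at target revision-2, and revisions with missing data are folded into the target range as described above.

```python
def retrigger_plan(pushes, target, missing=(), window=2, retriggers=3):
    """Return {revision: retrigger count} around the target range.

    pushes:  revisions in landing order, oldest first.
    target:  the revision the alert points at.
    missing: revisions without data that widen the target into a range.
    window:  revisions to include on each side (assumed to be 2).
    """
    target_range = [target, *missing]
    indexes = [pushes.index(rev) for rev in target_range]
    start = max(0, min(indexes) - window)
    stop = min(len(pushes), max(indexes) + window + 1)
    return {rev: retriggers for rev in pushes[start:stop]}
```

For example, with pushes <code>r0</code> to <code>r7</code> and an alert on <code>r4</code>, the plan covers <code>r2</code> through <code>r6</code>. If <code>r3</code> has no data, the range widens to <code>r1</code> through <code>r6</code>.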
== Determining the scope of the regression from Perfherder ==
Once you have the spot, you can validate the other platforms by [https://wiki.mozilla.org/Buildbot/Talos/Sheriffing/Perfherder_FAQ#Adding_additional_data_points adding additional data sets] to the graph. It is best here to zoom out a bit as the regression might be a few revisions off on different platforms due to [https://wiki.mozilla.org/Buildbot/Talos/Sheriffing/Tree_FAQ#What_is_coalescing coalescing].
== Cases to watch out for ==
| Line 77: | Line 79: | ||
* [https://wiki.mozilla.org/Buildbot/Talos/Sheriffing/Tree_FAQ#What_is_PGO pgo/nonpgo] (some errors are pgo only and might be a side effect of pgo). We only ship PGO, so these are the most important.
* test/infrastructure change - once in a while we change big things about our tests or infrastructure and it affects our tests (we need bugs to document these)
* [https://wiki.mozilla.org/Buildbot/Talos/Sheriffing/Tree_FAQ#What_is_a_merge Merged] - sometimes the root cause looks to be a merge; this is normally a side effect of [https://wiki.mozilla.org/Buildbot/Talos/Sheriffing/Tree_FAQ#What_is_coalescing Coalescing].
* [https://wiki.mozilla.org/Buildbot/Talos/Sheriffing/Tree_FAQ#What_is_coalescing Coalesced] - this is when we don't run every job on every platform on every push, so a single job sometimes covers a set of changes
* Regular regression - the normal case where we get an alert and we see it merge from branch to branch
= Tracking bugs =
Every release of Firefox we create a tracking bug (e.g. {{bug|1122690}} - Firefox 38) which we use to associate all regressions found during that release. The reason for this is twofold:
* We can go to one spot and see what regressions we have for reference on new bugs or to follow up.
* When we [https://wiki.mozilla.org/Buildbot/Talos/Sheriffing/Tree_FAQ#What_is_an_uplift uplift] it is important to see which alerts we are expecting
| Line 94: | Line 96: | ||
* Product/Component - this should be the same as the bug which is the root cause; if there is more than one root cause bug, file in [https://bugzilla.mozilla.org/enter_bug.cgi?product=Testing&component=Talos Talos]
* Dependent/Block bugs - For a new bug, add the [https://wiki.mozilla.org/Buildbot/Talos/Sheriffing#Tracking_bugs tracking bug] (for the current version) and root cause bug(s) as blocking this bug
* CC list - cc :jmaher, :avih, :wlach, :rwood, patch author(s) and reviewer(s), and owner of the tests as documented on the [https://wiki.mozilla.org/Buildbot/Talos/Tests talos tests wiki]
* Summary - check the bug summary to make sure the revision is accurate
* The description is auto suggested as well, please verify the revision here
| Line 109: | Line 111: | ||
== Merge Day - Uplifts ==
Every 6 weeks we do an [https://wiki.mozilla.org/Buildbot/Talos/Sheriffing/Tree_FAQ#What_is_an_uplift uplift]. These typically result in [https://elvis314.wordpress.com/2014/12/12/tracking-firefox-performance-as-we-uplift-the-volume-of-alerts-we-get/ dozens of alerts] for each uplift.
The job here is to triage alerts as we usually do, except in this case we have a much larger volume of alerts. One thing to keep in mind is that we already have alerts from the upstream branch. Take for example when we uplift Mozilla-Central to Mozilla-Aurora. We have a tracking bug for each release, and there is a list of bugs (keep in mind some are resolved as wontfix). In a perfect world (half the time) we can match up the alerts that are showing up on Mozilla-Aurora with the bugs that have already been filed. The job here is to verify and add bugs to keep track of what is there.
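The matching step can be pictured as a simple lookup. The sketch below is hypothetical: it assumes you have built, by hand, a mapping from (test, platform) to the bug already filed under the release tracking bug, and that each uplift alert can be reduced to the same (test, platform) pair. Real alerts may need a closer look than an exact key match.

```python
def triage_uplift(alerts, tracked_bugs):
    """Split uplift alerts into already-tracked and untracked ones.

    alerts:       iterable of (test, platform) tuples seen after the uplift.
    tracked_bugs: dict mapping (test, platform) -> bug number, taken from
                  the bugs associated with the release tracking bug.
    """
    matched, unmatched = {}, []
    for key in alerts:
        if key in tracked_bugs:
            matched[key] = tracked_bugs[key]   # verify against the existing bug
        else:
            unmatched.append(key)              # investigate, file a bug if needed
    return matched, unmatched
```

Alerts in <code>matched</code> only need verifying against the existing bug, while those in <code>unmatched</code> are the ones to investigate and track.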
| Line 123: | Line 125: | ||
* [https://wiki.mozilla.org/Buildbot/Talos/Sheriffing/Alert_FAQ Alert FAQ]
* [https://wiki.mozilla.org/Buildbot/Talos/Sheriffing/Noise_FAQ Noise FAQ]
* [https://wiki.mozilla.org/Buildbot/Talos/Sheriffing/Perfherder_FAQ Perfherder FAQ]
* [https://wiki.mozilla.org/Buildbot/Talos/Sheriffing/Tree_FAQ Tree FAQ]
* [https://wiki.mozilla.org/Performance_sheriffing go to new page]