Sheriffing/How To/Bisecting: Difference between revisions

'''Bisecting'''
From the [https://en.wikipedia.org/wiki/Bisection_(software_engineering) Wikipedia article on bisection]:
<blockquote>''"Bisection is a method used in software development to identify change sets that result in a specific behavior change. It is mostly employed for finding the patch that introduced a bug."''</blockquote>


Sometimes sheriffs need to perform a bisection to find out which changeset caused a failure when this can't be determined via code inspection. This might happen for intermittent bugs or because tests were skipped due to [https://elvis314.wordpress.com/2015/02/06/seta-search-for-extraneous-test-automation/ SETA].


Here's a little scenario to demonstrate the process.


= Example Bisection Scenario =
After a merge from integration to m-c, the xpcshell tests on linux-asan are busted and it's not clear which changeset caused the breakage: inbound and autoland were fine.


== Examine the merge details ==
Changes of the merge (fictitious):


2c497462f25e Merge inbound to m-c a=merge
02851079c451 Bug 1359458 - Increase assertion count range for test_bug437844.xul. r=jmaher
B37b46c7f38f Bug 1358241 - [1.1] Add mutex locking around the library handles cache. r=jchen
751455b663d0 Bug 1358241 - [1.2] Make direct library reference counter atomic to avoid mutex locking issues. r=jchen
841fa5fb06a8 Bug 1355676 - Check for nulls when decoding icons. r=sebastian
Ad9d525e6db7 Bug 1356243 - Enable Screenshots by default. r=Mossop
2e44294b9f5c Bug 1359273 - Split up DevTools' sort-arrows.svg to improve performance. r=jryans


== Preparation for bisecting ==
* Clone and/or update your mozilla-central repo. You should already have this as part of your [http://mozilla-version-control-tools.readthedocs.io/en/latest/hgmozilla/unifiedrepo.html unified repo].
* Add the try server settings (see https://wiki.mozilla.org/ReleaseEngineering/TryServer#Configuration )
* Find the try syntax for the specific test(s) you need to run. The [https://mozilla-releng.net/trychooser/ Trychooser] can help. In this scenario it is: <code>try: -b o -p linux64-asan -u xpcshell -t none --rebuild 10</code>
** <code>--rebuild 10</code> means that the xpcshell tests run 10 times instead of only once. This is '''really''' important if the bug is intermittent.
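To illustrate why running the tests several times matters for intermittent failures, here is a minimal Python sketch. It assumes that each run fails independently with the same probability, which is a simplification, and the 20% failure rate is a made-up number, not a measurement from any real test. The function name <code>chance_of_seeing_failure</code> is hypothetical.

```python
def chance_of_seeing_failure(failure_rate: float, runs: int) -> float:
    """Probability that at least one of `runs` runs fails.

    Assumes every run fails independently with probability `failure_rate`.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if runs < 0:
        raise ValueError("runs must not be negative")
    # The failure is missed only if every single run passes.
    return 1.0 - (1.0 - failure_rate) ** runs


# Hypothetical intermittent that fails in 20% of runs.
single_run = chance_of_seeing_failure(0.2, 1)
ten_runs = chance_of_seeing_failure(0.2, 10)
print(f"1 run: {single_run:.3f}, 10 runs: {ten_runs:.3f}")
```

Under these assumptions a single run would miss the failure most of the time, while ten runs would catch it in roughly nine out of ten pushes. A green result on a single run therefore says little about an intermittent bug.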


== Verify the failure ==
Confirm that you set everything up correctly and that you can reproduce the problem on try, i.e. do a try push with mozilla-central tip as topmost revision:
* Make a dummy change, e.g. touch the [https://hg.mozilla.org/mozilla-central/file/c5ea72577f79/CLOBBER CLOBBER] file
** Use a commit message like the following: <code>hg commit -m "central tip try: -b o -p linux64-asan -u xpcshell -t none --rebuild 10"</code>
* Push to try
** Check that you get the same failure.
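The commit messages used for try pushes in this scenario all share the same shape: a free-text label followed by the try syntax. The following Python sketch assembles such a message so that the label is the only part that changes between pushes. The helper <code>try_message</code> and its parameter names are hypothetical and only reproduce the string shown in this scenario; they are not part of any Mozilla tool.

```python
def try_message(label: str, platform: str, suite: str, rebuild: int,
                build: str = "o", talos: str = "none") -> str:
    """Build a commit message that carries the try syntax of this scenario.

    The parameter names are illustrative; the flags themselves are copied
    from the example try syntax on this page.
    """
    if rebuild < 1:
        raise ValueError("rebuild must be at least 1")
    return (f"{label} try: -b {build} -p {platform} -u {suite} "
            f"-t {talos} --rebuild {rebuild}")


tip_message = try_message("central tip", "linux64-asan", "xpcshell", 10)
rev_message = try_message("central rev 841fa5fb06a8", "linux64-asan", "xpcshell", 10)
print(tip_message)
print(rev_message)
```

Putting the tested revision into the label makes it easier to tell the pushes apart later when several of them are listed on Try.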


== Bisection begins in earnest ==
Bisection means cutting things in half, so split up the merge and check whether the test failure already occurs in the middle of the merge. In this case, let's say you decide to use 841fa5fb06a8 as your new revision:
* <code>hg up -r 841fa5fb06a8</code> to update to the older topmost rev
** You can confirm this by running <code>hg summary</code>, which should show something like:
 parent: 354839:841fa5fb06a8
 Bug 1355676 - Check for nulls when decoding icons. r=sebastian
* Make a dummy change again, then <code>hg commit -m "central rev 841fa5fb06a8 try: -b o -p linux64-asan -u xpcshell -t none --rebuild 10"</code>
* Push to try
* Check the results
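If you want to double-check the working directory parent in a script rather than by eye, the <code>hg summary</code> output can be parsed. The sketch below is only an illustration: it works on the sample output quoted on this page, pasted in as a string, and does not call Mercurial itself. The function name <code>parent_revision</code> is hypothetical.

```python
import re


def parent_revision(summary: str) -> str:
    """Extract the changeset hash from the 'parent:' line of `hg summary`."""
    # The line looks like "parent: <local number>:<short hash>".
    match = re.search(r"^parent:\s*\d+:([0-9A-Fa-f]+)", summary, re.MULTILINE)
    if match is None:
        raise ValueError("no parent line found in summary output")
    return match.group(1)


# Sample output as quoted in this scenario.
SAMPLE_SUMMARY = """parent: 354839:841fa5fb06a8
 Bug 1355676 - Check for nulls when decoding icons. r=sebastian
"""

current = parent_revision(SAMPLE_SUMMARY)
print(current)
```

Comparing the returned hash with the revision you meant to test is a cheap guard against pushing the wrong changeset to Try.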


Let's assume results for 841fa5fb06a8 are green. Now repeat the same steps for the later changes (02851079c451 and B37b46c7f38f / 751455b663d0) to check which of these two bugs caused the issue.
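The halving strategy described here is a binary search over the changesets of the merge. The Python sketch below shows the idea on the fictitious changeset list from this scenario. The try push is replaced by a stand-in function, and the culprit is chosen arbitrarily for the demonstration; in real life "is this revision bad?" is answered by pushing to try and reading the results. The names <code>first_bad</code> and <code>fake_try_result</code> are hypothetical. The sketch always picks the exact midpoint, so its first probe differs from the revision chosen by hand above.

```python
from typing import Callable, List, Sequence


def first_bad(revs: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the oldest revision for which is_bad() is true.

    revs must be ordered oldest to newest, and the newest revision must be
    known to be bad (that is what verifying the failure on tip establishes).
    """
    lo, hi = 0, len(revs) - 1  # invariant: revs[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(revs[mid]):
            hi = mid       # failure already present: look at the older half
        else:
            lo = mid + 1   # still green: the culprit is newer
    return revs[lo]


# Changesets of the fictitious merge, oldest first.
MERGE = [
    "2e44294b9f5c", "Ad9d525e6db7", "841fa5fb06a8", "751455b663d0",
    "B37b46c7f38f", "02851079c451", "2c497462f25e",
]

# Pretend outcome for the demo: everything from this changeset on is broken.
CULPRIT = "B37b46c7f38f"
tested: List[str] = []


def fake_try_result(rev: str) -> bool:
    """Stand-in for 'push rev to try and check whether the tests fail'."""
    tested.append(rev)
    return MERGE.index(rev) >= MERGE.index(CULPRIT)


found = first_bad(MERGE, fake_try_result)
print("culprit:", found)
print("revisions tested:", tested)
```

With seven changesets, three try pushes are enough to isolate the culprit. This only works when the failure, once introduced, stays present in all later changesets.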


== Backing out and follow-up ==
Once you've found the bad changeset, follow the instructions to [[Sheriffing/How:To:Backouts|back it out]]. In most cases, you will be bisecting after the problem code has been merged around to different branches, so you will need to back it out from more than one branch. For this reason, you shouldn't offer developers the chance for a follow-up fix.


== Caveats ==
* When bisecting, you can have up to 6 pushes running on Try at the same time to get results as soon as possible.
** Be considerate though. Running any Try jobs with the <code>--rebuild</code> parameter set will tie up more resources than normal and will impact your fellow developers if the trees are not closed.
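To get a feeling for the trade-off between parallel pushes and waiting time, the following Python sketch counts how many rounds of try pushes are needed to narrow a range of changesets down to one. It assumes that the pushes of a round are spread evenly over the remaining range, so that k pushes split it into k + 1 segments, and that the failure persists once introduced. The function name <code>rounds_needed</code> is hypothetical.

```python
def rounds_needed(candidates: int, pushes_per_round: int = 1) -> int:
    """Rounds of try pushes needed to isolate one changeset.

    candidates: number of changesets that could be the first bad one.
    pushes_per_round: try pushes started at the same time in each round.
    """
    if candidates < 1 or pushes_per_round < 1:
        raise ValueError("both arguments must be at least 1")
    rounds = 0
    while candidates > 1:
        # k pushes split the range into k + 1 segments; the worst case
        # is the largest segment (ceiling division).
        candidates = -(-candidates // (pushes_per_round + 1))
        rounds += 1
    return rounds


for count in (7, 100):
    print(count, "changesets:",
          rounds_needed(count, 1), "rounds with 1 push,",
          rounds_needed(count, 6), "rounds with 6 pushes")
```

For the seven changesets of the example merge, six simultaneous pushes finish in one round instead of three, at the cost of twice as many pushes in total. This is why parallel pushes are attractive when time matters and why they should be used with care when the trees are open.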
 