Sheriffing/How To/Beta simulations
Open the doc with the past beta simulations and known issues. Request the link and edit access in the sheriffs IRC channel if you are new to this.
Add the new day for which you’re making the simulation by copying the entry for the last day and modifying the date.
Go to mozilla-central, take the HG link from the latest central push and add it as a base revision in the document.
SILENCING KNOWN PERMAFAILS
If a permanent failure is known from previous simulations but, according to its bug, not fixed yet, it helps to prevent the failure from showing up: this reduces the number of failures to classify and gives a clearer picture of the beta simulation as a whole. These are the possibilities:
- Apply fix: If the bug about the perma failure has a patch which hasn't landed on mozilla-central yet, apply it locally, similar to how patches are landed on inbound (DON'T PUSH).
- Fix on Phabricator: If there is a fix hosted on Phabricator, import it to mitigate the issue. By default, the arc patch command imports all revisions a patch depends on (has as ancestors). If it's a patch series, import the last patch to also auto-import the previous ones:
arc patch D<number> --nobranch
The patch number is mentioned at the top of the phabricator page for the patch.
If the dependencies should not be imported, e.g. because they are not needed, they can be skipped like this:
arc patch D<number> --nobranch --skip-dependencies
- Backout change causing failure: If the bug about the perma failure mentions the bug whose check-ins cause the failures, those can be backed out locally for the beta simulation:
hg oops -esr/er <revisions> -n <bug_number>
<revisions> are the revisions of the bug whose backout fixes the issue. The backout revisions will have to be deleted with hg histedit once the beta simulations have been created.
- Disable failing test: Similar to how frequently failing tests are disabled. Disadvantage: Additional failures or changes in the failure message in the test get missed.
TRUNK AS EARLY BETA
- Open the console and run the following commands:
hg pull central
hg update central
./mach try release -v 68.0b1 --tasks release-sim --migration central-to-beta
- -v 68.0b1: Sets the version number to use in the beta simulation. Replace "68" with the version number mentioned in the beta simulation document.
- --tasks release-sim: Schedules the tasks which shall run for beta simulations.
- --migration central-to-beta: Activates modification of the configuration to switch from central to beta.
The command switches the configuration to the beta simulation, pushes to the Try server, and reverts the changes. Open the first Treeherder link printed in the console; it might take a few minutes for Treeherder to show the push. Deselect the running and the green jobs, check that the classified jobs are visible, and then add the link to the Gdoc in the current date section under Run Links: Trunk as Early Beta.
TRUNK AS LATE BETA
Go to the console and use the following commands:
./mach try release -v 68.0b12 --tasks release-sim --migration central-to-beta --migration early-to-late-beta
Note the different version number 68.0b12 compared to the early beta's 68.0b1.
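The early and late beta pushes differ only in the beta number of the version and in the extra migration. As a minimal sketch of that difference, the following shell function prints (but does not run) the matching command. The name sim_command and its two arguments are hypothetical and not part of mach; the command strings are the ones shown above.

```shell
# Hypothetical helper: prints the ./mach try release command for a simulation.
# Usage: sim_command <major_version> <early|late>
sim_command() {
  major="$1"
  stage="$2"
  case "$stage" in
    early)
      # Early beta: version X.0b1, only the central-to-beta migration.
      echo "./mach try release -v ${major}.0b1 --tasks release-sim --migration central-to-beta"
      ;;
    late)
      # Late beta: version X.0b12, plus the early-to-late-beta migration.
      echo "./mach try release -v ${major}.0b12 --tasks release-sim --migration central-to-beta --migration early-to-late-beta"
      ;;
    *)
      echo "usage: sim_command <major_version> <early|late>" >&2
      return 1
      ;;
  esac
}

# Print both commands for version 68 so they can be compared or copied.
sim_command 68 early
sim_command 68 late
```

Printing instead of executing keeps the sketch safe to try: nothing is pushed to the Try server until you paste the command yourself.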
Deselect the running and green jobs, then add the link to the document in the current date section under Run Links: Trunk as Late Beta.
Also, in the New Bugs section, you can write TBD (“To Be Done”) until the jobs on your try pushes have run and you can see whether there are failures.
NOTE: When creating a bug, request tracking only for permafails. Low-frequency intermittents, and frequent intermittents which also affect mozilla-central, don’t need that flag; bugs for them should still be created but not added to the Google Doc about the simulations.
When done, clean up the repository:
hg purge: Removes the backup files generated when the command that creates the beta simulations modified configuration files.
After the beta simulations are done:
- Remove the “TBD” from the bug list for today’s simulations if there is no new failure and replace it with “None”
- If there were known issues before the beta simulations which you didn’t fix with a backout or patch import and which are gone, open those bugs (should be “Resolved Fixed”) and set the status_firefox<version number> to “verified” and change “Resolved Fixed” to “Verified Fixed”. Also add a comment that it has been verified fixed and add the link to today’s simulation.
VERSION INCREASE SIMULATION
Helpful tip: To have a more accessible command when pushing to try, open the .hgrc file and below the [alias] line add trymc = push -f ssh://hg.mozilla.org/try
To check if the next increase of the application version will cause failures, you need to open the gdoc and search for a previous Version Increase link.
In your console, import the patch from the Version Increase link in the doc:
hg import <link>
hg histedit to remove the commit for the version increase.
Bugs for permanent or frequent issues found for simulations
Don't use treeherder's interface to create bugs for issues found during the simulations if they are permafailing or frequent (at least one retrigger has the same failure), but create the new bug directly at Bugzilla because more of its features are needed.
- Pick the product and component in which the bug shall be created. Use the same guidelines as for intermittent failures.
- If the issue is permanent, start the bug summary with "Perma", else with "Intermittent".
- If it affects only early beta or only late beta, write early beta or late beta.
- If only one platform or platform type (e.g. DevEdition) is affected, mention the name of the platform on which it fails.
- Then append the failure line, followed by:
- for beta simulations:
when Gecko XX merges to Beta on YYYY-MM-DD. XX is the current version number for mozilla-central as used in the "Target" dropdown and "status-firefoxXX" field. The date is the day mozilla-central gets completely merged for the first time to mozilla-beta in that release cycle. It's mentioned in the Gecko beta simulation document to which you should have access.
- for version increase simulations:
when Gecko gets increased to version XX+1 on YYYY-MM-DD. XX is the current version number for mozilla-central as used in the "Target" dropdown and "status-firefoxXX" field. The date is the day the version number gets increased (usually directly after the last merge from mozilla-central to mozilla-beta). It's mentioned in the Gecko beta simulation document to which you should have access.
- status fields: Near the bottom of the page to create the bug, click on "Set Bug Flags".
- Set status-firefoxXX with the highest version number (= mozilla-central) to affected.
- Set all other status flags starting with status-firefox to unaffected. That way it's obvious on what branches the issue occurs.
- tracking field: Next to the status-firefoxXX field which you set to affected there is a tracking-firefoxXX one (same version number). Set it to ?. Remove the text which gets added to the textarea and asks you to explain why you request tracking.
- Treeherder link: Write central-as-beta simulation or version increase simulation and put the link to the simulation after it. That link must also show classified failures.
- Failure log: Write Failure log: and paste a link to the log of such a failure.
- Copy and paste relevant lines from the log.
- CC Aryx.
- Identify what caused the failures. If you identified the bug, put it into the Regressed By field.
- Beta simulations
- Get the relevant file from the failure line. Search for it on hg.mozilla.org using the search field at the top right. Check the changes since the last beta simulation for correlation.
- If that didn't help, get a list of all changes since the last beta simulation. Construct a URL like this:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=revisionHashPreviousSimulation&tochange=revisionHashToday
The revision hashes are the hexadecimal strings at the end of each base revision mentioned in the Gecko Beta simulation document, e.g. 98b223de0543 (as shown in Treeherder). Look through the list for anything suspicious and inspect the change if necessary.
- Last resort: Get all code changes since the last beta simulation:
- In the command line, run:
hg diff -r revisionHashPreviousSimulation -r revisionHashToday > ../changes.diff
- A file changes.diff is created in the folder which contains the mozilla-central folder. Open it (the file is usually 3-10 MB) and search for suspicious words from the failure line or from the log lines shortly before it.
- If you found something interesting, scroll up to find the file (line starts with diff) and continue as described above if you found new changes to the file which causes the failure.
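The pushlog URL and the search through changes.diff described above can also be scripted. The following is a minimal sketch, not an official tool: the function names pushlog_url and diff_hits are hypothetical, and diff_hits assumes that every file section in the diff starts with a line beginning with "diff ", as in hg diff output.

```shell
# Hypothetical helper: build the pushlog URL between two base revisions.
# Usage: pushlog_url <previous_revision> <today_revision>
pushlog_url() {
  printf 'https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=%s&tochange=%s\n' "$1" "$2"
}

# Hypothetical helper: print the "diff ..." header of every file section
# that contains the search term (plain substring match, not a regex).
# Usage: diff_hits <term> <diff_file>
diff_hits() {
  awk -v term="$1" '
    # A new file section starts: remember its header, reset the flag.
    /^diff / { header = $0; reported = 0 }
    # Report each file section at most once, on its first matching line.
    header != "" && !reported && index($0, term) { print header; reported = 1 }
  ' "$2"
}

# Example with the placeholder revision names used in this document.
pushlog_url revisionHashPreviousSimulation revisionHashToday

# Example call once ../changes.diff exists ("suspiciousWord" is a placeholder):
#   diff_hits suspiciousWord ../changes.diff
```

Listing only the file headers gives a short set of candidate files, which can then be checked one by one as described above.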
- Beta simulations
In the Gecko beta simulation document, add the new bug under New Bugs filed for the current day. Use this format:
[prefix]Bug XXX - Bug summary.
[prefix] is either
[Trunk as Beta],
[Trunk as Early Beta] or
[Trunk as Late Beta].
Then copy and paste it at the end of the list of known issues near the top of the document.
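The rules for the bug summary and the format of the document entry can be sketched as two small shell functions. This is only an illustration under assumptions: the names bug_summary and doc_entry are hypothetical, the parts of the summary are assumed to be joined with single spaces, one space is assumed between the bracketed prefix and "Bug", and only the beta simulation suffix is covered (not the version increase one).

```shell
# Hypothetical helper: assemble a bug summary for a beta simulation failure.
# Usage: bug_summary <Perma|Intermittent> <scope> <platform> <failure_line> <version> <merge_date>
# Pass "" for <scope> ("early beta" / "late beta") or <platform> when not applicable.
# Assumption: the parts are joined with single spaces.
bug_summary() {
  start="$1"
  if [ -n "$2" ]; then start="$start $2"; fi
  if [ -n "$3" ]; then start="$start $3"; fi
  printf '%s %s when Gecko %s merges to Beta on %s\n' "$start" "$4" "$5" "$6"
}

# Hypothetical helper: format a "New Bugs" entry for the simulation document.
# Usage: doc_entry <prefix> <bug_number> <summary>
# Assumption: one space separates the bracketed prefix from "Bug".
doc_entry() {
  case "$1" in
    "Trunk as Beta"|"Trunk as Early Beta"|"Trunk as Late Beta") ;;
    *) echo "unknown prefix: $1" >&2; return 1 ;;
  esac
  printf '[%s] Bug %s - %s\n' "$1" "$2" "$3"
}

# Example with placeholder values (XXX, the failure line and the date are not real).
example="$(bug_summary Perma "late beta" "" "test_example.js | placeholder failure line" 68 YYYY-MM-DD)"
doc_entry "Trunk as Late Beta" XXX "$example"
```

Rejecting unknown prefixes keeps the entries in the document consistent with the three prefixes listed above.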
Classify the failures as expected fail and put the bug number in the comment field. These are known important issues (tracking-firefoxXX) which often permafail, so daily or weekly reminders about them as 'intermittent' failures provide little value.
Verify bugs as closed
After the simulations are complete, check if any issues mentioned in the list of known bugs near the top of the Gecko Beta simulation have not occurred again. If they are permafailures, update those bugs as Verified fixed (if currently Resolved fixed) or Resolved Worksforme (if currently open). Then remove the bug from the list of known issues.