Sheriffing/How To/Beta simulations
FOR DEVELOPERS: Don't try to create central-as-beta simulations as artifact builds.
Open the doc with the past beta simulations and known issues - Request the link and edit access in the sheriffs Element channel if you are new to this.
Add the new day for which you’re making the simulation by copying the entry for the last day and modifying the date.
Go to mozilla-central, take the HG link from the latest central push and add it as a base revision in the document.
SILENCING KNOWN PERMAFAILS
- If you need to import patches or back out something, make sure to start with an up-to-date local copy:
hg pull central
hg update central
If a permanent failure is known from previous simulations but not fixed according to the bug, it's helpful to prevent this failure to reduce the need for failure classifications and to provide a better picture when one looks at the whole beta simulation. These are the possibilities:
- Fix on phabricator: If there is a fix hosted on phabricator, import it to mitigate the issue. By default, the moz-phab patch command will import all revisions a patch depends on (has as ancestors). If it's a patch series, import the last patch to also auto-import the previous ones:
moz-phab patch D<number> --apply-to .
The patch number is mentioned at the top of the phabricator page for the patch.
If the dependencies should not be imported, e.g. because they are not needed, they can be skipped like this:
moz-phab patch D<number> --apply-to . --skip-dependencies
- Back out change causing failure: If the bug about the perma failure mentions the bug whose check-ins cause the failures, those can be backed out locally for the beta simulation:
hg oops -esr/er <revisions>
<revisions> are the revisions of the bug whose backout fixes the issue. The backout revisions will have to be deleted with hg histedit once the beta simulations have been created.
- Disable failing test: Similar to how frequently failing tests are disabled. Disadvantage: Additional failures or changes in the failure message in the test get missed.
TRUNK AS EARLY BETA
- Open the console and run the following commands:
hg pull central (needed only if you didn't import or back out anything)
hg update central (needed only if you didn't import or back out anything)
./mach try release -v 104.0b1 --tasks release-sim --migration central-to-beta
If not all the jobs shall run (e.g. to verify a test fix), append --stage-changes to the command, then run
hg commit -m "Early beta sim"
and select the jobs with
./mach try chooser --no-artifact
- -v 104.0b1: Sets the version number to use in the beta simulation. Replace "104" with the version number mentioned in the beta simulation document.
- --tasks release-sim: Activates the tasks which shall run for beta simulations to get scheduled.
- --migration central-to-beta: Activates modification of the configuration to switch from central to beta.
This changes the configuration to the beta simulation, pushes to the Try server and reverts the changes. Open the first Treeherder link in the console; it might take a few minutes for Treeherder to find that job. Deselect the running and the green jobs, check that the classified jobs are visible, and then add the link to the Gdoc in the current date section at Run Links: Trunk as Early Beta.
TRUNK AS LATE BETA
To create a central-as-late-beta simulation, go to the console and use the following commands:
./mach try release -v 104.0b12 --tasks release-sim --migration central-to-beta --migration early-to-late-beta
Note the different version number 104.0b12 compared to the early beta's 104.0b1.
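The early and late beta commands differ only in the version suffix and the extra migration. A minimal sketch that prints (but does not run) the matching command follows. The helper name beta_sim_cmd is hypothetical, and it assumes the b1/b12 suffixes from the examples above apply to other version numbers as well:

```shell
#!/bin/sh
# Hypothetical helper: print the try command for a central-as-beta simulation.
# Usage: beta_sim_cmd <major version> <early|late>
# It only prints the command so it can be reviewed before running it manually.
beta_sim_cmd() {
    major="$1"
    kind="$2"
    case "$kind" in
        early)
            # Early beta: version suffix b1, single migration.
            printf './mach try release -v %s.0b1 --tasks release-sim --migration central-to-beta\n' "$major"
            ;;
        late)
            # Late beta: version suffix b12, plus the early-to-late-beta migration.
            printf './mach try release -v %s.0b12 --tasks release-sim --migration central-to-beta --migration early-to-late-beta\n' "$major"
            ;;
        *)
            echo "usage: beta_sim_cmd <major version> <early|late>" >&2
            return 1
            ;;
    esac
}

# Example with the version number used on this page.
beta_sim_cmd 104 early
beta_sim_cmd 104 late
```

Replace 104 with the version number mentioned in the beta simulation document.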
Add the link in the document after you deselect running and green jobs, in the current date section at RUN LINKS Trunk as Late Beta.
Also, in the New Bugs section, you can write TBD (“To Be Done”) until the jobs on your try pushes have run and you can see whether there are failures.
NOTE: When creating a bug, add tracking only for permafails. Low-frequency intermittents, and frequent intermittents which also affect mozilla-central, don’t need that flag; they should still be filed but not added to the Google Doc about the simulations.
When done, clean up the repository:
hg purge: Removes the backup files generated when the command to create the beta simulations modified configuration files.
After the beta simulations are done:
- Remove the “TBD” from the bug list for today’s simulations if there is no new failure and replace it with “None”
- If there were known issues before the beta simulations which you didn’t fix with a backout or patch import and which are gone, open those bugs (should be “Resolved Fixed”) and set the status_firefox<version number> to “verified” and change “Resolved Fixed” to “Verified Fixed”. Also add a comment that it has been verified fixed and add the link to today’s simulation.
VERSION INCREASE SIMULATION
To check if the next increase of the application version will cause failures, open the gdoc and search for a previous Version Increase link.
In your console, import the patch from that Version Increase link:
hg import link_url
- If there are no conflicts, continue with the next line. If you see conflicts, they are likely caused by an earlier increase of the version number in mozilla-central to the version which your patch tries to increase to. The version has to be increased compared to the number in the mozilla-central code (the "local" part of the conflict).
./mach try fuzzy -q '^test- !browsertime !raptor'
hg histedit to remove the commit for the version increase.
BETA AS RELEASE SIMULATION
This should be run weekly after all the uplifts to beta for the Thursday beta are done. Skip it if it's the last week of the Nightly cycle, because mozilla-beta has then already been merged to release. It's based on mozilla-beta and simulates the behavior when it switches to mozilla-release.
hg pull beta
hg update beta
./mach try release -v 103.0 --tasks release-sim --migration beta-to-release
103.0 should be replaced with the current version in mozilla-beta (should be one version less than in mozilla-central, which is mentioned in the Gecko XX beta simulation document). Issues found should be reported similarly to bugs for beta. Differences:
- Set status-firefoxXX to affected for the two highest version numbers.
- Set tracking to "?" for the second-highest version number.
- In the bug summary, use the merge date for the mozilla-beta to mozilla-release merge which gets mentioned near the top of the Google document about the simulations. E.g.
when Gecko 68 merges to release on 2019-07-01
- In the list of new bugs, prefix the bug with "[Beta as Release]".
- In the list of currently open bugs at the top, add them into the section "Beta as Release".
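The version for the beta-as-release command can be derived from the mozilla-central version, since mozilla-beta should be one version behind. A small sketch under that assumption follows. The helper name beta_as_release_cmd is hypothetical, and the command is only printed, not executed:

```shell
#!/bin/sh
# Hypothetical helper: print the beta-as-release try command.
# Usage: beta_as_release_cmd <mozilla-central major version>
# Assumes mozilla-beta is exactly one major version behind mozilla-central.
beta_as_release_cmd() {
    central_major="$1"
    case "$central_major" in
        ''|*[!0-9]*)
            echo "usage: beta_as_release_cmd <mozilla-central major version>" >&2
            return 1
            ;;
    esac
    beta_major=$((central_major - 1))
    printf './mach try release -v %s.0 --tasks release-sim --migration beta-to-release\n' "$beta_major"
}

# With mozilla-central at 104, this prints the command for 103.0.
beta_as_release_cmd 104
```

Check the printed version against the current version in mozilla-beta before running the command from an updated mozilla-beta checkout.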
Bugs for permanent or frequent issues found for simulations
Don't use treeherder's interface to create bugs for issues found during the simulations if they are permafailing or frequent (at least one retrigger has the same failure), but create the new bug directly at Bugzilla because more of its features are needed.
- Pick the product and component in which the bug shall be created. Use the same guidelines as for intermittent failures.
- If the issue is permanent, start the bug summary with "Perma", else with "Intermittent".
- If it affects either only early or late beta, write early beta or late beta.
- If only one platform or platform type (e.g. DevEdition) is affected, mention the name of the platform on which it fails.
- Then append the failure line, followed by
- for beta simulations:
when Gecko XX merges to Beta on YYYY-MM-DD. XX is the current version number for mozilla-central as used in the "Target" dropdown and "status-firefoxXX" field. The date is the day mozilla-central gets merged to mozilla-beta. It's mentioned in the Gecko beta simulation document to which you should have access.
- for version increase simulations:
when Gecko gets increased to version XX+1 on YYYY-MM-DD. XX is the current version number for mozilla-central as used in the "Target" dropdown and "status-firefoxXX" field. The date is the day the version number gets increased (= same day on which mozilla-central merges to mozilla-beta). It's mentioned in the Gecko beta simulation document to which you should have access.
- status fields: Near the bottom of the page to create the bug, click on "Set Bug Flags".
- Set status-firefoxXX with the highest version number (= mozilla-central) to affected.
- Set all other status flags starting with status-firefox to unaffected. That way it's obvious on what branches the issue occurs.
- tracking field: Next to the status-firefoxXX field which you set to affected there is a tracking-firefoxXX one (same version number). Set it to ?. Remove the text added to the textarea to explain why you request it to be tracking.
- Treeherder link: Write central-as-beta simulation or version increase simulation and put the link to the simulation after it. That link must also show classified failures.
- Add a link 'How to run such simulations' which points here to let developers run these on their own.
- Failure log: Write Failure log: and paste a link to the log of such a failure.
- Copy and paste relevant lines from the log.
- CC Aryx.
- Identify what caused the failures. If you identified the bug, put it into the Regressed By field.
- Beta simulations
- Get the relevant file from the failure line. Search hg.mozilla.org for it at the top right. Check the changes since the last beta simulation for correlation.
- If that didn't help, get a list of all changes since the last beta simulation. Construct a URL like this:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=revisionHashPreviousSimulation&tochange=revisionHashToday
The revision hashes are the hexadecimal strings at the end of each base revision mentioned in the Gecko Beta simulation document, e.g. 98b223de0543 (as shown in Treeherder). Look through the list for anything suspicious and inspect the change if necessary.
- Last resort: Get all code changes since the last beta simulation:
- In the command line, run:
hg diff -r revisionHashPreviousSimulation::revisionHashToday > ../changes.diff
- A file changes.diff has been created in the folder which contains the mozilla-central folder. Open it (the file is usually 3-10 MB) and search for suspicious words from the failure line or shortly before.
- If you found something interesting, scroll up to find the file (line starts with diff) and continue as described above if you found new changes to the file which causes the failure.
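The pushlog URL and the diff command described above both take the same two revision hashes, so they can be generated together. A minimal sketch follows. The helper names are hypothetical, and the second hash in the example is a placeholder rather than a real revision:

```shell
#!/bin/sh
# Hypothetical helpers: build the pushlog URL and the diff command for the
# changes between two beta simulations.
# $1 = base revision hash of the previous simulation
# $2 = base revision hash of today's simulation

pushlog_url() {
    printf 'https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=%s&tochange=%s\n' "$1" "$2"
}

changes_diff_cmd() {
    # Prints the command only; run it manually inside the mozilla-central checkout.
    printf 'hg diff -r %s::%s > ../changes.diff\n' "$1" "$2"
}

# 98b223de0543 is the example hash from this page; 0123456789ab is a placeholder.
pushlog_url 98b223de0543 0123456789ab
changes_diff_cmd 98b223de0543 0123456789ab
```

Take both hashes from the base revisions listed in the Gecko Beta simulation document.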
In the Gecko beta simulation, add the new bug under New Bugs filed for the current day. Use this format:
[prefix]Bug XXX - Bug summary.
[prefix] is either
[Trunk as Beta],
[Trunk as Early Beta] or
[Trunk as Late Beta].
Then copy and paste it at the end of the list of known issues near the top of the document.
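The bug summary rules and the document entry format can be sketched as two small helpers. The helper names are hypothetical, separating the summary parts with single spaces is an assumption, and the failure line, bug number and date in the example are placeholders:

```shell
#!/bin/sh
# Hypothetical helper: compose a bug summary for a beta simulation failure.
# $1 = "Perma" or "Intermittent"
# $2 = "early beta" or "late beta", or empty if both are affected
# $3 = platform name, or empty if not platform specific
# $4 = failure line
# $5 = mozilla-central version (XX)
# $6 = merge date (YYYY-MM-DD)
sim_bug_summary() {
    kind="$1"; scope="$2"; platform="$3"; failure="$4"; version="$5"; merge_date="$6"
    summary="$kind"
    if [ -n "$scope" ]; then summary="$summary $scope"; fi
    if [ -n "$platform" ]; then summary="$summary $platform"; fi
    printf '%s %s when Gecko %s merges to Beta on %s\n' "$summary" "$failure" "$version" "$merge_date"
}

# Hypothetical helper: format the entry for the Gecko beta simulation document.
# $1 = prefix such as "[Trunk as Late Beta]", $2 = bug number, $3 = bug summary
sim_doc_entry() {
    printf '%sBug %s - %s.\n' "$1" "$2" "$3"
}

# Example with placeholder values for the failure line, bug number and date.
summary="$(sim_bug_summary Perma 'late beta' '' 'TEST-UNEXPECTED-FAIL | placeholder' 104 YYYY-MM-DD)"
sim_doc_entry '[Trunk as Late Beta]' XXX "$summary"
```

For version increase simulations, the trailing part of the summary differs as described in the summary rules, so the format string would need to be adjusted.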
Classify the failures as expected fail and put the bug number in the comment field. These are known important issues (tracking-firefoxXX) which often permafail; daily or weekly reminders about them as 'intermittent' failures provide little value.
Confirm patches as fixing the issue
If patches to fix central-as-beta issues got imported for a simulation and the issue hasn't been observed for the simulation, please comment in the bug that the patch fixes the issue and link to the simulation.
Verify bugs as closed
After the simulations are complete, check if any issues mentioned in the list of known bugs near the top of the Gecko Beta simulation have not occurred again. If they are permafailures, update those bugs as Verified fixed (if currently Resolved fixed) or Resolved Worksforme (if currently open). Then remove the bug from the list of known issues.