Release Management/Release Process Checklist Documentation
The goal of this page is to document the Release Process Checklist being used by Firefox Release Managers to track each release throughout the cycle. Any changes to this documentation or the checklist should be reflected in both documents.
Given the nature of how nightly builds are created and shipped, the role of the release manager during this phase of the cycle skews much more heavily to the monitoring aspect rather than release mechanics.
Prior to the start of the cycle, the following tasks need to be performed:
- Create a milestones document. This will ideally be ready at least two weeks prior to the start of the cycle, with feedback received from key stakeholders (QA, RelEng, RelMan) prior to wider publishing. There is not currently a template for this document because it’s still evolving from release to release, so it is advised to use the document from the previous release as a starting point.
- Verify that the release calendar is up to date. This can be done in conjunction with the milestones document or after it’s published.
- Ensure that the release has a Regression Engineering Owner identified. Ultimately, ownership of this task falls within the Firefox engineering org. However, it is good for the release manager to ensure that this doesn’t get stalled.
On a daily (or thereabouts) basis, the following items should be monitored:
- Pending tracking-firefoxXX requests. These are bugs which have been nominated for extra tracking during the cycle. A decision needs to be made about whether the bug indeed warrants that additional attention, and possibly even blocker status. At the beginning of the cycle, bug queries will need to be created for this purpose. Once that query exists, the item in Column A can be updated to a link. (TODO: document tracking decision making process)
- Open tracking-firefoxXX+ and blocking bugs. The main purpose of this step is to ensure that bugs falling into these categories don’t stagnate. Where possible, release managers should ensure that the bug is in the right component, has an appropriate assignee (for either investigating or fixing, depending on the stage of the bug), and is in general making progress (and poking if it doesn’t appear to be). In the case of blocker bugs, expediting a fix or backout may become necessary. At the beginning of the cycle, bug queries will need to be created for this purpose. Once that query exists, the item in Column A can be updated to a link.
- Newly-filed regression bugs. This can be done in conjunction with the Regression Engineering Owner of the release. Bug queries should be available from the Platform wiki page. New regressions are generally the most important to track on a regular basis, but the carry-over regression lists can also surface bugs which have fallen off the radar which may require reprioritization.
- Review stability rates and reported crash spikes. Spikes may be detected by automation, which sends email alerts that are usually monitored by the stability team. Release managers may also want to watch for these spikes and help file bugs. In addition, keep an eye on overall stability by monitoring the Mission Control rate and the top crashes on crash-stats.
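The per-release tracking queries mentioned in the daily items above could be generated rather than built by hand each cycle. The Python sketch below builds search URLs for a given version and flag value. It assumes the Bugzilla REST search endpoint, a custom field named cf_tracking_firefoxNN, and a resolution value of "---" for open bugs; these names and the helper tracking_query_url are illustrative assumptions, not part of the checklist itself.

```python
from urllib.parse import urlencode

# Assumed REST search endpoint; adjust if the queries live elsewhere.
BUGZILLA_REST = "https://bugzilla.mozilla.org/rest/bug"


def tracking_query_url(version: int, flag_value: str, open_only: bool = True) -> str:
    """Build a search URL for bugs whose tracking flag has the given value.

    flag_value is "?" for pending nominations and "+" for tracked bugs.
    """
    # Assumed custom-field naming scheme: cf_tracking_firefox<version>.
    params = {f"cf_tracking_firefox{version}": flag_value}
    if open_only:
        # Assumption: an unresolved bug is matched with resolution "---".
        params["resolution"] = "---"
    return f"{BUGZILLA_REST}?{urlencode(params)}"


if __name__ == "__main__":
    print(tracking_query_url(66, "?"))  # pending tracking requests
    print(tracking_query_url(66, "+"))  # open tracked bugs
```

The resulting URLs can be pasted into Column A of the spreadsheet once they have been checked against the real flag names for the release.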
On a weekly basis, a review of the Firefox Trello board should be done to monitor the status of features currently targeting that release. This can be done in conjunction with the weekly cross-functional meeting or Nightly feature deep-dive meeting. Release managers are also encouraged to watch the list for their release in order to receive notifications for any changes in status.
The release manager should also review the test plans for features targeting their releases to become familiar with how the feature works and how we intend to ensure it is of sufficient quality to ship in that release. This also gives an opportunity to provide feedback on the risk analysis and mitigations put in place.
Once mid-Nightly and pre-Beta test reports are emailed by QA, the release manager should check the newly-reported bugs to make sure flags are set correctly and the new issues have been addressed by engineering and product teams and prioritized accordingly. Release managers can help make sure that all of the pertinent information is in place to make decisions about whether the feature is ready to ship to Beta or whether it should remain Nightly-only for another cycle for more testing and development.
Near the end of the cycle, the following actions must be performed:
- Send Nightly soft freeze reminder to dev-platform & firefox-dev. This should be done a week before the start of the soft freeze to remind developers that the window for landing riskier fixes is coming to a close until after the version bump.
- Create the Release Process spreadsheet. This must be done prior to the first merge of mozilla-central to mozilla-beta in time for the b1 build. Duplicate the existing template.
- Prepare Beta release notes. This must be done before the release goes to the wider Beta audience (after the final merge to Beta and Nightly version bump has happened). (TODO: document the release note creation process)
Once a release moves to the Beta channel, the daily tasks performed during the Nightly cycle will continue to be carried out as new bug reports come in from a wider audience and new features move through the QA cycle towards shipping.
There is also an added triage step during Beta: monitoring the “missed uplifts” email or queries to find issues that are fixed in Nightly but still affect Beta. The release owner should check these issues to assess whether uplift is a good idea. If not, then the issue should be marked wontfix for Beta.
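As a rough illustration of what a “missed uplifts” query selects, the sketch below filters a list of bug records for those fixed on Nightly but still marked as affecting Beta. The record layout, the status field names passed in, and the set of values treated as fixed are all assumptions for the example; the real email or query may use different criteria.

```python
from typing import Dict, Iterable, List

# Assumed status values that count as "fixed on Nightly".
FIXED_VALUES = {"fixed", "verified"}


def missed_uplifts(
    bugs: Iterable[Dict[str, str]],
    nightly_field: str,
    beta_field: str,
) -> List[Dict[str, str]]:
    """Return bugs fixed on Nightly that are still marked as affecting Beta."""
    return [
        bug
        for bug in bugs
        if bug.get(nightly_field) in FIXED_VALUES
        and bug.get(beta_field) == "affected"
    ]


if __name__ == "__main__":
    # Hypothetical records; field names are placeholders for the real flags.
    sample = [
        {"id": "1001", "status_nightly": "fixed", "status_beta": "affected"},
        {"id": "1002", "status_nightly": "fixed", "status_beta": "wontfix"},
        {"id": "1003", "status_nightly": "affected", "status_beta": "affected"},
    ]
    for bug in missed_uplifts(sample, "status_nightly", "status_beta"):
        print(bug["id"])
```

Each bug returned is a candidate for an uplift request or for an explicit wontfix decision on Beta.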
To ship a release, a series of steps must be taken with various roles representing multiple teams expected to contribute. In the Release Process spreadsheet, the Beta Checklist template tab should be duplicated for each new release for proper tracking. After a release is shipped, the tab can be hidden in order to minimize clutter.
- Review tracking-firefoxXX+ bugs and approval requests. As noted above, regular triage of tracking+ bugs and uplift approval requests must be performed. Approval requests can be viewed via the Release Tracking Report on Bugzilla.
- Verify all approved bugs landed on mozilla-beta. After patches are approved for uplift, they must be pushed to the mozilla-beta repository. This task can be performed by the Tree Sheriffs (#sheriffs on IRC) or by the release manager themselves depending on their comfort level. The longer-term goal is to automate the process.
- Set up builds in ship-it (Desktop, DevEdition, Fennec). Ship-it is the tool used for scheduling the release process, starting with the creation of the builds (picking a revision, verifying the version number, etc) and the eventual pushing of those builds to the release mirrors and website. Access to ship-it requires being connected to the Mozilla VPN.
Standard practice is that the first two Betas are created for the DevEdition release only during the soft freeze week when mozilla-central and mozilla-beta are sharing a common Gecko version. The first desktop Firefox Beta release is therefore b3. Also, due to limited QA resources, Fennec betas are typically only created for odd-numbered builds. However, it is within the release manager’s discretion to create an even-numbered Fennec beta build when warranted. In that situation, QA must be informed of the coming build so they can plan accordingly.
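The build convention in the preceding paragraph can be written down as a small rule. This is a minimal Python sketch, assuming that Fennec builds also start at b3 (since b1 and b2 are described as DevEdition only) and that the release manager's discretionary even-numbered Fennec build is expressed as a flag. The function name products_for_beta and the force_fennec parameter are hypothetical.

```python
from typing import List


def products_for_beta(beta_number: int, force_fennec: bool = False) -> List[str]:
    """Return the products normally built for Beta number `beta_number`."""
    if beta_number < 1:
        raise ValueError("beta numbers start at 1")

    # b1 and b2 are built for DevEdition only during the soft freeze week.
    if beta_number < 3:
        return ["DevEdition"]

    products = ["Desktop", "DevEdition"]
    # Fennec is typically built for odd-numbered betas only; an
    # even-numbered build is at the release manager's discretion.
    if beta_number % 2 == 1 or force_fennec:
        products.append("Fennec")
    return products


if __name__ == "__main__":
    for n in range(1, 7):
        print(f"b{n}: {', '.join(products_for_beta(n))}")
```

When force_fennec is used for an even-numbered build, QA still needs to be informed ahead of time so they can plan for it.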
- Treeherder tests green/starred. Treeherder is the primary dashboard for monitoring the results of builds and tests. It is the responsibility of the sheriffs to monitor the Beta repository and ensure that tests are passing, though the release manager can also keep an eye on things. Builds should not be started until CI has passed to avoid shipping defective code to end users.
- Start builds from ship-it. Once CI results are good, the process of generating the builds (go-to-build) is started by clicking the promote button for each release.
- Confirm builds have started. Emails will be sent to the release-signoff mailing list once builds have started and a notification will be posted in the #releaseduty IRC channel.
- Confirm notification sent when builds finish. An automated email will be sent to the release-signoff mailing lists once the release promotion process is finished.
- Schedule push to CDN. After the initial builds are completed, they will be located in the /candidates directory of the main Mozilla FTP server. Prior to widespread shipping, the builds must also be pushed out to CDN mirrors. Because it is difficult and time-consuming to un-ship releases once they have been pushed to CDNs, this step must be performed after it is confirmed that the created builds are satisfactory. For Beta releases, this should wait until after the update-verify tasks have passed to ensure the integrity of the partial updates created during the release promotion process. The CDN push is started from ship-it by clicking the push button for each release. This is typically the final step for the go-to-build day itself.
- Confirm notification sent when CDN push finishes. An automated email will be sent to the release-signoff mailing lists once the push to cdntest has finished successfully.
- QA manual testing signoff for Desktop/DevEdition. Ask in the #qa-coordination Slack channel if there are questions about progress.
- Update test on beta-cdntest. Once QA has performed update testing on the cdntest update channel, the builds are ready to ship.
- Push Desktop/DevEdition to Beta. The release is scheduled in ship-it by pushing the ship button for each release.
- Signoff on scheduled rule change in Balrog. Once the push to Beta is completed, the new update rules will need signing off in Balrog (where all update rules are managed). Access to Balrog requires being connected to the Mozilla VPN. Note that Desktop Firefox releases are signed off on the Beta channel and DevEdition releases are signed off on the Aurora channel. Confirm in the #releaseduty IRC channel that someone from RelEng is available to sign off on the rule change, and verify that the scheduled rollout % is consistent with what is specified in the milestones document.
- Verify that the Balrog rule changes are live. In order to verify that the rule changes have taken proper effect, refresh the page to confirm that the changes are no longer showing as pending.
- QA manual testing signoff for Fennec. Ask in the #qa-coordination Slack channel if there are questions about progress.
- Push Fennec to Play Store. This is done within ship-it.
- Verify that new release is live on Google Play at desired rollout %. By default, Beta is published to the Play Store at a 10% rollout. After pushing, the rollout needs to be adjusted to the % specified in the milestones doc.
Once we’ve reached a point in the cycle where 100% rollout is desired as the default behavior, a commit like the one below should be pushed to the Beta repository to update the default behavior.
- Email release-signoff with confirmation that updates are live. This is generally just a reply to QA’s “please push to Beta/Aurora” email. Be sure to note the rollout % as well.
- Update tests on Aurora & Beta. Final verification by QA that updates are working on the live update channels.
This tab is for tracking bugs which are being tracked for possible uplift to the mozilla-release repository for RC builds. The primary objectives are:
- Track whether there are any drivers for a respin of the RC builds during RC week.
- Assess whether Desktop, Fennec, or both are affected by the issues noted.
- Verify that all drivers have had an explicit decision made.
The RC checklist, like the Beta checklist, should be cloned for each RC build created (RC1, RC2, etc). Most of the steps for the RC checklist are the same as the Beta checklist, but with a few notable differences as discussed below.
- In ship-it, click "release" to set up the build. The release date/ETA should be 1pm UTC (6am Pacific) for the projected release date unless otherwise arranged. Build 1 will be your RC1. If you need an RC2, cancel build 1 and start RC2 (check these steps with ryan/releng before doing it).
- Sample partials (for 66.0 RC): 65.0.2build1,65.0.1build2,65.0build2,64.0.2build1,66.0b14build1
- Update test on beta-cdntest; QA will email release-signoff to push Desktop to Beta at 100%. Relman can then click "push RC" in ship-it. (RC builds can be pushed to Beta users once they receive sign-off from QA.) Pushing to release users is covered by the Go-Live Checklist elsewhere.
- Email release-signoff to push Fennec to Play Store at 5%. Because of how the Play Store works, we are unable to ship Fennec RC builds to Beta users prior to release like we do for Desktop Firefox builds. In order to get pre-release testing coverage, Fennec RC builds are therefore pushed to release users on the Play Store at 5% once they receive QA sign-off.
- WNP testing on release-localtest. RC week is when testing of the What’s New Page for the new release commences. This is done on the release-localtest channel by QA and a sign-off email will be sent once testing has been completed. It’s not necessary for every RC build to go through this testing as long as there has been a successful sign-off by the end of RC week.
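The sample partials entry in the checklist above is a comma-separated list of version and build pairs. Before pasting such a string into ship-it, it can be sanity-checked with a short parser. The Python sketch below infers the format purely from the 66.0 RC sample (a version such as 65.0.2 or 66.0b14 followed by buildN); the actual format accepted by ship-it may be broader, and the names Partial and parse_partials are hypothetical.

```python
import re
from typing import List, NamedTuple


class Partial(NamedTuple):
    version: str  # e.g. "65.0.2" or "66.0b14"
    build: int    # e.g. 1 for "build1"


# Pattern inferred from the sample string only (an assumption).
_PARTIAL_RE = re.compile(r"^(\d+\.\d+(?:\.\d+)?(?:b\d+)?)build(\d+)$")


def parse_partials(text: str) -> List[Partial]:
    """Split a partials string and validate each entry."""
    partials = []
    for item in text.split(","):
        item = item.strip()
        match = _PARTIAL_RE.match(item)
        if match is None:
            raise ValueError(f"unrecognized partial: {item!r}")
        partials.append(Partial(match.group(1), int(match.group(2))))
    return partials


if __name__ == "__main__":
    sample = "65.0.2build1,65.0.1build2,65.0build2,64.0.2build1,66.0b14build1"
    for partial in parse_partials(sample):
        print(partial.version, partial.build)
```

A typo such as a missing build number raises ValueError instead of being silently passed along.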
RC week is also the time to finalize release notes and begin gathering feedback from the #release-notes Slack channel.
The What’s New Page has been something which has suffered from coordination problems in the past, since it requires contributions from Marketing, Localization, Web Development, and QA. A meeting should be held a few weeks prior to Go-Live to establish a timetable for the steps listed in the checklist.
Similar to the Beta and RC checklists, there are many common steps which have been previously covered above. Items specific to Go-Live are noted below.
Prior To Launch Day
- Gather feedback for release notes. As noted in the RC Checklist section, the release notes will go through review and revision by the UX and Marketing teams. Once the draft is ready to be shared, do so in the #release-notes Slack channel and then incorporate the revisions provided once ready.
- Request Legal signoff on relnotes. This is usually requested by the UX team. Confirm that it was requested and that signoff was granted (must be confirmed prior to launch day).
- Check for crash spikes with RC builds. We must verify that there are no obvious crash spikes in the pre-release data from the RC builds.
- Schedule push to CDN (ship-it). This should be done on the day prior to Go-Live so that the release is staged on the mirrors and verified working prior to launch day.
- Schedule push to release at 25% (ship-it). Go-Live time is usually 6am PT on launch day, but this can be done ahead of time with a scheduled rule change in Balrog.
- Make release notes live. There is a 15-20min lag between making the change in Nucleus and the live website picking up this change, so plan to do this 15-30min prior to go-live.
- Sign-off on scheduled rule change in Balrog. Assuming that the change is scheduled for 6am PT, this can be signed off ASAP to avoid unnecessary delays at go-live time.
- Bump Fennec update rate to 25% in Play Store. Should have been at 5% previously.
- Verify that new release is live on mozilla.org. Verify that download requests are pointing to the new version. This can probably be moved to the delivery dashboard.
- Email release-drivers & release-signoff that updates are live at 25%. Once the release is confirmed to be live, send an email to release-signoff & release-drivers confirming for that audience that the release has been pushed. Also confirm the rollout %.
- Update tests on Release; WNP testing on Release. QA will send a sign-off email when this is completed.
- Verify that the release notes are live.
- Verify versions in firefox_versions.json and mobile_versions.json. This can probably be moved into a step where we verify a number of things on the delivery dashboard.
- Security advisories go live. This will be handled by the security team post-launch.
- Email announce list. Send an email confirming the new release, following the general form shown on the release wiki page.
- Schedule Desktop update rate to 0% in Balrog after 24 hours. It is recommended to do this on launch day to avoid forgetting about it the day after. This change can be made in Balrog by either RelEng or RelMan, though both will need to sign off on the rule change afterwards.
- (Launch Day +1) Verify Desktop update rate at 0% in Balrog.
- (Launch Day +1) Email release-signoff & release-drivers to confirm 0% throttling. Once updates are confirmed to be throttled, email release-signoff & release-drivers confirming that the change is live. This can be a reply-all to the previous push emails to keep the history in one thread.
- (Launch Day +2) Review release crash rates and incoming bugs for new blockers. There won’t be much new data yet two days after release, but any obvious crash spikes or critical regressions will likely be known.
- (Launch Day +2) Bump Desktop update rate to 100% in Balrog. If there are no known quality issues, full rollout to the Desktop release population can proceed. Change the rollout value in Balrog and ping in the #releaseduty IRC channel to get RelEng sign-off of the rule change.
- Email release-signoff & release-drivers to confirm full rollout. Once Desktop updates are bumped to 100%, email release-signoff & release-drivers to confirm. This can be a reply-all to the previous push emails to keep the history in one thread.
- Ship new Desktop release in Ubuntu Snap Store. This can be done once the Desktop update rate is bumped to 100%. Documentation for managing Snap releases.
- Bump Fennec update rate to 99% in Play Store. The Fennec update rate is bumped to 99% because the Play Store doesn’t provide a mechanism for un-releasing a version once it reaches 100% deployment. Going to 99% allows most users to receive the new release while preserving the ability to throttle if a new issue is found. The rate is typically bumped to 100% the following week if no major quality issues have arisen or with the release of the first Fennec dot release.
- Upload Fennec in the Samsung store. This is handled manually by Sylvestre at the moment.
- (Monday After Launch Week) Bump Fennec update rate to 100% in Play Store. If no major quality issues have arisen with the release and a dot release is not being actively planned, bump the rollout to 100% the week after launch.
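The rollout percentages scattered through the Go-Live items above can be laid out as a dated schedule for a given launch day. This Python sketch is one reading of the checklist, not an official timetable: it assumes the Fennec bump to 99% happens on Launch Day +2 alongside the Desktop bump to 100% (the checklist does not give a day for it) and that "Monday After Launch Week" means the first Monday strictly after launch day. The function name rollout_schedule is hypothetical.

```python
from datetime import date, timedelta
from typing import List, Tuple

# (date, product, rollout percentage)
Step = Tuple[date, str, int]


def rollout_schedule(launch: date) -> List[Step]:
    """Return the planned rollout steps for a release launching on `launch`."""
    # date.weekday() returns 0 for Monday, so this is always 1 to 7 days ahead.
    next_monday = launch + timedelta(days=7 - launch.weekday())
    return [
        (launch, "Desktop", 25),
        (launch, "Fennec", 25),                        # previously at 5% from RC week
        (launch + timedelta(days=1), "Desktop", 0),    # throttle after 24 hours
        (launch + timedelta(days=2), "Desktop", 100),  # if no known quality issues
        (launch + timedelta(days=2), "Fennec", 99),    # assumed day; see lead-in
        (next_monday, "Fennec", 100),                  # if no dot release is planned
    ]


if __name__ == "__main__":
    # Example launch on a Tuesday.
    for day, product, percent in rollout_schedule(date(2019, 3, 19)):
        print(f"{day.isoformat()}  {product:<8} {percent}%")
```

Every step remains conditional on the quality checks described in the checklist; the schedule only shows the default dates when nothing blocks the rollout.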
This may not be the best long-term place to track the off-train rollouts affecting a given release (i.e. Normandy, GoFaster, SHIELD studies, etc), but until a better dashboard exists, this provides a place to keep track of things. More information on off-train releases.
Dot Release Uplifts
Similar to the RC Uplifts tab. The primary purpose of this tab is to track any bugs driving a dot release, bugs which are under consideration to ride-along with a dot release if one is created, and to assess which products are affected by any drivers.
Dot Release Checklist
The checklist for dot releases is essentially a combination of the RC and Go-Live checklists, and the items should be treated mostly the same (except under chemspill situations where an accelerated timetable may apply).
Of note, however, is the need for the release manager to email the release-drivers list prior to the creation of the builds (ideally shortly after the decision is made to go forward with the release) to notify all stakeholders of the forthcoming release. Also, the release manager should verify that the rollout percentages in Balrog & Google Play for the current release are set as expected (taking into account any blocking quality issues) to avoid unexpected fallback versions when the new release ships. Finally, if there are security fixes being included in the release, email abillings (or whoever from the security team handles CVEs and security advisories) to ensure that they are aware of the bugs being fixed.