Release Management/Rapid Betas
⚡ Warning: The content of this page is obsolete and kept for archiving purposes of past processes.
This plan was implemented in 2012. This page is kept just in case.
Contents
- 1 Benefits of Daily Betas
- 2 Proposed Timeline
- 3 In-Product
- 4 Automation
- 5 QA
- 6 Support
- 7 RelEng/IT
- 8 Crash-Kill Team & Socorro
- 9 L10N
- 10 Release Management
- 11 Metrics
- 12 Product Marketing & PR
- 13 Project Status
Benefits of Daily Betas
- shorter verification turn-around time
- freed up RelEng/QA resources
- late-breaking beta issues can be much more easily resolved
- more user testing of the latest bits over the 6 week cycle
Proposed Timeline
- Through 4/23
- Consider major pain points, begin scoping work.
- 4/24-6/4
- Complete scoping work, begin implementation
- Communicate the plan to Mozilla community
- Move to Tuesday go, Wednesday ship? (see QA questions below)
- Any necessary in-product changes should land while FF15 is on m-c
- 6/5-7/16
- Complete implementation, begin process testing on Nightly/Aurora
- Week of 7/17
- Push out our first daily beta while FF15 on mozilla-beta
- Update: We're now targeting 10/9 for the launch date for Rapid Betas, given higher B2G priorities in involved groups
In-Product
Questions
- What do we do about non-admin users? What's the current experience? Can we back off on offering them an update by 7 days?
- Should be possible by only incrementing the update xml's appVersion attribute once a week; see bug 753103
- Will the user be prompted at all to "apply the update now"? Can we disable that?
- If the user's browser downloads an update, then the user doesn't close and open their browser for 3 days, what build will be applied?
- Is the user going to be prompted every day to update, or will it be truly silent, so that the user won't notice it until the next day?
- Will our current beta users be happy with the additional bandwidth of potentially downloading updates every day?
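The once-a-week appVersion idea noted above (bug 753103) can be illustrated with a minimal sketch. It assumes builds are grouped into 7-day buckets counted from the start of the beta cycle; the function name, the "15.0b1"-style label, and the example dates are hypothetical and are not taken from the actual update server.

```python
from datetime import date


def weekly_app_version(base_version: str, cycle_start: date, build_day: date) -> str:
    """Return the appVersion label to advertise for a beta built on build_day.

    The label only changes every 7 days, counted from cycle_start, so daily
    builds made within the same week share one appVersion.
    """
    if build_day < cycle_start:
        raise ValueError("build_day is before the start of the beta cycle")
    week_index = (build_day - cycle_start).days // 7
    # Hypothetical label scheme: "<base>b<week number>", e.g. "15.0b2".
    return f"{base_version}b{week_index + 1}"


cycle_start = date(2012, 7, 17)  # example date only
for day in (date(2012, 7, 17), date(2012, 7, 20), date(2012, 7, 24)):
    print(day.isoformat(), weekly_app_version("15.0", cycle_start, day))
```

With a scheme like this, a client that only prompts when appVersion changes would see at most one prompt per week, even though a new build is published every day.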
Blockers
- Background updates will be landing in FF15.
- Feature page
- Tracking bug 307181, bug 753103
- Telemetry bugs (for getting uptime data): bug 727184 - Increase granularity of telemetry uptime measurement, bug 733591 - Send uptime with telemetry data
- Decreasing prompt level since download (if you don't close the browser): bug 754470 - Once we have daily betas, we should change the beta channel to prompt users about a downloaded update once a week
Automation
Previous Proposal
QA
- Post: Perform triage of failed tests when they occur
- Post: Continue to perform weekly manual testing
- Pre: Decide on subset of automated testing to run daily
Automation
- Pre: Receive and pass off token w/ RelEng (go and finish- could use Rabbit MQ for messages between the two automation harnesses)
- Pre: Hooking all these automated tests into a harness, and report back results - blocked on new harness, updating tests (Q2)
- Pre: Automating 4 manual tests (Flash video, Silverlight/Netflix video, Switch-to-tab, Downloading a file)
- Pre: Automating update testing with plugins/add-ons installed
- Pre: Creation of VMs/system images for profiles below
Automation (not blockers):
- Pre: Maybe automating a crash report test (not a blocker, we check for this as part of crash-kill)
- Pre: Maybe smoke testing really basic website functionality (should be covered by update testing in SoftVision)
- Sync Smoketest scenario would be: create an account, add a second client to that account, then sync data between the clients. Mozmill can do the account creation/add client. TPS can do the data syncing. TPS can run Mozmill tests. However, we don't have this scenario currently automated.
Release Management
- Pre: List of platforms, plugins, and add-ons for update testing
- Pre: Talk to engagement about new quality bar
- Pre: Thinking about whether any top sites need to be tested live - can we get overnight QA test this?
- Pre: Crash reporting volume on a cron job?
- Post: we need to fix any test issues preventing automated daily sign-off
Addons & Plugin Automated Update testing
- Create profiles as outlined below and run automated tests checking updates & startup with these profiles installed.
- Based on information gathered here using these top add-ons.
- Bug on Token passing between QA/Automation & RelEng Automation (bug 764042)
- Suggested Profiles:
- Each of the 4 profiles should have several media plugins, one AV plugin, and a couple of toolbars
- Three of the profiles should have a mix of AMO and Non-AMO add-ons
- One of the profiles should have older versions of the add-ons
- Profile 1:
- Media Plugins: Adobe Acrobat Reader, Java Virtual Machine, Microsoft Silverlight, Adobe Flash Player, Quicktime (itunes integration)
- AV Plugin: AVG
- Toolbars: Skype, Ask, AVG
- Addons: AVG Safe Search, Skype, Java Quick Starter, Babylon, TestPilot, DownloadThemAll, NoScript, GreaseMonkey
- Profile 2:
- Media Plugins: Adobe Acrobat Reader, Java Virtual Machine, Microsoft Silverlight, Adobe Flash Player, RealPlayer
- AV Plugin: avast!
- Toolbars: Yahoo! Toolbar, SweetIM Toolbar
- Addons: Skype, Babylon, Adblock Plus, RealPlayer Browser Record Plugin, TestPilot, Download Statusbar, FlashGot
- Profile 3:
- Media Plugins: Adobe Acrobat Reader, Java Virtual Machine, Microsoft Silverlight, Adobe Flash Player
- AV Plugins: Norton, Kaspersky
- Toolbars: Norton Toolbar, Yandex Toolbar
- Addons: Norton IPS, HP Smart Web Printing, Video Download Helper, Kaspersky URL Advisor, IDM CC, DataMngr, virtualKeyboard@kaspersky.ru, Anti-Banner(Kaspersky), Easy YouTube Video Downloader
- Profile 4 (old versions of addons):
- Media Plugins: Adobe Acrobat Reader, Java Virtual Machine, Microsoft Silverlight, Adobe Flash Player, RealPlayer
- AV Plugins: McAfee, AVG
- Toolbars: Searchqu Toolbar
- Addons: McAfee Script Scan (Extension Version: <= 14.4.0), McAfee SiteAdvisor, Adobe Acrobat - Create PDF (1.1), RealPlayer Browser Record Plugin (15.0.4), DealPly (1.0.7), Java Console (6.0.32), AVG Do Not Track (12.0.0.2166)
Questions
- Do you care about live site testing? for instance, Netflix
- Where are "fake" updates staged (necessary for addons - compatible test) (probably releasetest as per bug 748815)
- What are the communication points that need to happen (what needs to be communicated and when between the QA & RelEng)
- What can we do to keep automation running as smoothly as possible with known/unknown failures or automation glitches? I.e., we will still need troubleshooting from QA when automation cuts out due to errors
Blockers
QA
Previous Actions
- Clint to investigate sending cfg files as Pulse messages
- [MISSED] Anthony to consult RelEng about their current status and to get answers to questions
QA sign-off strategy
This is just a proposal and currently open to debate
- daily Mozmill automation results triage for regressions
- daily bug triage: qawanted > incoming > fixed
- stricter component watching/ownership for P1 bugs
- twice per-month Beta Bugdays
- twice per-month Beta Testdays
- twice per-month Feature sign-off
- twice per-month Compatibility sign-off
- utilization of core community members, give them more responsibility (feature ownership, bug triage, event participation, etc)
Dependencies
Need to evaluate these dependencies to determine which are hard blockers
- [ON TRACK] Improved stability and scalability of Mozmill automation
- [DONE] 3x Windows nodes in PHX
- [DONE] 3x Linux nodes in PHX
- [ON TRACK] 3x Mac nodes in PHX
- Pulse integration to notify when build is qualified and when updates/builds are ready to test
- We have this for Aurora & Nightly builds, assuming the same or similar requirement for Rapid Betas
- Buildbot integration
- Delayed due to RelEng/B2G resource priorities
- Not a firm requirement as we can do it partially automated like we do today (Pulse CI + on-demand CFG)
- How often do we expect to have to qualify updates?
- Can RelEng create an "updates available" Pulse signal?
- Mozmill 2.0
- Not a firm requirement as we lose nothing by sticking with Mozmill 1.5
- What does Mozmill 2.0 give us?
- Mozmill results in tbpl
- What does this give us?
- Integration of functional/update automation into RelEng workflow
- Not a firm requirement but we'd like to see us take steps toward this
Outstanding Questions
Questions needing answers so we can further develop a strategy and identify potential dependencies
- Breakdown of when sign off has uncovered new regressions in a beta? (and if it has, was it through automation or manual (human) tests?)
- If we find that sign-off has not found new critical beta regressions, can we start shipping betas on Wednesdays after automated testing runs through (prior to daily betas being implemented)? Bug verification, exploratory testing, and other smoke tests could continue out-of-band.
- Can automating the release test plan be done against nightly/aurora in preparation for moving to faster beta releases?
- Can we automate mobile update testing? If so, we can even make the weekly push to the Play Store automatic.
- What happens to the human mechanism of final sign-off before pushing live updates?
- What's our "update" story for rapid betas?
- How are people keeping updated?
- If it's in the background people still need to restart?
- Does this skew our data?
- How will QA be notified that a new beta build is coming?
- What is the mechanism/process to shut off updates?
- What happens when builds are out on weekends/holidays?
- How do we deal with things like Patch Tuesdays or new third-party releases?
- Can we host cfg files on hg somewhere?
- would give buildbot access to cfg files
- gives QA the ability to alter testruns by pushing changes to the cfg files
- no gating on RelEng/IT to make changes to our testruns
- hg.mo/qa/mozmill-automation is probably the sanest place to host the cfg files
- What is our current Nightly automation story and can it be scaled to Betas?
- What is the strategy for Stability triage and tracking?
Notes from 2013-01-17
- identify the minimum testing to evaluate a beta as "usable"
- once turnaround time is 2-4 hours we can start doing at least a couple Betas a week
- as manual tasks and handoffs are automated we can move more toward daily Betas
- the rest of the qualification (testdays, features, compatibility checks, triage) can happen out of band
- need a checklist to ensure all things are signed off by the end of the cycle
- Q1 goal: Tuesday and Thursday Betas -- what needs to happen from QA and Softvision
Support
Questions
- any pain points in supporting Aurora? (expand those into to-do items for supporting daily betas)
Blockers
RelEng/IT
Questions
- any issues with automated pushsnip to beta channel?
- Needs to be gated on QA. We already automatically push to betatest. Catlee 12:48, 13 April 2012 (PDT)
- can we add inputs from QA to incorporate the sign-off as part of automation?
- any possible issues with load (or mirrors) if we move to more frequent releases of beta on bouncer?
- Using the signing key appears to be a manual process currently, can this be automated?
- Already is! (except for android) - Catlee 12:39, 13 April 2012 (PDT)
- note: upcoming osx10.8 signing is explicitly excluded
- Can the work associated with daily betas be done in such a way that we could move back to weekly betas in the case of unforeseen problems?
- Do we need to tag/package source for daily betas?
Blockers
- Tracking bug for RelEng work is bug 747168. Most notable items on it are:
- Automate android signing bug 705807
- Move master-side logic to client-side scripts: bug 748773, bug 748794, bug 748796, bug 748798, bug 748800
- Make mail notification work without a reconfig bug 749587
- Automate snippet optimization bug 748801
- Switch to using a different L10n url to get changesets for desktop: bug 751703, and keep the changeset info from each build in the manifest (to be created by releng as a replacement for tagging)
- unrelated but blocker: supporting new OS/platforms.
Crash-Kill Team & Socorro
Questions
- What are the implications of switching to daily betas? How would crash analysis be different than m-c or m-a?
- Do we need to implement windowing over multiple build IDs to emulate having a large beta population on a single beta?
- What is a unit of 'enough' user testing data (days?) for getting feedback on a regression fix?
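The windowing question above can be made concrete with a small sketch. It assumes crash reports are available as (build ID, signature) pairs and that build IDs sort chronologically as strings; the function name and the sample data are hypothetical and do not reflect Socorro's actual schema.

```python
from collections import Counter


def windowed_signature_counts(crashes, build_ids, window):
    """Count crash signatures across the `window` most recent build IDs.

    crashes   -- iterable of (build_id, signature) pairs
    build_ids -- all known build IDs; assumed to sort chronologically
    window    -- how many of the newest builds to pool together
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = set(sorted(build_ids)[-window:])
    return Counter(sig for build_id, sig in crashes if build_id in recent)


# Made-up sample data for illustration only.
builds = ["20120716030000", "20120717030000", "20120718030000", "20120719030000"]
reports = [
    ("20120716030000", "sigC"),
    ("20120717030000", "sigA"),
    ("20120718030000", "sigA"),
    ("20120719030000", "sigB"),
]
print(windowed_signature_counts(reports, builds, window=3).most_common())
```

Pooling several consecutive daily builds in this way approximates the report volume that a single weekly beta would have gathered, at the cost of mixing builds that may differ by a day or two of fixes.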
Blockers
- Scoping happening here
- Tracking bug 752956
- Requirements
- Implementation Plan
L10N
Answers/Resolutions
- Keep sign-offs on release and beta channel, continue without signoffs on aurora and central
- Drop the use of milestones at least for daily betas
- Get l10n-changesets from urls that give the current signed-off state for a product/version
- These URLs need to be updated on migration day every six weeks
- These URLs already exist, using the av= keyword argument with the right code instead of a ms= keyword
Questions
- What do we do with l10n milestones?
- Could we work towards automatically pulling from a beta branch? This doesn't really scale well.
- Could l10n_changesets be dynamically created from the csets on beta branch for locales in shipped_locales?
- Can l10n sign offs be automated? Test suites that check for anything other than string changes?
New questions:
- What happens with non-daily betas, notably TB/SeaMonkey? (No change, continue as usual)
- Do we need a "sealed" status for release, or is that good to go without milestones, too? (That is up to the needs of the l10n drivers, do you want a tagged repo every 6 weeks when a release goes out?)
Blockers
- None, but a releng bug on using the new url: bug 751703
Release Management
Update - RelEng (and lsblakk) met and discussed the complete process of releasing betas and have scoped the work involved in meeting this project's goals here
Questions
- What should we call beta nightlies (Rapid Betas)? A: Daily Betas (and we are not 'announcing' this like it's something new we're doing)
- How do we communicate beta nightlies?
- The beta release process is a useful tool for verifying the final release process. If we change the final release process, how will we verify the changes or bring new people up to speed without affecting the timing of the final releases?
- Do we need to build in a slightly larger window for the final release to allow for these types of issues?
Blockers
- Need to change the process of approving for Beta to first go through Aurora, for sake of risk mitigation
- We need to watch for more stretch across multiple versions of beta than we currently have across weekly betas
- Feature request: getting BuildIDs in Input to get more data on where feedback is coming from
- What's new page - either always on or always off but we'll need a better plan
Metrics
We ran a sample analysis of the beta crash report data for the last 10 days of April. Nine of the top 10 crashers in the first hour remained in the top 10 in the last hour of the 10-day period. We probably will not be able to collect enough reports to catch rare crashes on day 1, so these will persist in subsequent builds and be picked up later.
In summary:
- Using Daily Builds we will be able to push bug fixes faster.
- Rare signatures will be picked up later (once enough reports come in), and in this regard Daily Builds are not much different from weekly builds.
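The comparison described above (how many top crashers from the first hour are still top crashers in the last hour) can be sketched as follows. This is an illustration only, assuming each period is available as a flat list of crash signatures; the function names and sample data are hypothetical and are not the queries used for the original analysis.

```python
from collections import Counter


def top_signatures(signatures, n=10):
    """Return the n most frequent crash signatures in a list of reports."""
    return [sig for sig, _count in Counter(signatures).most_common(n)]


def top_n_overlap(first_period, last_period, n=10):
    """Count signatures that appear in the top n of both periods."""
    return len(set(top_signatures(first_period, n)) & set(top_signatures(last_period, n)))


# Made-up sample data for illustration only.
first_hour = ["a"] * 3 + ["b"] * 2 + ["c"]
last_hour = ["a"] * 2 + ["b"] * 3 + ["d"]
print(top_n_overlap(first_hour, last_hour, n=2))
```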
Product Marketing & PR
Update - Grace & Laura discussed; EJ added input; Laura discussed in channel meeting, wrote down answers.
Questions & Responses
- This will have an impact on how we could deploy a What's New Page, which we've been planning to do for some time. We will likely still be able to, but we'll have to coordinate much more closely on when and how those pages are turned on and off. What would be the details of getting a page turned on or off?
- No clear answer on this. This is not a blocker, but something to keep in mind for the future to find a solution.
- How will this affect release notes? Will we update them every day? Once a week?
- Will keep doing release notes the way we do Aurora--stand them up with initial release and make minor changes as needed; don't think we will see major changes between first release and end of the cycle.
- How will this affect how we track comments/feedback on Input?
- Yes, it will. Will need a way to correlate comments to buildIDs.
- Will this affect how we track ADUs/ADIs with metrics tools?
- Yes, Alex is filing a bug.
- How will users be alerted to the updates? Will it cause update fatigue?
- No, because this will only be implemented after silent updates.
- Is this the new way of doing things or a short term thing we are testing?
- It's the plan for the foreseeable future, but may be changed depending on resources, needs, etc.
- Does this mean new features can land in beta?
- It makes it less risky to land features, but any feature additions will be discussed in depth before landing; don't think this will cause an increase in the landing of new features.
- Do we have to update our communications around what to expect for each channel?
- I don't think it does, apart from a general blog post announcing the change.
- Will this change the release timeline?
- No, this shouldn't affect the release timeline at all.
Blockers & Answers
- Blog Post communicating change on FoF; must be reviewed by PMMs and PR
- This will be done in conjunction with Alex Keybl, Mark LaVine/EJ and LMesa; timing tbd.
- Figuring out how to make release notes work for all audiences/communications
- Don't think this will be a huge concern; see above.
- This will impact if/when/how often we want to communicate about betas
- This is unlikely as the changes that will be done between betas will probably stay with bug fixes, security fixes or feature back outs.
- Updating all channel comms
- Unlikely outside of blog post mentioned above. Ej/LM/ML to discuss.
Project Status
Release Management
Bug # or Issue | Status | ETA | Contact |
---|---|---|---|
Need to change the process of approving for Beta to first go through Aurora, for sake of risk mitigation | on track | | |
We need to watch for more stretch across multiple versions of beta than we currently have across weekly betas | on track | | |
Feature request: getting BuildIDs in Input to get more data on where feedback is coming from | bug 761781 filed | | |
What's new page - either always on or always off but we'll need a better plan | on track | | |
Compiling a list of addons & plugins to create automated tests for, to bump up the reliability of automated QA signoffs | on track | | Lukas |
Tracked Bugs
Meta: bug 755978
Open
2 Total; 2 Open (100%); 0 Resolved (0%); 0 Verified (0%);
Resolved
ID | Summary | Priority | Status |
---|---|---|---|
598757 | Running out of space for symbols | P3 | RESOLVED |
714806 | Pulse message for nightly builds do not contain previous_buildid | P3 | RESOLVED |
727184 | Increase granularity of telemetry uptime measurement | -- | RESOLVED |
733591 | Send uptime with telemetry data | -- | RESOLVED |
747168 | [tracking bug] release automation support for daily betas | P3 | RESOLVED |
751703 | release_sanity.py needs to check locales/locale repos from a different url | P3 | RESOLVED |
752956 | [tracker] Add Rapid Beta support to Socorro | -- | RESOLVED |
753103 | Only bump the update xml's appVersion attribute once a week | P3 | RESOLVED |
754050 | need to implement mechanism to filter input for rapid betas | -- | RESOLVED |
756302 | Automate all applicable smoke tests to be able to scale to the daily betas process | -- | RESOLVED |
758101 | Re-enable background updates on mozilla-central | -- | RESOLVED |
761781 | Feature Request: Input.mozilla should get BuildIDs from users | P3 | RESOLVED |
771218 | Ongoing problems with Socorro staging | P1 | RESOLVED |
13 Total; 0 Open (0%); 13 Resolved (100%); 0 Verified (0%);
In-Product
Bug # or Issue | Status | ETA | Contact |
Background updates | landed in FF15, disabled in Windows for now | FF15 | rstrong |
Tracked bugs
ID | Summary | Priority | Status |
---|---|---|---|
307181 | Eliminate wait while restarting Firefox after update (apply update in background) | -- | RESOLVED |
727184 | Increase granularity of telemetry uptime measurement | -- | RESOLVED |
733591 | Send uptime with telemetry data | -- | RESOLVED |
754470 | Once we have rapid betas, we should change the beta channel to prompt users about a downloaded update once a week | -- | NEW |
4 Total; 1 Open (25%); 3 Resolved (75%); 0 Verified (0%);
Automation
Bug # or Issue | Status | ETA | Contact |
---|---|---|---|
New test suites needing automation to bump up the reliability of auto-signoffs | waiting on Lukas for addons & plugins groupings for automated update testing & for QA to determine what else needs automation | | ctalbert |
Messaging/Token passing with RelEng for complete automation e2e | needs implementation decision by Automation/RelEng | | ctalbert |
QA
Bug # or Issue | Status | ETA | Contact |
---|---|---|---|
Increase automation coverage and integrate into build/push flow | waiting for confirmed plan of action, ETA | | ashughes |
Descoping of unnecessary "busy" work, scoping coverage increases, and risk analysis | waiting for confirmed plan of action, ETA | TBD | ashughes |
Tracked Bugs
ID | Summary | Priority | Status |
---|---|---|---|
756302 | Automate all applicable smoke tests to be able to scale to the daily betas process | -- | RESOLVED |
1 Total; 0 Open (0%); 1 Resolved (100%); 0 Verified (0%);
RelEng/IT
Bug # or Issue | Status | ETA | Contact |
---|---|---|---|
Implementing daily check for beta repo builds (build when there are changes) | blocked on dealing with new OS/platforms | | joduinn |
Messaging/Token passing with RelEng for complete automation e2e | needs implementation decision by Automation/RelEng | | |
Tracked bugs
Meta: bug 747168
Open
ID | Summary | Priority | Status |
---|---|---|---|
764042 | Accept token from RelEng automation and send token back to trigger continued automation | -- | NEW |
1 Total; 1 Open (100%); 0 Resolved (0%); 0 Verified (0%);
Resolved
22 Total; 0 Open (0%); 22 Resolved (100%); 0 Verified (0%);
Crash-Kill Team & Socorro
Bug # or Issue | Status | ETA | Contact |
Implementation of Socorro handling for daily betas | project is blocked on bug 771218 | 7/17 | Smooney/Laura |
Tracked bugs
Meta: bug 752956
Open
No results.
0 Total; 0 Open (0%); 0 Resolved (0%); 0 Verified (0%);
Resolved
ID | Summary | Priority | Status |
---|---|---|---|
755293 | [TRACKER] UI changes to support rapid betas | P1 | RESOLVED |
755297 | Database changes to support rapid betas | P1 | RESOLVED |
755304 | Audit FTP scraper cron job for rapid beta support | -- | RESOLVED |
773835 | New cron job list for Mobeta | -- | RESOLVED |
778255 | remove code/tests/cronjobs which use obsolete pre-mobeta tables | -- | RESOLVED |
5 Total; 0 Open (0%); 5 Resolved (100%); 0 Verified (0%);
Product Marketing & PR
Bug # or Issue | Status | ETA | Contact |
---|---|---|---|
Updating all channel comms (blog posts, etc) | on track | | lforrest |
Updating release notes to work for all audiences/communications | on track | | lforrest |
Naming of this change (DONE: 'daily betas', this is not-news, we aren't doing anything marketing-wise for this) | on track | | lforrest |
L10N
Bug # or Issue | Status | ETA | Contact |
---|---|---|---|
No current issues (RelEng has the bug on automating the handling of l10n changesets) | on track | | Axel |
Metrics
Bug # or Issue | Status | ETA | Contact |
---|---|---|---|
No current issues | on track | | |
Support
Bug # or Issue | Status | ETA | Contact |
---|---|---|---|
No current issues | on track | | Cheng |
Legend
Healthy: work is progressing as expected.
Blocked: work is currently blocked.
At Risk: work is at risk of missing the targeted completion date.
ETA: estimated date for completion of the current task. Overall ETA for this work is 7/17.