Firefox/Channels/Postmortem/57

Post-mortem for the Firefox and Fennec 57 release. Release Coordination Vidyo room, Tuesday, Dec 5th, 2017

Attendees: Sheila, Erin, Kevin, Marcia, Julien, Callek, RyanVM, TomGrab, Nicole, Ritu

Summary

List issues (good and bad): What went well? What didn't? What could we do better?

  • (syl) First run page super slow on old systems (3 fps) - bug 1417888

    • Wasn't tested on slow systems
    • Not just old systems: the bug report says 100% of the Linux population plus about 20% of the Windows population
    • Erin/Nicole can help follow-up, what kind of QA support do we need here? Can we add testing on slow systems to the existing test plan?
    • This was tested by SV as part of the onboarding flow: https://public.etherpad-mozilla.org/p/Onboarding_Experience
    • Additional testing would need to be added.
    • We might incorporate an update plan into the 58/59 time frame
  • (Ritu) Too many uplifts during Beta57 cycle - 576
    • 56 had 365 uplifts
    • 55 had 320
    • Great release quality despite the huge code churn
    • Was justified due to the scope of Quantum release!
    • Good to see more self-awareness
  • (Ritu) Nightly and Beta milestones helped teams plan better
    • Soft code freeze milestone was appreciated by eng teams
    • Add a WNP/first run related milestone
  • (Ritu) Each RC uplift was reviewed for second opinion by eng managers/component owners
    • May not be a scalable process
  • (Ritu) More aggressive tracking via blocking flag
    • Good support from all on reviewing, investigating, fixing blockers
  • (Ritu) Amazing efforts to keep untriaged bug backlog within acceptable limits
    • may not be a scalable process beyond 57
    • Emma's dashboard will help us monitor this going forward
  • (Ritu) QA team's feature doc which contained list of features, status, blockers was very useful
  • (Ritu) Fennec triage could be improved
    • It's hard to make quick progress on blocking bugs
    • Difficult to help get second opinion on RC uplifts
    • Hard to get developer time to fix issues on the frontend
    • Romania, Taipei timezone cycle can be a bit slow, needing nudging from PST owners
      • need a resource that is a decision maker for bugs filed (crashes, blocking, etc.)
      • Fennec team could benefit from aligning better with Firefox processes
      • Fennec triage/component ownership needs to be shared with Firefox team
  • Awareness of who knows what? Resource list.
    • For now, ping snorp for platform, nevin for front end. Don't know? Defer to snorp
  • (Ritu) Awesome effort in planning and getting WNP ready during RC week
    • In the past this was a huge challenge
    • Hoping this new process becomes repeatable for 58/59
  • (Ritu) Releng team's infra code freeze helped keep things smooth and predictable
    • Callek offered to ping catlee/jlund on whether this needs to happen going forward
    • This code freeze (CF) should have been better communicated to the release-drivers mailing list
  • (marcia) Engineering lead for the release worked well. Jim Mathies was very responsive and worked to keep the Wednesday triage under control. We should keep a lead in place for each future release.
    • Do we need a Fennec release eng lead as well?
  • (kbrosnan) Laser focus on 57 caused a defocus on 58/59. Maybe there should have been dedicated people focused on those releases (outside release management)
    • (marcia) To add to that, we need to be careful we don't carry nightly regressions into beta. In Fennec we had https://bugzilla.mozilla.org/show_bug.cgi?id=1413500 carried from Nightly into beta
    • (Ryan) Part of that issue was late landings due to not communicating out the milestones widely enough
    • More EPM support on future release planning and tracking
    • Cross functional milestones needed from relman team for future releases and keep the focus going
  • (sheila) - Dot release process - decision on dot release and timing really needs to be a joint decision from Product and Engineering. Fine for Release Management to make a recommendation. Document summarizing issues and status was great.
    • Relman team should continue reviewing dot release plan, fixes, user impact with senior management
  • (sheila) - More formal review process post release. Communication of non-decisions.
    • Relman team can build more visibility on issues that were reviewed before we decide on unthrottling
  • (Ryan) Early rollout to DevEdition went well++
    • This was done to speed up the beta staged rollout
    • This helps weed out issues sooner and ship desktop beta builds faster
  • (kbrosnan) A lot of confusion about how Google Play does throttled releases
    • New installs are not guaranteed to get the most recent version. They have the same % chance of getting the new version as previous installs
    • [julien] Should we push the release candidate out earlier at a low rollout percentage?++
    • Fennec staged rollout at 10% on launch day was deemed too slow
    • Product recommended going at 25%. This helps us mimic the Desktop release throttling
    • Before the dot release is pushed out, the previous release should be at 100% staged rollout.
  • [elan] Shield
    • We can cover this in my 57 retro, but ideas are welcome here:
    • Need a single dashboard!
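
The Google Play throttling point raised by kbrosnan above (new installs get the new version with the same probability as existing installs) can be illustrated with a small model. This is only a sketch under assumptions: the hash-based bucketing and the names rollout_bucket, gets_new_version and share_on_new_version are hypothetical, chosen for illustration, and do not describe how Google Play actually assigns users to a staged rollout. The only property modelled is the one from the notes: eligibility depends on the rollout percentage, not on whether the install is new.

```python
import hashlib


def rollout_bucket(user_id: str) -> int:
    """Map a user id to a stable bucket in [0, 100).

    Hypothetical scheme for illustration only; the real Google Play
    assignment mechanism is not described in these notes.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def gets_new_version(user_id: str, rollout_percent: int) -> bool:
    """Return True if this user falls inside the staged rollout.

    Note that install age is deliberately not an input: a brand-new
    install is treated exactly like an existing one.
    """
    return rollout_bucket(user_id) < rollout_percent


def share_on_new_version(user_ids, rollout_percent: int) -> float:
    """Fraction of the given users who would receive the new version."""
    users = list(user_ids)
    if not users:
        return 0.0
    eligible = sum(gets_new_version(u, rollout_percent) for u in users)
    return eligible / len(users)


if __name__ == "__main__":
    existing = [f"existing-{i}" for i in range(10_000)]
    new = [f"new-{i}" for i in range(10_000)]
    # Compare the 10% launch-day rollout with the 25% Product recommended.
    for percent in (10, 25, 100):
        print(
            percent,
            round(share_on_new_version(existing, percent), 3),
            round(share_on_new_version(new, percent), 3),
        )
```

In this model both populations land close to the configured percentage, which is why a 10% rollout on launch day also leaves most new installs on the previous release until the rollout is raised.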