Releases/Thunderbird 3.0a3/Post Mortem
What went well
- The release in general
- Lots of Languages
- Copying downloads table from Firefox was easy and has given us an easy way to generate builds.
- Doing a dry-run brought out some problems in the build (even though we ended up not needing them for this release).
- More detailed release tracking wiki allowed better prediction of date when builds might be ready for QA, and better understanding for the entire team of where we were in the process.
- More detailed QA instructions caused more people to file bugs and not just leave comments in litmus = less follow up work.
What could have gone better, and lessons learned
Leading up to the release
- We didn't do a good job of explaining why we did alpha 3 rather than beta 1.
- We failed to recognise early enough that we weren't going to make beta.
- Breaking the above points down:
- Didn't have clear enough goals in the first place, could have done the work earlier.
- Didn't do a good job of estimating work (e.g. autoconfig, gloda, better-faster imap).
- Easy to underestimate how long it takes to get things through review.
- Can things be landed (but turned off) so that people can test, whilst reviews are still in progress?
- See Actions.
- We were lucky that Firefox went into a string freeze when it did - otherwise we'd have had some extra strings for localisers to deal with. We need to consider this when planning future releases. We could potentially have a shorter string freeze before the code freeze, but extend the time after the code freeze. However ideally, Core would need to be frozen in some way.
- Suggestion next time is to use opt-in process where locales say which hg revision they wish us to use.
- We didn't think about the string-freeze early enough.
- Need to add to schedule a check for what UI bugs need to be done before the release.
- String freeze was very early considering release.
- Crash landings hurt
- Two announcements prior to string freeze - 2 & 1 weeks, to make sure people know.
- More education as to what the string freeze means.
- See actions.
- We still seem to be taking too long to actually cut the release after the freeze. Some of this has been compounded by the FF freeze/regressions, but we also took a long time fixing our own bugs.
- Cloning turned out to be a pretty good idea, and in the future, we could clone immediately after the code freeze, while things are baking.
- Or do named branches.
- and we could fix mozilla-central to try and ensure complete string freeze (note, this could cause issues for trunk development).
- Determining what to tag was a little difficult. We were lucky that FF had a simultaneous code freeze, otherwise picking a mozilla-central revision would have felt risky. (Actually, we did end up picking an arbitrary rev in Build2, and it turned out to be right in the middle of an apply-backout regression.)
- On code freeze day we should expect to branch m-c and maybe c-c.
- Take any fixes we need on the branches.
- See what happens for localisers.
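The opt-in suggestion above (locales tell us which hg revision to use) could be tracked with something as simple as the following. This is a minimal sketch only: the table, the locale list, the revision hashes and the function names are all hypothetical, not part of any existing release tooling.

```python
# Hypothetical opt-in table: locale code -> hg revision the localisers
# asked us to use. None means the locale has not opted in yet.
OPT_INS = {
    "de": "3f2a9c1d0b7e",  # made-up revision, for illustration only
    "fr": "a81b44e07c55",  # made-up revision, for illustration only
    "ja": None,
}


def locales_to_build(opt_ins):
    """Return (locale, revision) pairs for opted-in locales, sorted by locale."""
    return sorted((loc, rev) for loc, rev in opt_ins.items() if rev)


def missing_opt_ins(opt_ins):
    """Return the locales that still need to name a revision."""
    return sorted(loc for loc, rev in opt_ins.items() if not rev)


if __name__ == "__main__":
    for loc, rev in locales_to_build(OPT_INS):
        print(f"{loc}: build from revision {rev}")
    print("Still waiting on:", ", ".join(missing_opt_ins(OPT_INS)))
```

Keeping the locales that have not opted in as explicit entries makes it easy to chase them before the build starts, rather than silently building from tip.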
There are approx 18 open bugs for relnote in Mailnews Core and Thunderbird products. There are approx 67 in Core and Toolkit, but only two mentioned on Firefox 3.1 Alpha 2 release notes. We should encourage these to be tidied up on a regular basis.
The build generally went well, without the intervention of MoCo's BuildEng team. However, a few things could have gone better.
- Picking a mozilla-central revision was too scary for me. We were lucky this time as there was a FF code-freeze.
- Tagging was a little tricky. With the added complication that we needed to apply patches and then tag, I missed one patch and it forced a rebuild. In the future, it would be important to figure out a better way to sign off on the tagged bits (more eyeballs). Maybe, after tagging, generate the source tarball and get that approved/looked at?
- Cloning worked out okay, but will make it tricky for anybody who wants to rebuild that release (they need to know where to pull from, etc). At least one person already got caught by this. We need a way for these clones to be more easily discovered. Wiki?
- Having to hg clone on each build machine, for each build, was painfully slow. In the future, building from the source tarball is probably a better idea, plus it would be a safer thing to do, IMO.
- Keeping the wiki & the tracking bug in sync while going through the build steps had many problems. Duplication of content in many places, way too much cut-n-paste for my tastes, and much too error-prone. I'd like to find a better way to keep track of build processes as they happen.
- Suggestion is to put a link from the bug to the wiki page, and put all the content on the wiki page.
- This time around, signing was the only potentially blocking step that required MoCo (apart from bouncer). It was just a matter of getting some spare cycles from RelEng, and once started it got done quickly and without problems (although it still takes hours to sign all this stuff).
- l10n repacks can't be done from a universal build on Mac and required a separate mini re-build. On Win32 and Linux, I was able to generate the l10n installers/langpacks straight from the built area. This doesn't work on OSX, so I had to effectively rebuild a ppc-only version just to repack l10n. This caused me some serious confusion, and also delayed getting the l10n installers out on that platform.
- Hopefully FF will fix some of these problems.
- Disk space. The build/release machines don't have enough space to hold dailies/nightlies/release builds most of the time, so managing disk space was an annoyance. In the future, having more disk space or dedicated release boxes (I prefer the first option) will greatly help this along.
- Mac Symbols. Works on PPC not on intel. Intel builds not working on server side. Need to try and fix it.
- QA completed 2-3 days faster than 3.0a2
- localizer involvement greater, hope for more next time
- more bugs filed against issues found - people had better documentation, and were more familiar with litmus
- getting volunteers, communications - thunderbird-testers mailing list helped
Still hard. Improvements:
- publicity/volunteers - getting testers geared up and getting faster results was still hard, even with 48 hr notice and a weekend to test (we had Sunday). We gave 36 hr notice to the thunderbird-testers mailing list (with about 35 people on it at the time) and 3 days notice to 22 people with 3.0a2 test results recorded in litmus. Potential ideas to pump it up for next time:
- more notices - like a count down est.14, est.7, 4, 3, 2, 1 days
- different announcement writers (who wants to hear from the same person on every announcement)
- incentive - like the first 1-2 people who finish l10n, BFT and smoke test get a prize
- daily updates/motivation during the x-day test period - do we have an official TB cheerleader?
- estimate we garner 1 tester per 10-20 volunteers
- QA Tracking bug for bugs found during litmus and baking - will make it easier to follow
- Time needed to review tester results and sign off:
- track and record litmus results on a daily basis
- must remember to monitor crash-stats - separate person?
- Starting QA (parts?) earlier?
- 10day baking is long?
- overlap with baking/l10n builds/signing?
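The countdown idea above (notices at est. 14, est. 7, 4, 3, 2 and 1 days) and the 1-tester-per-10-20-volunteers estimate are simple enough to compute up front when the schedule is drawn up. A minimal sketch follows; the function names, the example date and the volunteer count are illustrative assumptions, not real schedule data.

```python
from datetime import date, timedelta

# Days before the start of testing on which to send a notice
# (taken from the countdown suggestion in the notes).
NOTICE_OFFSETS = (14, 7, 4, 3, 2, 1)


def notice_dates(test_start, offsets=NOTICE_OFFSETS):
    """Return the dates on which to send notices, earliest first."""
    return [test_start - timedelta(days=d) for d in sorted(offsets, reverse=True)]


def expected_testers(volunteers, low_ratio=10, high_ratio=20):
    """Return a (low, high) estimate at 1 tester per 10-20 volunteers."""
    return volunteers // high_ratio, volunteers // low_ratio


if __name__ == "__main__":
    start = date(2008, 10, 15)  # hypothetical first day of testing
    for day in notice_dates(start):
        print(f"{day.isoformat()}: send notice ({(start - day).days} days to go)")
    low, high = expected_testers(60)  # 60 volunteers is an illustrative figure
    print(f"Expect roughly {low}-{high} active testers")
```

Working backwards from the estimate also gives a rough idea of how many volunteers need to be on the mailing list to get a given number of completed test runs.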
- Litmus is slow and bad, and some of our tests are out of date.
- Need improvements to Litmus for distributed litmus.
- Unlikely to see much happen (webdav people @ moco in charge of change).
- As they move to gristmill, less likely to see improvements.
- Two guide pages now written:
- Need to automate generating table for thunderbirdBetaDetails.class.php as part of the build process.
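As a rough idea of what that automation could look like, the sketch below collects installer sizes per locale from a build directory and renders them as a PHP array. The real layout of thunderbirdBetaDetails.class.php is not described in these notes, so the array shape, the `filesize` key, the variable name and the filename template are all assumptions made for illustration.

```python
import os


def installer_sizes(build_dir, locales, filename_template):
    """Map each locale to its installer size in MB, skipping missing files."""
    sizes = {}
    for locale in locales:
        path = os.path.join(build_dir, filename_template.format(locale=locale))
        if os.path.isfile(path):
            sizes[locale] = round(os.path.getsize(path) / (1024 * 1024), 1)
    return sizes


def render_php_array(var_name, sizes):
    """Render the sizes as a PHP array literal (hypothetical shape)."""
    lines = [f"${var_name} = array("]
    for locale in sorted(sizes):
        lines.append(f"    '{locale}' => array('filesize' => {sizes[locale]}),")
    lines.append(");")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical build directory and installer naming scheme.
    sizes = installer_sizes("dist", ["de", "en-US", "fr"], "thunderbird.{locale}.exe")
    print(render_php_array("builds", sizes))
```

Generating the table from the files actually produced by the build means the locale list and file sizes cannot drift out of step with what was shipped.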
Actions
- AI:davida: Major items for beta 1 to be nailed down asap.
- AI:bienvenu: Pick an intermediate milestone for landing most of the big items.
- AI:sipaq: Document details on what should happen before, during and after the string freeze, and the effects on developers at the relevant times.
- AI:gozer: Ensure there is a bug for server-side Mac Symbols not working for intel builds.
- AI:wsmwk: Put up tracking bug for issues from releases during litmus tests.
- AI:wsmwk: Need to define what tests (e.g. smoketests/basic functional/unit tests etc) should go where and where they are filed.
- AI:standard8: Generate some gristmill example scripts, and encourage others to use gristmill and file bugs to help improve it.
- AI:standard8: File bugs for:
- hook php generation into builds (file sizes/locales list)
- Check product-details interaction with mozillamessaging site.
- AI:standard8: Send mail to release-drivers, do we need some overall cleanup of relnote bugs?
- AI:tb-drivers: Go through current MailNews/TB relnote bugs and work out what is really necessary.
Attendees: Standard8, dmose, davida, sipaq, gozer, wsmwk