Releases:Release Post Mortem:2016-02-24


Meeting Details



Release Duty

  • FF 45 cycle: mtabara

Misc

  • ESR45 branch set-up in progress

Shipped

Firefox 45.0b9 (bhearsum/nick/mtabara)

Issues, Build 1:

  • A couple of auto-retries in Fennec repacks: one for a terminated instance and one for running out of space.
  • Both Firefox Linux builds failed in symbol upload due to https://bugzilla.mozilla.org/show_bug.cgi?id=1250374
    • They had to be rebuilt manually.
    • They were killed afterwards, because we wanted a build2.

Issues, Build 2:

  • Fennec push to mirrors failed because build1 had already run it.
    • Removed build1 from the CDN and replaced it with build2. This is less than ideal, because some things will have cached build1, but the CDN isn't our primary means of distribution, so it shouldn't hurt too much.
  • The usual update verify fails related to Linux GTK.
  • AV builder failed with:
00:34:41    FATAL - IncompleteRead: IncompleteRead(0 bytes read, 40957406 more expected)
    • A retrigger worked. We should probably make this job more resilient (or maybe BeetMover already is?)
  • A replication issue between the Balrog rw and ro databases delayed shipping. More details in bug 1250940.
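
One way the AV job could be made more resilient to the IncompleteRead failure above is to retry the download a few times before giving up. The following is a minimal Python sketch, not the actual AV builder code: `fetch_with_retries` and its `fetch` callable are hypothetical names, and the retry count and backoff are arbitrary choices.

```python
import http.client
import time


def fetch_with_retries(fetch, attempts=3, delay=0.0):
    """Call fetch() until it succeeds, retrying on IncompleteRead.

    fetch is any zero-argument callable that returns the downloaded bytes
    (hypothetical; the real job's download code is not shown in these notes).
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except http.client.IncompleteRead as error:
            last_error = error
            if attempt < attempts:
                # Simple linear backoff before the next attempt.
                time.sleep(delay * attempt)
    # Every attempt failed: surface the last truncated-read error.
    raise last_error
```

Only truncated reads are retried here; any other exception still fails the job immediately, which keeps genuine errors visible.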

Firefox 45.0b8 (bhearsum/nick/mtabara)

  • TODO

Thunderbird 45.0b2 (jlund/rail/callek/nick/mtabara)

Issues:

  • The win32 build step failed for Thunderbird 45.0b2 build1. nthomas: "we needed to recreate AWS instances of windows builders, and this was collateral damage". There was a Windows slave outage in bug 1249499, so this is probably related.
  • Failed at repack_7/10 on win32: release_repacks timed out; retriggered.
  • Failed at repack_7/10 on linux: intermittent Mercurial cloning issue due to some Amazon certs. The automatic retry succeeded ~1h later.
  • Failed at repack_9/10 on win32: two consecutive failures due to slaves dropping the connection. I suspect it's a tail effect of recreating the AWS Windows builder instances.
  • Failed at repack_2/10 on win32: same story as 9/10 on win32.


Firefox 45.0b7 (nick/mtabara/rail/jlund)

  • The win32 and win64 build requests were marked complete with results=2, but have no claiming master and no start or finish time:

mysql> select br.id, br.buildername, from_unixtime(br.complete_at) as complete, br.complete, br.results, substr(br.claimed_by_name,1,20) as claimed_by, from_unixtime(b.start_time) as start, from_unixtime(b.finish_time) as finish from buildrequests as br left join builds as b on b.brid=br.id where br.buildername like 'release-mozilla-beta-%_build' order by br.id desc limit 6;
+----------+-------------------------------------+---------------------+----------+---------+----------------------+---------------------+---------------------+
| id       | buildername                         | complete            | complete | results | claimed_by           | start               | finish              |
+----------+-------------------------------------+---------------------+----------+---------+----------------------+---------------------+---------------------+
| 98369068 | release-mozilla-beta-win64_build    | 2016-02-18 18:39:51 |        1 |       2 | NULL                 | NULL                | NULL                |
| 98369067 | release-mozilla-beta-macosx64_build | 2016-02-18 19:09:57 |        1 |       0 | buildbot-master84.bb | 2016-02-18 17:18:46 | 2016-02-18 19:09:57 |
| 98369066 | release-mozilla-beta-win32_build    | 2016-02-18 18:39:28 |        1 |       2 | NULL                 | NULL                | NULL                |
| 98369065 | release-mozilla-beta-linux64_build  | 2016-02-18 19:33:19 |        1 |       0 | buildbot-master74.bb | 2016-02-18 17:18:49 | 2016-02-18 19:33:19 |
| 98369064 | release-mozilla-beta-linux_build    | 2016-02-18 19:50:35 |        1 |       0 | buildbot-master74.bb | 2016-02-18 17:18:49 | 2016-02-18 19:50:35 |
| 97867785 | release-mozilla-beta-win64_build    | 2016-02-15 18:00:55 |        1 |       0 | buildbot-master70.bb | 2016-02-15 14:10:23 | 2016-02-15 18:00:55 |
+----------+-------------------------------------+---------------------+----------+---------+----------------------+---------------------+---------------------+
  • Fixed with the following SQL to reset the buildrequest state; the builds then started very quickly, with the right buildID and tag properties.
mysql> select id, buildername, complete, results from buildrequests where buildername like 'release-mozilla-beta-win%_build' order by id desc limit 4;
+----------+----------------------------------+----------+---------+
| id       | buildername                      | complete | results |
+----------+----------------------------------+----------+---------+
| 98369068 | release-mozilla-beta-win64_build |        1 |       2 |
| 98369066 | release-mozilla-beta-win32_build |        1 |       2 |
| 97867785 | release-mozilla-beta-win64_build |        1 |       0 |
| 97867783 | release-mozilla-beta-win32_build |        1 |       0 |
+----------+----------------------------------+----------+---------+

mysql> update buildrequests set complete=0, results=NULL where id in (98369068, 98369066) limit 2;
Query OK, 2 rows affected (0.00 sec)
Rows matched: 2  Changed: 2  Warnings: 0
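
The reset above could also be guarded so that it only touches requests that are still marked complete with results=2 (the FAILURE code in Buildbot). The sketch below is an illustration only: it uses an in-memory SQLite database in place of the production MySQL one, a reduced `buildrequests` table with just the columns shown above, and hypothetical helper names (`make_demo_db`, `reset_failed_requests`).

```python
import sqlite3

FAILURE = 2  # Buildbot result code for a failed build


def make_demo_db():
    """Build a tiny stand-in for the buildrequests table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE buildrequests ("
        "id INTEGER PRIMARY KEY, buildername TEXT, "
        "complete INTEGER, results INTEGER)"
    )
    conn.executemany(
        "INSERT INTO buildrequests VALUES (?, ?, ?, ?)",
        [
            (98369068, "release-mozilla-beta-win64_build", 1, 2),
            (98369066, "release-mozilla-beta-win32_build", 1, 2),
            (97867785, "release-mozilla-beta-win64_build", 1, 0),
        ],
    )
    conn.commit()
    return conn


def reset_failed_requests(conn, request_ids):
    """Mark requests as pending again, but only those that are currently
    complete with a FAILURE result. Returns the number of rows reset."""
    ids = list(request_ids)
    if not ids:
        return 0
    placeholders = ", ".join("?" for _ in ids)
    cursor = conn.execute(
        "UPDATE buildrequests SET complete = 0, results = NULL "
        f"WHERE id IN ({placeholders}) AND complete = 1 AND results = ?",
        ids + [FAILURE],
    )
    conn.commit()
    return cursor.rowcount
```

The extra `complete = 1 AND results = ?` condition plays a similar protective role to the `limit 2` in the original statement: a mistyped id that points at a successful request is left alone.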


Fennec 45.0b6 (nick/mtabara/rail)

  • awaiting the Google Play email to run the post-release and move this to the Shipped section
  • "Starting the build 6 without the last Hello changes. They broke m-b"

Roundtable

  • bhearsum: should we really be blocking the shipping of a chemspill release on What's New page configuration? I don't have an opinion on this particular What's New page, but holding back in-the-wild fixes because of a What's New page seems bad.

Context:

20:12:19 <bhearsum> a question for the postmortem, maybe: should we really be blocking shipping a chemspill release on what's new page configuration?
20:12:55 <rail> we should stop showing it 
20:12:59 — ~mtabara agrees
20:13:19 <rail> we are about to ship esr45 :)
20:13:27 <bhearsum> i don't have opinion on this particular what's new page
20:13:55 <bhearsum> but holding back in-the-wild fixes because of a what's new page seems bad
20:14:59 <lizzard> we’r only holding it back for a short time
20:15:03 <lizzard> but good question....
20:15:32 <bhearsum> yeah, and it's only for esr in this case
20:15:41 <bhearsum> i doubt it made a practical difference for that userbase
20:15:51 <lizzard> For esr, i can’t imagine enterprise folks can deploy this so quickly as to mind an hour’s diference
20:15:52 <bhearsum> but if were the firefox release channel it might be a different story
20:17:26 <bhearsum> i guess it's also an important point that screwing up the WNP has more effect on the release channel
  • mtabara: while deploying TB 38.6.0 with nthomas we had to change the Balrog update rates to 50%. While attempting this, we realized they had already been changed; a UI-only issue in Balrog meant the rate changes were not shown in the rule history (bug 1248475). nthomas ran a DB query to find the answers we were looking for:
mysql> select change_id, changed_by, from_unixtime(substr(timestamp, 1, 10)) as timestamp, backgroundRate from rules_history where rule_id=170 order by change_id desc limit 10;
+-----------+---------------------+----------------------------+----------------+
| change_id | changed_by          | timestamp                  | backgroundRate |
+-----------+---------------------+----------------------------+----------------+
|      4423 | tbirdbld            | 2016-02-15 20:41:30.000000 |             50 |
|      3890 | tbirdbld            | 2016-01-07 22:21:11.000000 |             50 |
|      3889 | jlund@mozilla.com   | 2016-01-07 22:20:21.000000 |             50 |
|      3810 | jwood@mozilla.com   | 2015-12-30 16:03:00.000000 |            100 |
|      3740 | raliiev@mozilla.com | 2015-12-23 19:01:11.000000 |             30 |
|      3734 | tbirdbld            | 2015-12-23 15:18:47.000000 |             30 |
|      3733 | raliiev@mozilla.com | 2015-12-23 15:16:31.000000 |             30 |
|      3425 | jlund@mozilla.com   | 2015-12-02 18:54:06.000000 |            100 |
|      3378 | jlund@mozilla.com   | 2015-11-27 17:07:23.000000 |              0 |
|      3334 | tbirdbld            | 2015-11-25 18:58:35.000000 |             30 |
+-----------+---------------------+----------------------------+----------------+
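
To answer this kind of question from a history dump, it can help to reduce the rows to the points where the rate actually changed. This is a small sketch, assuming the rows have already been fetched; `rate_transitions` is a hypothetical helper, and `HISTORY` simply copies the change_id, changed_by and backgroundRate columns from the output above.

```python
# (change_id, changed_by, backgroundRate), copied from the query output above.
HISTORY = [
    (4423, "tbirdbld", 50),
    (3890, "tbirdbld", 50),
    (3889, "jlund@mozilla.com", 50),
    (3810, "jwood@mozilla.com", 100),
    (3740, "raliiev@mozilla.com", 30),
    (3734, "tbirdbld", 30),
    (3733, "raliiev@mozilla.com", 30),
    (3425, "jlund@mozilla.com", 100),
    (3378, "jlund@mozilla.com", 0),
    (3334, "tbirdbld", 30),
]


def rate_transitions(history):
    """Return (change_id, changed_by, old_rate, new_rate) for every entry
    whose rate differs from the entry immediately before it.

    The oldest entry is treated as the baseline, since the rate before it
    is not known from the rows given.
    """
    ordered = sorted(history, key=lambda row: row[0])
    transitions = []
    for previous, current in zip(ordered, ordered[1:]):
        if current[2] != previous[2]:
            transitions.append(
                (current[0], current[1], previous[2], current[2])
            )
    return transitions
```

On the rows above, the last transition returned is change 3889 (100 to 50), which is consistent with the rate having already been at 50% before the TB 38.6.0 deployment.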

Question for bhearsum: any chance I can get *read-only* access to that DB as well, for future scenarios?

Ongoing

Action items