Build:Outage Policy

From MozillaWiki

Build Infrastructure Outage Policy

Scheduled Outages

  1. All scheduled build infrastructure outages will be announced via IT's blog and DevNews. The outage notice will state which services are going down, how long the window is, and who the point of contact for the outage window is.
  2. If Tier 1 Tinderboxen are to be taken down or affected during the outage, a build engineer will be responsible for closing the tree and remaining available until the tree reopens.
  3. During an outage, the Point of Contact (PoC) for the outage must be available to coordinate as necessary with IT, Web, or other teams to ensure that the services are restored to a production level (either available or rolled back as necessary) by the end of the outage window.
  4. The outage should be scheduled for a time when the PoC can remain available until the services are back up at the end of the window. Thus, outage windows should not require PoCs to be up at odd hours for their timezones.
  5. Maintenance outages will not be scheduled for Fridays.
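
The mechanical parts of the rules above (the notice contents in rule 1 and the no-Fridays restriction in rule 5) could be checked before an announcement goes out. The following is only a minimal sketch under stated assumptions, not an existing Mozilla tool: the OutageNotice type, its field names, and the policy_violations function are hypothetical, and the window is assumed to be expressed in the PoC's local time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

FRIDAY = 4  # datetime.weekday() counts Monday as 0, so Friday is 4


@dataclass
class OutageNotice:
    """Hypothetical record of a scheduled outage announcement."""
    services: list[str]      # services going down (rule 1)
    start: datetime          # window start, assumed PoC-local time
    duration: timedelta      # how long the window is (rule 1)
    point_of_contact: str    # PoC for the window (rule 1)


def policy_violations(notice: OutageNotice) -> list[str]:
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    if not notice.services:
        problems.append("notice does not list the affected services")
    if notice.duration <= timedelta(0):
        problems.append("notice does not give a window length")
    if not notice.point_of_contact.strip():
        problems.append("notice does not name a point of contact")

    # Rule 5: flag the window if any calendar day it touches is a Friday.
    day = notice.start.date()
    end_day = (notice.start + notice.duration).date()
    while day <= end_day:
        if day.weekday() == FRIDAY:
            problems.append("window falls on a Friday")
            break
        day += timedelta(days=1)
    return problems
```

A window that starts late on a Thursday and runs past midnight is flagged as well, since it extends into Friday. Rules 2 through 4 depend on people being available and are not something a check like this can verify.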

Unscheduled Outages

  1. All unscheduled build infrastructure outages will be tracked via bugs.
  2. If a build engineer is required to repair part of the infrastructure (after IT assists with hardware issues and/or Tier 1 Tinderbox support), the bug will be assigned to that engineer.
  3. If necessary, the build PoC will be responsible for either closing the tree or coordinating with sheriffs to let them know when it's OK to open the tree.
  4. Bug 383742 has examples of the kind of periodic updates and ETAs that commenters have found helpful.