User:Mconnor/Current/Project Sisyphus

From MozillaWiki
THIS PAGE IS A WORKING DRAFT
The page may be difficult to navigate, and some information on its subject might be incomplete and/or evolving rapidly.
If you have any questions or ideas, please add them as a new topic on the discussion page.

Overview

Fixing busted trees and productivity-eating context switches to stare at webpages: that is the stone we've been rolling up the hill for years. We have the ability to fix this, using technology (omg!) and some leveraged approaches. This is not about solving problems for any particular group; it is about removing a major productivity sink across the project.

Image: 3195818623_06225cb663_m.jpg (via fouro on Flickr)

What success looks like

mozilla-central is almost always open for business

  • 90% of pushes to mozilla-central (m-c) are green with no regressions
  • Tree bustage is resolved in less than 30 minutes
  • The tree is open 95% of the time
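The three targets above are measurable, so they could be checked mechanically. The following is a minimal sketch, not an existing tool: the `Closure` record, the function names, and the use of closure length as a stand-in for "time to resolve bustage" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Closure:
    """One tree closure (hypothetical record shape)."""
    start: datetime
    end: datetime


def open_fraction(closures, window_start, window_end):
    """Fraction of the reporting window during which the tree was open.

    Assumes closures do not overlap one another.
    """
    window = (window_end - window_start).total_seconds()
    closed = 0.0
    for c in closures:
        # Clip each closure to the reporting window.
        start = max(c.start, window_start)
        end = min(c.end, window_end)
        if end > start:
            closed += (end - start).total_seconds()
    return 1.0 - closed / window


def green_fraction(push_results):
    """Fraction of green pushes; push_results is a non-empty list of booleans."""
    return sum(push_results) / len(push_results)


def meets_targets(closures, push_results, window_start, window_end):
    """Check the three targets: 90% green, fixes under 30 minutes, 95% open.

    Assumption: closure length approximates bustage resolution time.
    """
    return {
        "green": green_fraction(push_results) >= 0.90,
        "resolved_fast": all(
            c.end - c.start < timedelta(minutes=30) for c in closures
        ),
        "open": open_fraction(closures, window_start, window_end) >= 0.95,
    }
```

Feeding `meets_targets` a day or a week of closure and push data would give a simple pass/fail view of each goal.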

Computers watch the tree, not humans

  • Developers are notified of problems in their push; no polling by the user is needed
    • The current sheriff(s) are also notified, as is dev-tree-management
  • All regression reports are tracked automatically, and someone (the sheriffs) ensures resolution
    • If regressions are not addressed, the patch gets backed out
  • Known random oranges do not turn the tree orange unless they get worse
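One way to read "unless they get worse" is to compare a test's recent failure rate against its historical baseline. The sketch below illustrates that idea only; the function name, the `known_rates` mapping, and the `tolerance` multiplier are hypothetical and do not describe how any existing Mozilla tool decides.

```python
def should_turn_orange(test_name, recent_failures, recent_runs,
                       known_rates, tolerance=2.0):
    """Decide whether a test's failures should turn the tree orange.

    known_rates maps a test name to its historical failure rate (0.0 to 1.0).
    tolerance is a hypothetical multiplier: a known orange only counts as
    "worse" once its recent failure rate exceeds tolerance * baseline.
    """
    if recent_failures == 0:
        return False
    baseline = known_rates.get(test_name)
    if baseline is None:
        # Not a known intermittent failure: any failure is a real orange.
        return True
    recent_rate = recent_failures / recent_runs
    return recent_rate > tolerance * baseline
```

A fixed multiplier is the simplest possible rule; a real implementation would probably want a statistical test that accounts for how few runs are in the recent sample.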

Managing the tree is easy

  • It is easy to close and reopen multiple trees
  • All closures are logged, and data on reason, type, and duration is easily available
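As an illustration of what "easily available" closure data might allow, here is a small sketch that totals closed time per tree and reason. The log entry shape (`tree`, `reason`, `closed`, `reopened` keys) is an assumption for this example, not the format of any existing tool.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def summarize_closures(log):
    """Total closed time per (tree, reason) from a list of closure dicts.

    Each entry is assumed to have "tree", "reason", "closed" and "reopened"
    keys, where the last two are datetime objects.
    """
    totals = defaultdict(timedelta)  # missing keys start at zero duration
    for entry in log:
        key = (entry["tree"], entry["reason"])
        totals[key] += entry["reopened"] - entry["closed"]
    return dict(totals)
```

Keying on the tree name as well as the reason keeps the summary usable when several trees are closed and reopened together.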

Sheriffs exist to help solve problems, not find them

  • Sheriff duty is a full-time job (when you're on duty) so that everyone else can focus on code, not tree state
  • The focus shifts from finding problems to being the point person for dealing with them
  • Sheriff coverage has expanded to 80% of pushes

How we get there

There is no silver bullet, just lots of hard work over a long period of time. But every bit helps.

Keeping mozilla-central green

Implement mozilla-inbound

Optional for now, will revisit later

Resolve backouts quickly

Note: raised on dev.planning, new proposal coming soon

Continue to encourage teams to adopt a project branch model

  • Some movement here.

Switch onchange builds to non-PGO to catch problems dramatically sooner

bug 658313

Automating tree watching

Extend perf regression finder to mail pusher + sheriff

Use Pulse to notify pusher + sheriff on failures

Verify intermittent oranges are intermittent automatically

bug 657738
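A simple way to check automatically whether an orange is intermittent is to re-run the failed test and see if the failure reproduces. The sketch below assumes that approach; it is not taken from the bug, and `run_test` is a hypothetical callable standing in for whatever actually re-triggers the job.

```python
def classify_failure(run_test, retries=3):
    """Re-run a failed test and classify the original failure.

    run_test is a hypothetical zero-argument callable that returns True when
    the test passes. If any re-run passes, the original failure is treated
    as intermittent; if every re-run fails, it is treated as persistent.
    """
    for _ in range(retries):
        if run_test():
            return "intermittent"
    return "persistent"
```

Stopping at the first passing re-run keeps the cost low for the common intermittent case, while a persistent failure uses all `retries` attempts before being reported.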

Tree management

Build a better tool for managing tree status

bug 630534

Build a regression dashboard to ensure that all perf/test regressions are tracked and addressed

Sheriff Evolution

Broader coverage

  • Get sheriff tool online
  • Get multiple shifts per day, to cover across timezones

Changed role

  • The sheriff is now the point person to:
    • merge mozilla-inbound
    • ensure regressions are backed out or bugs are filed, as appropriate
    • address bustage on the main tree