Sheriffing/Test Disabling Policy

From MozillaWiki

Policy for handling intermittent oranges

Identifying problematic tests

This policy defines an escalation path for when a single test case is identified as leaking or failing and is causing enough disruption on the trees. Disruption is defined as any of:

  • Test case is on the list of top 20 intermittent failures on [Orange Factor]
  • It is causing oranges >=8% of the time
  • We have >100 instances of this failure in the bug in the last 30 days

Note: While tests that meet several of these conditions take priority, meeting one is still sufficient to be considered worthy of escalation.
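The three disruption criteria above can be expressed as a simple check. The following is a minimal sketch, not an existing sheriffing tool: the `FailureStats` record and its field names are hypothetical, and the numbers are assumed to come from Orange Factor and the bug's failure comments.

```python
from dataclasses import dataclass
from typing import List, Optional

# Thresholds taken from the policy text above.
TOP_N = 20                  # top 20 intermittent failures
MIN_FAILURE_RATE = 0.08     # oranges >= 8% of the time
MAX_INSTANCES_30_DAYS = 100  # more than 100 instances in 30 days


@dataclass
class FailureStats:
    """Hypothetical summary of one intermittent failure."""
    orange_factor_rank: Optional[int]  # None if the test is not ranked
    failure_rate: float                # fraction of runs that go orange
    instances_last_30_days: int


def disruption_reasons(stats: FailureStats) -> List[str]:
    """Return every disruption criterion that the failure meets."""
    reasons = []
    if stats.orange_factor_rank is not None and stats.orange_factor_rank <= TOP_N:
        reasons.append("top 20 intermittent failure")
    if stats.failure_rate >= MIN_FAILURE_RATE:
        reasons.append("oranges >= 8% of the time")
    if stats.instances_last_30_days > MAX_INSTANCES_30_DAYS:
        reasons.append("> 100 instances in the last 30 days")
    return reasons


def is_disruptive(stats: FailureStats) -> bool:
    """Meeting any single criterion is enough for escalation."""
    return bool(disruption_reasons(stats))
```

The number of reasons returned can serve as a rough priority: a test meeting all three criteria comes before one meeting only one.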


Escalation is a responsibility of all developers, although the majority of it will fall to the sheriffs.

Escalation path:

  • Ensure we have a bug on file that includes the test author, reviewer, module owner, and any other interested parties, along with links to logs, etc.
  • Set a needinfo? request and expect a response within 2 business days; this expectation should be stated clearly in a comment.
  • In the case we don't get a response, request a needinfo? from the module owner, with the expectation of 2 days for a response and getting someone to take action.

  • In the case we go another 2 days with no response from a module owner, we will disable the test.
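The timeline implied by the escalation path can be sketched as a small date calculation. This is an illustration under stated assumptions: both response windows are counted in business days (the policy says "business days" only for the first), the module owner needinfo? is requested on the day the first deadline passes, and holidays are ignored. The function names are hypothetical.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Return the date `days` business days (Mon-Fri) after `start`."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def escalation_deadlines(needinfo_date: date, response_days: int = 2) -> dict:
    """Compute the two response deadlines from the first needinfo? date.

    Assumption: the module owner is needinfo'd on the day the first
    deadline passes without a response.
    """
    first_due = add_business_days(needinfo_date, response_days)
    owner_due = add_business_days(first_due, response_days)
    return {
        "first_response_due": first_due,
        "module_owner_response_due": owner_due,  # disable the test after this
    }


deadlines = escalation_deadlines(date(2014, 1, 2))  # a Thursday
print(deadlines["first_response_due"])         # 2014-01-06
print(deadlines["module_owner_response_due"])  # 2014-01-08
```

Activity in the bug resets the picture: as the policy notes, a test with ongoing work is not expected to be disabled, so these dates only matter while the bug is inactive.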

Ideally we will work with the test author to either get the test fixed or disabled depending on available time or difficulty in fixing the test. If a bug has activity and work is being done to address the issue, it is reasonable to expect the test will not be disabled. Inactivity in the bug is the main cause for escalation.

This is intended to respect the time of the original test authors by not throwing emergencies in their lap, while also striking a balance with keeping the trees manageable.


This is not intended to be perfect, but here are some common exceptions we should keep in mind:

  • If this test has landed (or been modified) in the last 7 days, we will most likely back out the patch with the test.
  • If we can identify a non-test-related change to the product by failure patterns and retriggers, we will push to patch the change or back it out.
  • If a test is failing at least 30% of the time, we will file a bug and disable the test first.
  • When we are bringing a new platform online (Android 2.3, b2g, etc.), many tests will need to be disabled prior to getting the tests on TBPL.
  • In the rare case we are disabling the majority of the tests (either at once or slowly over time) for a given feature, we need to get the module owner to sign off on the current state of the tests.
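The first three exceptions can be read as checks to run before starting the normal escalation path. The sketch below is one possible reading, not a rule from the policy: the order in which the exceptions are checked is an assumption, the function and parameter names are hypothetical, and the policy itself says a backout is only "most likely" for recently landed tests.

```python
from typing import Optional

RECENT_CHANGE_DAYS = 7        # test landed or modified in the last 7 days
DISABLE_FIRST_RATE = 0.30     # failing at least 30% of the time


def first_action(
    failure_rate: float,
    days_since_test_changed: Optional[int] = None,
    product_change_identified: bool = False,
) -> str:
    """Suggest a first step for a disruptive test (check order is assumed)."""
    if days_since_test_changed is not None and days_since_test_changed <= RECENT_CHANGE_DAYS:
        return "back out the patch with the test"
    if product_change_identified:
        return "patch or back out the product change"
    if failure_rate >= DISABLE_FIRST_RATE:
        return "file a bug and disable the test"
    return "follow the escalation path"
```

The new-platform and module-owner sign-off cases are left out because they depend on judgment about a whole suite or feature rather than on a single test's numbers.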