QA/Work Week July 2014/Data and Metrics Discussion

Discussion Notes: Slide Deck https://drive.google.com/file/d/0Byu0uOStN_ZcYVF1cld5MlE5Wkk/edit?usp=sharing

A good metric is comparative. A good metric can be used to compare any set of data points and give you an idea of how things differed. It also presents the data in a way that is easily understood.

For example, "Increased ADIs by 10% from last week" is more meaningful than "We're at 50 million ADIs."

This kind of metric can track your progress towards a goal.
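As a minimal sketch of that idea (Python, with made-up numbers), the comparative version of the metric is just the week-over-week change rather than the raw count:

 # Express ADIs as a week-over-week change rather than a raw count.
 # The figures below are illustrative, not real ADI data.
 def percent_change(previous, current):
     return (current - previous) / previous * 100
 last_week_adis = 45_500_000    # hypothetical
 this_week_adis = 50_050_000    # hypothetical
 change = percent_change(last_week_adis, this_week_adis)
 print(f"ADIs changed by {change:+.1f}% from last week")   # prints +10.0%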

A good metric is understandable. When you present your metrics to someone else, they should be able to immediately understand what the metric represents and what it means. If you have to spend time explaining the data and the formulas behind how it was derived, your metric is not a good one.

A good metric is a ratio or a rate. If your metrics are comparative then they can be expressed easily as a ratio or a rate. If you collect a set of metrics over a period of time you should be able to determine long- and short-term trends and identify spikes.
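A small sketch of that (Python, made-up numbers): track the fix/find ratio per week and flag weeks that deviate sharply from the recent trend.

 # Track a rate (bugs fixed / bugs filed) per week and flag weeks whose
 # ratio deviates sharply from the recent trend. Data below is made up.
 filed = [40, 38, 45, 42, 90, 41]    # hypothetical bugs filed per week
 fixed = [35, 36, 40, 39, 44, 40]    # hypothetical bugs fixed per week
 ratios = [fx / fi for fx, fi in zip(fixed, filed)]
 for week, ratio in enumerate(ratios, start=1):
     recent = ratios[max(0, week - 4):week - 1]    # up to three prior weeks
     baseline = sum(recent) / len(recent) if recent else ratio
     marker = "  <-- spike" if abs(ratio - baseline) > 0.3 * baseline else ""
     print(f"week {week}: fix/find ratio {ratio:.2f}{marker}")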

A good metric is actionable. This is the most important part: a good metric tells you that something needs to be done and reflects the success or failure of that action. It may not tell you the exact solution, but it should suggest a plan of action.

Reporting vs Exploratory: Reporting metrics are straightforward; they report on what's going on. Example: how many bugs were filed this iteration. Exploratory metrics are the data you go looking for to explain the reports.

Danger of Metrics: You will find bugs where you look for them; the harder you look, the more you will find. Any metric that is used to determine personal performance or is tied to a reward will be gamed. (New cars for everyone!)
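As a concrete (and hedged) version of the reporting example above, a Bugzilla REST query can count the bugs filed against a product during one iteration. The product name, the dates, and the limit=0 behaviour are assumptions, not something from these notes:

 # Sketch: count bugs filed against one product during an iteration, using
 # the Bugzilla REST API. Product, dates, and limit=0 support are assumptions.
 import requests
 BUGZILLA = "https://bugzilla.mozilla.org/rest/bug"
 params = {
     "product": "Firefox OS",              # hypothetical product
     "creation_time": "2014-07-07",        # assumed iteration start
     "include_fields": "id,creation_time",
     "limit": 0,                           # 0 = no limit, if the instance allows it
 }
 bugs = requests.get(BUGZILLA, params=params, timeout=60).json().get("bugs", [])
 filed = [b for b in bugs if b["creation_time"] < "2014-07-21"]   # assumed iteration end
 print(f"Bugs filed this iteration: {len(filed)}")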

What questions do we want answered:

  • Find/Fix Rate
    • Opened and Reopened
    • Limited to a development cycle
  • Find/Deferred rate
    • Bugs that get moved to the next release
  • Bug Value
    • Was it blessed as a blocker, for example? Priority/Severity
  • Where should we put effort
    • Test failures
    • % of intermittents
    • Coverage of test suites
    • What areas of the product have the most risk
    • Bug age - the older the bug the less likely it is to get addressed?
    • Enhancements vs Other bugs
      • Filter out enhancements that are not being addressed
  • Keep track of the needinfo flags and qawanted flags
  • Team health metrics
    • time to response
  • How many bugs should we verify? What bugs do we think need to be verified?
    • Workload metrics
  • Bugs found in the field
    • Counters the stats based on internal efforts
    • Crash stats also remove the need for user effort and are unbiased by effort or reporter
  • Bugs filed by staff vs volunteers
  • Community building metrics
    • are we engaging more people?
    • are they effective?
    • where should we guide community efforts?
    • Contributor Time to Live
  • team health
  • Wish List
    • Heat map on test failures - what tests are code problems vs test problems
      • action plan - identify the tests that need to be investigated and determine how to improve reliability
    • Find/Fix rate by component (see the sketch after this list)
      • Action depends on the phase of the release
        • end of release - project stability
        • start of release - testing effectiveness
      • Used to determine if we have a problem in the release
      • Can be used as a Go/No Go flag for a feature
      • Gating criteria for a feature being released
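A minimal sketch of the find/fix rate, broken down by component (Python; the component names, counts, and the 60% threshold are illustrative, and in practice the counts would come from Bugzilla queries scoped to the cycle's start and end dates):

 # Find/fix rate per component for one development cycle (illustrative data).
 opened_or_reopened = {"Graphics": 58, "Networking": 34, "DOM": 71}
 fixed = {"Graphics": 41, "Networking": 30, "DOM": 39}
 for component, found in sorted(opened_or_reopened.items()):
     rate = fixed.get(component, 0) / found
     flag = "  <-- needs attention" if rate < 0.6 else ""   # threshold is arbitrary
     print(f"{component}: {fixed.get(component, 0)}/{found} fixed ({rate:.0%}){flag}")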

- flags for attention

    • Regressions release over release by component
      • See above
      • Focus automation coverage and manual testing
      • If it is high, it flags the need for more integration testing
    • Contributor stats
      • [kthiessen] Swag-Tags: Comment or whiteboard with the text "::welldone:: <contributor name>" to label bugs/comments/contributions worthy of swag per week.
      • If the rate of bugs from volunteers vs staff goes down, react by engaging the community better.
        • Use staff as the control for level of community involvement
      • One metric for filed bugs and one for filed bugs that are marked fixed. (bug value)
        • for fxos this would be if the bug made it on to the blocker list.
      • Track dates - when was the last time a person contributed
        • If a long term / very active contributor has fallen off - follow up on what happened
        • Time invested in contributor vs time contributor invested in us. (Wish list!)
          • More time invested in contributors can indicate a gap in our documentation - not tracked on an individual basis
      • For WebQA - look at the GitHub repos and measure contributor pull requests over time
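A hedged sketch of that last point: count pull requests per contributor per month for one repository via the public GitHub REST API. The repository name is a placeholder, and real use would need pagination and an auth token to stay within rate limits.

 # Count pull requests per contributor per month for one repository (sketch).
 from collections import Counter
 import requests
 REPO = "mozilla/example-webqa-tests"    # placeholder repository name
 pulls = requests.get(f"https://api.github.com/repos/{REPO}/pulls",
                      params={"state": "all", "per_page": 100}, timeout=60).json()
 by_month = Counter((pr["user"]["login"], pr["created_at"][:7]) for pr in pulls)
 for (login, month), count in sorted(by_month.items()):
     print(f"{month}  {login}: {count} pull request(s)")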

Create individual dashboards to allow contributors to track their progress

  • Be careful that people don't get scared off by fear of being judged
  • Use it to recognize contributors' efforts