The Canary Project

QA Risk Indicators

Project Canary is about identifying potential risks early in the development process so they can be analyzed and addressed before they become bugs in the release version. Ideally these indicators (canaries) can be reported on a wiki page/dashboard for easy reference and be used to help focus our testing efforts on release.

Step 1: Find the hidden tracking bugs that have been recently updated
Step 2: Identify unusual amounts of code change in any given release/file.
Step 3: Display this data in an easily accessible page.

Bugzilla MediaWiki Extension Update

Adds the ability to do client-side filtering of data that cannot easily be done with the existing REST APIs in the extension.

Added display type 'filter'
options
- filter_on='<field_name>'
- filter_op='<[gt,lt,eq,ne]>'
- filter_value='<field value or array size>'

Example: bugzilla display='filter' filter_on='depends_on' filter_op='gt' filter_value='3'

This will only display bugs that have more than 3 depends_on bugs. These options are usable with any standard bugzilla extension query.
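The filter semantics described above can be sketched in a few lines of Python. This is only an illustration, not the extension's actual code: the function names `matches` and `apply_filter` and the sample bug records are hypothetical, and it assumes that list-valued fields such as depends_on are compared by their size while other fields are compared as strings.

```python
import operator

# Comparison functions keyed by the filter_op values listed above.
OPS = {"gt": operator.gt, "lt": operator.lt, "eq": operator.eq, "ne": operator.ne}


def matches(bug, filter_on, filter_op, filter_value):
    """Return True if the bug passes the filter.

    Assumption: list-valued fields (such as depends_on) are compared by
    their length; any other field is compared as a string, so 'gt' and
    'lt' on such fields are lexical comparisons.
    """
    field = bug.get(filter_on)
    if isinstance(field, list):
        return OPS[filter_op](len(field), int(filter_value))
    return OPS[filter_op](str(field), str(filter_value))


def apply_filter(bugs, filter_on, filter_op, filter_value):
    """Keep only the bugs that pass the filter."""
    return [b for b in bugs if matches(b, filter_on, filter_op, filter_value)]


# Hypothetical records, shaped like a query result with a depends_on list.
bugs = [
    {"id": 1, "depends_on": [10, 11]},
    {"id": 2, "depends_on": [20, 21, 22, 23]},
    {"id": 3, "depends_on": []},
]
kept = apply_filter(bugs, "depends_on", "gt", "3")
print([b["id"] for b in kept])
```

Only the bug with four depends_on entries survives the filter in this sample.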

Why modify the extension?

Not all tracking bugs are marked as such.
Not all tracking bugs are equal.
- Many bugs have only one or two depends_on bugs. For the most part these do not represent a significant amount of work being done.
- There is no means of limiting a Bugzilla query based on the number of tracking bugs.

This extension modification allows a more fine-grained approach to querying data than is supported by the standard REST APIs used by the extension or by the Bugzilla query interface.

HG-Metrics

HG-Metrics collects data from the Mercurial repository for each branch (mozilla-central, mozilla-aurora, mozilla-beta and mozilla-release) and stores it in a local SQL database (either SQLite or MySQL). It also collects data from Bugzilla for all bugs that have NOT been marked as invalid in some manner (e.g. "invalid", "worksforme", "wontfix", "duplicate", "expired", "support").

The scripts process all of this data and create a number of different stats each night.

Churn - The scripts calculate the average rate of change per file, or 'churn', over its lifetime and the actual rate of change per release. This allows the scripts to identify files with an unusually high rate of change for a given release. This can be an indicator of areas of interest for additional testing.
Regression Rates - The Bugzilla data is linked with the commit data from Mercurial to determine how many of the bugs fixed were marked with the regression keyword. This is then used to calculate the number of regressions filed per number of lines changed per file. Files with higher rates of regressions can indicate higher levels of risk for changes to these files, and patches for these files should receive extra attention.
Backout Rate - Similar to the Regression Rate report, each check-in is checked for the backout flag, and the scripts calculate how often changes to each file have to be backed out for any reason. This can be another indicator of potential instability or higher levels of risk for any changes made to these files.
Team Stats - All of the above data is then collected and reported per team (determined using phone book data).
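The churn calculation described above can be sketched as follows. This is only an illustration, not the hg-metrics implementation: the input rows, the function name `high_churn`, and the choice to express the deviation as a percentage above the file's lifetime average are all assumptions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical input rows: (file_name, release, lines_changed).
changes = [
    ("a.cpp", "beta-38", 100),
    ("a.cpp", "beta-39", 100),
    ("a.cpp", "beta-40", 400),
    ("b.js", "beta-38", 50),
    ("b.js", "beta-39", 50),
    ("b.js", "beta-40", 50),
]


def high_churn(rows, release, minimum_change):
    """Return files whose change in `release` is at least `minimum_change`
    percent above their average change per release (an assumed metric)."""
    per_file = defaultdict(dict)
    for file_name, rel, lines in rows:
        per_file[file_name][rel] = per_file[file_name].get(rel, 0) + lines

    flagged = []
    for file_name, by_release in per_file.items():
        average = mean(by_release.values())  # lifetime average per release
        actual = by_release.get(release, 0)  # change in the requested release
        if average == 0:
            continue  # no recorded changes, nothing to compare against
        deviation = (actual - average) / average * 100
        if deviation >= minimum_change:
            flagged.append((file_name, round(deviation, 1)))
    # Highest deviation first, so the riskiest files lead the report.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


print(high_churn(changes, "beta-40", 20))
```

In this sample only a.cpp is flagged, because its beta-40 change is twice its average while b.js stays flat.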

All of this data is then made available through a MediaWiki extension which supports a number of different queries. The extension uses the "hgm" tag to indicate the JSON data block to be parsed. Parameters can be passed at the tag level to indicate how to display the data. The two primary views are 'table' and 'filter'. The 'table' display type displays all of the records returned from the selected query.

 <hgm display='table'>
 {
 "release":"beta-40",
 "minimum_change":"20"
 }
 </hgm>

If you wish to restrict the report to either include or exclude a particular set of data, you can use the 'filter' display type and specify the type of filter to be used.

filter_on = the name of the field to apply the filter to.
filter_op = the type of operation: 'like', 'notlike', 'gt' (greater than), 'lt' (less than), 'ne' (not equal) or 'eq' (equal).
filter_value = the value to be matched against the contents of the field filtered on.

The example under Churn Report Options below filters out all file names that contain the word 'test' from the query results.

report = type of report to display.
- 'churn' - rate of change report for given release. (default report if not specified)
- 'file_regression_history' - basic regression/backout rate report
- 'team_regression_history' - team stats

There are two basic types of report: 'churn' (the default) and 'history', which has some additional sub-report types.

Each report type has a set of other parameters that specify exactly what the report should look like.

Churn Report Options

- 'release':'<branch name>' This accepts the standard SQL wildcard '%'. For example, 'nightly-%' will return data for all of the nightly releases.
- 'minimum_change':'<int>' This determines the minimum deviation above the standard rate of change for each file to be included in the report.
- 'latest':<'true'|'false'> If this flag is set to true, the report will attempt to find the most recent release for the given branch. So if release is 'beta' and latest is true, it will display the most recent beta branch. If it is false, it will display all data that matches the release name.

<hgm display='filter' filter_on='file_name' filter_op='notlike' filter_value='test'>
 {
 "report":"churn",
 "release":"beta%",
 "minimum_change":"20",
 "latest":"true"
 }
</hgm>
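One way the options in the example above could translate into a database query is sketched below with Python's sqlite3 module. The table name `churn`, its columns, the sample rows, and the function `churn_report` are hypothetical and only illustrate the wildcard, 'minimum_change', and 'latest' behaviour. The real hg-metrics schema and extension code may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and sample rows, for illustration only.
conn.execute(
    "CREATE TABLE churn (file_name TEXT, release_name TEXT, "
    "release_date TEXT, deviation REAL)"
)
conn.executemany(
    "INSERT INTO churn VALUES (?, ?, ?, ?)",
    [
        ("src/a.cpp", "beta-39", "2000-01-01", 35.0),
        ("src/a.cpp", "beta-40", "2000-02-01", 42.0),
        ("src/test_a.js", "beta-40", "2000-02-01", 90.0),
        ("src/b.cpp", "beta-40", "2000-02-01", 5.0),
    ],
)


def churn_report(conn, release, minimum_change, latest):
    """Select rows matching the release pattern and minimum deviation."""
    sql = (
        "SELECT file_name, release_name, deviation FROM churn "
        "WHERE release_name LIKE ? AND deviation >= ?"
    )
    params = [release, minimum_change]
    if latest:
        # Restrict to the most recent release matching the pattern.
        sql += (
            " AND release_date = (SELECT MAX(release_date) FROM churn"
            " WHERE release_name LIKE ?)"
        )
        params.append(release)
    sql += " ORDER BY deviation DESC"
    return conn.execute(sql, params).fetchall()


rows = churn_report(conn, "beta%", 20, latest=True)
# Client-side 'notlike' filter on file_name, as in the tag attributes above.
rows = [r for r in rows if "test" not in r[0]]
print(rows)
```

Values are passed as bound parameters rather than concatenated into the SQL string, which keeps wildcard patterns such as 'beta%' from being interpreted as anything other than data.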

History Reports

File Regression History Report
- "title":"File Regression Rate" - Text displayed at top of report
- "group":"file_id" - Field name used to group the data by - file_id will group data by files
- "order":"regression_rate DESC" - display order. This is a SQL order clause added to the query. You specify either "DESC" for descending order or "ASC" for ascending order for the field specified.

<hgm type='history' display="table" >
{
 "title":"File Regression Rate",
 "report":"file_regression_history",
 "group":"file_id",
 "order":"regression_rate DESC"
}
</hgm>
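The "group" and "order" options above could be turned into a grouped query along the following lines. This is a sketch under stated assumptions: the table `file_changes`, its columns, and the function `history_query` are hypothetical; the regression rate is taken as regressions per line changed, as described earlier, and the backout rate is assumed to be the fraction of check-ins flagged as backouts. Because "order" ends up in the SQL text, the sketch only accepts values from a fixed set.

```python
import sqlite3

# Accepted values; anything else is rejected before it reaches the SQL text.
GROUP_FIELDS = {"file_id"}
ORDER_FIELDS = {"regression_rate", "backout_rate"}


def history_query(group, order):
    """Build the report query from the "group" and "order" options."""
    field, _, direction = order.partition(" ")
    direction = direction.strip().upper() or "ASC"
    if (
        group not in GROUP_FIELDS
        or field not in ORDER_FIELDS
        or direction not in {"ASC", "DESC"}
    ):
        raise ValueError("unsupported group or order value")
    return (
        f"SELECT {group}, SUM(lines_changed) AS total_lines, "
        "SUM(regressions) * 1.0 / SUM(lines_changed) AS regression_rate, "
        "AVG(is_backout) AS backout_rate "
        f"FROM file_changes GROUP BY {group} ORDER BY {field} {direction}"
    )


conn = sqlite3.connect(":memory:")
# Hypothetical per-check-in rows: file, lines changed, regressions, backout flag.
conn.execute(
    "CREATE TABLE file_changes (file_id INTEGER, lines_changed INTEGER, "
    "regressions INTEGER, is_backout INTEGER)"
)
conn.executemany(
    "INSERT INTO file_changes VALUES (?, ?, ?, ?)",
    [(1, 100, 2, 0), (1, 100, 0, 1), (2, 50, 2, 0), (2, 50, 1, 0)],
)

for row in conn.execute(history_query("file_id", "regression_rate DESC")):
    print(row)
```

Changing the order to "backout_rate DESC" reuses the same query and only changes which files appear first, which matches how the backout report is described as a re-sorted regression report.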

File Back Out Rate Report

This report is almost identical to the regression history report with just a different sort order to highlight the important data.

<hgm type='history' display="table" >
{
 "title":"File Backout Rate",
 "report":"file_regression_history",
 "group":"file_id",
 "order":"backout_rate DESC"
}
</hgm>

team_regression_history
- group - how the teams should be grouped; 'manager' is the default
- order - the field to sort the data by
- include_fields - this can be used to specify exactly which data fields you would like to display for this report

<hgm type='history' display="table">
{
 "include_fields":"manager,department,lines_changed,regressions,regression_rate,backouts,backout_rate",
 "title":"Regression History Test Chart 2",
 "report":"team_regression_history",
 "group":"manager",
 "order":"regression_rate DESC"
}
</hgm>

Examples

Churn Report

<hgm display='filter' filter_on='file_name' filter_op='notlike' filter_value='test'>
{
 "report":"churn",
 "release":"beta%",
 "minimum_change":"20",
 "latest":"true"
}
</hgm>

File Regression Rate

<hgm type='history' display="table" >
{
 "title":"File Regression Rate",
 "report":"file_regression_history",
 "group":"file_id",
 "order":"regression_rate DESC"
}
</hgm>

File Backout Rate

<hgm type='history' display="table">
{
 "title":"File Backout Rate",
 "report":"file_regression_history",
 "group":"file_id",
 "order":"backout_rate DESC"
}
</hgm>

Team Stats

<hgm type='history' display="table" >
{
 "include_fields":"manager,department,lines_changed,regressions,regression_rate,backouts,backout_rate",
 "title":"Regression History Test Chart 2",
 "report":"team_regression_history",
 "group":"manager",
 "order":"regression_rate DESC"
}
</hgm>

Back-end data collector hg-metrics on GitHub
- Python script that collects the hg logs, analyzes the data, and stores the results in a SQLite database.
MediaWiki extension for querying the database and displaying the results based on user-specified parameters
- Based on the Bugzilla MediaWiki extension

The Canary

Currently the above reports can be found on: http://10.252.28.88:8888/wiki/index.php/Main_Page/RiskReport
