Bugzilla Anthropology/2013-01-29


ElasticSearch Summary

ElasticSearch (ES) is a fast and scalable document store. Each Bugzilla bug is extracted, stripped of comments and titles, and inserted into ES as a series of JSON documents, each representing a point in the bug's history. This is done for all bugs, including security bugs.
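The extraction described above can be sketched roughly as follows. This is a minimal illustration, not the actual ETL code: the field names (`short_desc`, `comments`, `modified_ts`, `expires_on`) and the shape of the change list are assumptions made for the example.

```python
# Hypothetical field names; the real ES schema is not shown on this page.
PRIVATE_FIELDS = {"short_desc", "comments"}  # titles and comments are removed


def bug_snapshots(bug_id, created_ts, initial_fields, changes):
    """Return one document per point in a bug's history.

    `changes` is a list of (timestamp, field, new_value) tuples, assumed to be
    sorted by timestamp. Each document is valid from `modified_ts` until
    `expires_on`; the current snapshot has `expires_on` set to None.
    """
    state = {k: v for k, v in initial_fields.items() if k not in PRIVATE_FIELDS}
    docs = [{"bug_id": bug_id, "modified_ts": created_ts, **state}]

    for ts, field, value in changes:
        if field in PRIVATE_FIELDS:
            continue  # never store titles or comments
        state = dict(state)  # copy so earlier snapshots stay unchanged
        state[field] = value
        docs.append({"bug_id": bug_id, "modified_ts": ts, **state})

    # Each snapshot expires when the next one begins.
    for current, following in zip(docs, docs[1:]):
        current["expires_on"] = following["modified_ts"]
    docs[-1]["expires_on"] = None
    return docs
```

Storing a full snapshot per change trades space for query simplicity: a question such as "what did this bug look like on a given date" becomes a range filter on two fields rather than a replay of the change log.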

Publicizing ES Data

Currently we are going through a security review to identify the changes required to publicize the ES data.

  1. The academic community has an interest in Mozilla's rich bug repository
    • Olga Baysal – University of Waterloo - Specifically looking at review times and how the rapid release cycle has changed them: [1]
    • Sean O'Riordain – PhD into the statistics of bugs in software at Trinity College, Dublin, Ireland
    • David Eaves – Has interest in what motivates/discourages volunteer contributions at Mozilla: [2]
    • Working Conference on Mining Software Repositories (MSR) – May focus on the Mozilla codebase in 2014 [3]
  2. Increasing Mindshare - We assume the positive effect the BZ REST interface had on the number of dashboard tools can be amplified further given the speed of ES.

Current Work

Review Queues

Initial work focused on summarizing the review queues. The majority of the work was overcoming the technical limitations of ES.


BA Reviews.jpg

Open Bug Counts

Counting open bugs by program/product/component and team. Again, significant work was required to keep it fast.


BA OpenBugs.jpg
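As a rough illustration of the counting involved, the sketch below tallies open bugs per product/component at a given moment from history snapshots. The document fields (`modified_ts`, `expires_on`, `bug_status`, `product`, `component`) and the set of "open" statuses are assumptions for the example, and the real dashboards run this kind of query inside ES rather than in Python.

```python
from collections import Counter

# Assumed set of statuses that count as "open".
OPEN_STATUSES = {"UNCONFIRMED", "NEW", "ASSIGNED", "REOPENED"}


def open_bug_counts(snapshots, at_ts):
    """Count open bugs per (product, component) as of time `at_ts`.

    Each snapshot is a dict that is valid from `modified_ts` (inclusive) to
    `expires_on` (exclusive); `expires_on` is None for the current snapshot.
    """
    counts = Counter()
    for doc in snapshots:
        expires = doc["expires_on"]
        live = doc["modified_ts"] <= at_ts and (expires is None or at_ts < expires)
        if live and doc["bug_status"] in OPEN_STATUSES:
            counts[(doc["product"], doc["component"])] += 1
    return counts
```

Because at most one snapshot per bug is live at any instant, counting live snapshots is equivalent to counting bugs.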

Percentile Ages

Long-term programs can benefit from looking at percentiles of bug age over time. We can see the positive effects of focusing on security, and the negative effects of de-emphasizing Snappy:

BA Security.jpg
BA Snappy.jpg
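A percentile-age chart of this kind can be approximated as follows. This is a sketch under stated assumptions: bugs are given as hypothetical `(created_ts, closed_ts)` pairs, and the nearest-rank definition of a percentile is used, which may differ from the method behind the charts.

```python
import math


def percentile(sorted_values, p):
    """Nearest-rank percentile of a non-empty sorted list, with 0 < p <= 100."""
    rank = math.ceil(p / 100 * len(sorted_values))
    return sorted_values[max(rank, 1) - 1]


def age_percentiles(bugs, at_ts, percentiles=(50, 90)):
    """Percentile ages of the bugs that are open at time `at_ts`.

    `bugs` is an iterable of (created_ts, closed_ts) pairs, where closed_ts is
    None for bugs that are still open. Returns {} if no bug is open at `at_ts`.
    """
    ages = sorted(
        at_ts - created
        for created, closed in bugs
        if created <= at_ts and (closed is None or at_ts < closed)
    )
    if not ages:
        return {}
    return {p: percentile(ages, p) for p in percentiles}
```

Evaluating `age_percentiles` at a series of dates gives one line per percentile. A rising 90th percentile suggests old bugs are being left behind, while a falling one suggests the backlog is being worked down.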

Operational Dashboards

Simpler, current dashboards have been made to focus on particular issues:

  • []



Parallel Efforts

Despite what exists, there is a distinct need for more tools to better manage the large number of issues BZ deals with:

  • B-Team – Working directly on BZ to improve its dashboards
  • David Bosewell – interest from the community perspective: needs to measure the effectiveness of the community programs
  • Liz Henry – Looking into bug triage practices
  • Marco Mucci – Currently using [Scrumbugs] to track Metro, but needs better tools
  • UX Team – Has settled on BZ for tracking bugs, but needs tools to manage its work
  • Release Engineering – Has hired an intern to produce operational dashboards that sort tracking bugs by component, priority, and assignee

ElasticSearch (Technical Summary)

Despite ES's technical limitations, the current JavaScript libraries give us both fast and expressive dashboarding capability: scanning 7 million documents in sub-second time.

  • ES Highlights
    • Fast - Automatically indexed, and in memory.
    • Scalable - Sharding assumes every document can stand alone
    • Extensible - MVEL scripting language allows arbitrary code to be run on the server-side
    • Limited Filtering - ES was designed for document search. BI queries require complicated filtering rules across multiple relations, and ES's nested filters fall short of that.
    • Limited Grouping - ES is designed only for simple grouping and document counting
  • Enhancements To Date
    • JavaScript library to convert SQL-like queries to ES/MVEL queries
    • JavaScript DB implementation to perform the joins and sophisticated calculations on the client
  • Further Issues
    • Poor Stability - Need more human resources to identify problems and tune the existing ES cluster
    • Cluster Too Small - The hardware is 5 years old, and was never set up to run production queries
    • Not Centralized - Other projects are using their own ES instances. Spreading the work over one large cluster would tame the relative usage peaks.
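To give a feel for what converting an SQL-like query into an ES query involves, here is a minimal sketch. It is not the JavaScript library mentioned above: the input format (`where`, `groupby`) is invented for the example, and the output targets the current ES query DSL (`bool`/`filter` with a `terms` aggregation) rather than whatever the cluster described here accepts, which may differ.

```python
def to_es_query(query):
    """Translate a small SQL-like query description into an ES request body.

    `query` is a dict with optional keys (both hypothetical):
      "where":   {field: value} for equality, or {field: [v1, v2]} for IN
      "groupby": a single field name to count documents by
    """
    filters = []
    for field, value in query.get("where", {}).items():
        if isinstance(value, (list, tuple, set)):
            filters.append({"terms": {field: sorted(value)}})  # field IN (...)
        else:
            filters.append({"term": {field: value}})  # field = value

    # size 0: only the counts are wanted, not the matching documents.
    body = {"size": 0, "query": {"bool": {"filter": filters}}}

    if "groupby" in query:
        field = query["groupby"]
        body["aggs"] = {field: {"terms": {"field": field}}}
    return body
```

For example, `to_es_query({"where": {"product": "Core"}, "groupby": "component"})` describes "count documents per component where product is Core". Anything beyond this, such as joins across bugs or derived columns, is what the client-side calculations listed above are meant to cover.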