Bugzilla Anthropology/2013-01-29
Bugzilla Anthropology January 29th 2013
ElasticSearch
ElasticSearch (ES) is a fast and scalable document store. Each Bugzilla bug is extracted, stripped of its comments and title, and inserted into ES as a series of JSON documents, each representing a point in the bug's history. This is done for all bugs, including security bugs.
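As a rough illustration of "a series of JSON documents, each representing a point in the bug's history", the sketch below expands one bug and its ordered field changes into snapshot documents. The function name, the `valid_from`/`valid_to` fields, and the input layout are assumptions made for this example only; they are not the actual schema used in the ES cluster.

```python
import json


def bug_snapshots(bug_id, created_ts, initial_fields, changes):
    """Expand one bug into a list of snapshot documents.

    `changes` is a list of (timestamp, field, new_value) tuples.
    All field names here are hypothetical, not the real ES schema.
    """
    state = dict(initial_fields)
    valid_from = created_ts
    docs = []
    for ts, field, new_value in sorted(changes, key=lambda c: c[0]):
        # Emit the state that was valid up to this change; changes that
        # share a timestamp are folded into a single snapshot.
        if ts > valid_from:
            docs.append({"bug_id": bug_id, "valid_from": valid_from,
                         "valid_to": ts, **state})
            valid_from = ts
        state[field] = new_value
    # The last snapshot is the current state and has no end time.
    docs.append({"bug_id": bug_id, "valid_from": valid_from,
                 "valid_to": None, **state})
    return docs


if __name__ == "__main__":
    history = [(200, "status", "ASSIGNED"), (300, "status", "RESOLVED")]
    for doc in bug_snapshots(1, 100, {"status": "NEW"}, history):
        print(json.dumps(doc))
```

Because each snapshot carries everything needed to describe the bug at that point, a question such as "what did the bug look like at time T" becomes a filter on a single document rather than a replay of the change log.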
Publicizing ES Data
Currently we are going through a security review to identify the changes required to publicize the ES data.
The academic community has an interest in Mozilla's rich bug repository:
- Olga Baysal – University of Waterloo - Specifically looking at review times and how the rapid release cycle has changed them: [1]
- Sean O'Riordain – PhD research into the statistics of software bugs at Trinity College, Dublin, Ireland
- David Eaves – Has interest in what motivates/discourages volunteer contributions at Mozilla: [2]
- Working Conference on Mining Software Repositories (MSR) – May focus on the Mozilla codebase in 2014 [3]
We assume the positive effect the BZ REST interface had on the number of tools can be amplified further given the speed of ES:
- David Bosewell maintains a [page of dashboards]
- [B2G Dashboard]
- [Firefox OS]
Current Work
Parallel Efforts
Despite what exists, there is a distinct need for more tools to better manage the large number of issues BZ deals with:
- B-Team – Working directly on BZ to improve its dashboards
- David Bosewell – interest from the community perspective: needs to measure the effectiveness of the community programs
- Liz Henry – Looking into bug triage practices
- Marco Mucci – Currently using Scrumbugs to track Metro, but needs better tools
- UX Team – Has settled on BZ for tracking bugs, but needs tools to manage its work
- Release Engineering – Has hired an intern to produce operational dashboards: to sort tracking bugs by component/priority and assignee.
ElasticSearch (Technical Summary)
Despite ES's technical limitations, the current JavaScript libraries give us fast and expressive dashboarding capability: scanning 7 million documents takes less than a second.
- ES Highlights
  - Fast - Automatically indexed, and in memory.
  - Scalable - Sharding assumes every document can stand alone
  - Extensible - MVEL scripting language allows arbitrary code to be run on the server side
  - Limited Filtering - ES was designed for document search. BI queries require complicated filtering rules across multiple relations. ES's nested filters do not compare.
  - Limited Grouping - ES is designed only for simple grouping and document counting
- Enhancements To Date
  - JavaScript library to convert SQL-like queries to ES/MVEL queries
  - JavaScript DB implementation to perform the joins and sophisticated calculations on the client
- Further Issues
  - Poor Stability - Need more human resources to identify problems in, and tune, the existing ES cluster
  - Cluster Too Small - The hardware is 5 years old, and was never set up to run production queries
  - Not Centralized - Other projects are running their own ES instances; spreading the work over one large cluster will tame the relative usage peaks.
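To give a feel for the kind of translation listed under the enhancements above (SQL-like query in, ES request out), here is a minimal sketch in Python. The input format (`where`, `groupby`, `limit`) and the function name are hypothetical and do not reflect the actual JavaScript library. The output uses the `terms` aggregation from the current ES query DSL; the ES version in use in 2013 would have relied on the older facets API instead.

```python
def to_es_query(query):
    """Translate a tiny SQL-like description into an ES request body.

    Equivalent in spirit to:
        SELECT <groupby>, COUNT(*) FROM bugs WHERE <field> = <value> ...
        GROUP BY <groupby>
    The input keys are assumptions made for this sketch.
    """
    # Each equality condition becomes a non-scoring term filter.
    filters = [{"term": {field: value}}
               for field, value in query.get("where", {}).items()]
    es_filter = {"bool": {"filter": filters}} if filters else {"match_all": {}}

    # size=0: return only the grouped counts, not the matching documents.
    body = {"size": 0, "query": es_filter}

    group_field = query.get("groupby")
    if group_field:
        body["aggs"] = {
            group_field: {
                "terms": {"field": group_field,
                          "size": query.get("limit", 10)}
            }
        }
    return body


if __name__ == "__main__":
    import json
    print(json.dumps(
        to_es_query({"where": {"status": "NEW"}, "groupby": "component"}),
        indent=2))
```

This covers only the simple grouping and counting that ES handles natively; anything resembling a join would still have to be computed on the client, as described in the list above.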