Auto-tools/Projects/BugHunter: Difference between revisions

 
(48 intermediate revisions by the same user not shown)
Line 1: Line 1:
= Team =
= Team =
Bob Clary (bc) - Responsibilities include schema design and implementation, system architecture, data generation, database/webserver administration... all things data'ish.
Bob Clary (bc) - Responsibilities include all things data'ish.


Jonathan Eads (jeads) - Responsibilities include webservice/UI design and implementation... all things web'ish.
Jonathan Eads (jeads) - Responsibilities include all things web'ish.


Mark Cote (mcote) - Responsibilities include admin webservice/UI... all things admin web'ish.
Mark Cote (mcote) - Responsibilities include all things admin web'ish.


= Overview =
= Overview =
Line 10: Line 10:
The purpose of Bughunter is to help detect bugs in Mozilla software products and get them fixed.  Bughunter includes a data collection/storage system for managing metadata associated with Firefox site/unit test data and a comprehensive UI for analyzing that data.
The purpose of Bughunter is to help detect bugs in Mozilla software products and get them fixed.  Bughunter includes a data collection/storage system for managing metadata associated with Firefox site/unit test data and a comprehensive UI for analyzing that data.


Bughunter data is separated into two top-level categories: site data, which comes from testing Firefox on specific URLs that generate crash reports, and unit test data.  There are three types of metadata in these two categories: crashes, assertions, and valgrinds.  These data types are generated across a set of virtual machines that emulate a variety of operating systems (MacOSX, Linux, Windows) and machine architectures in an effort to characterize a given bug's platform-specific behavior.
Bughunter data is separated into two top-level categories: site data, which comes from testing Firefox on specific URLs that generate crash reports, and unit test data.  There are three types of metadata in these two categories: crashes, assertions, and valgrinds.  These data types are generated across a set of virtual machines that emulate a variety of operating systems (MacOSX, Linux, Windows) and build/machine architectures (32/64 bit) in an effort to characterize a given bug's platform-specific behavior.


URLs associated with site crash data found in the [[Socorro]] database are pulled into bughunter in an effort to reproduce and further characterize crash reports by collecting additional metadata on different platforms.
URLs associated with site crash data found in the [[Socorro]] database are pulled into bughunter in an effort to reproduce and further characterize crash reports by collecting additional metadata on different platforms.
Line 17: Line 17:


All bughunter source can be found at http://hg.mozilla.org/automation/sisyphus
All bughunter source can be found at http://hg.mozilla.org/automation/sisyphus
= Design and Approach =
One of the goals of the bughunter webservice and UI system design is to represent data generically enough that new data types can be added by modifying JSON configuration files.  The system will likely be extended with different data types for different products, and the UI was designed with this in mind: the impact of new data on the architecture and source code should be minimal, and the core concepts in the UI and architecture should stay the same.  The UI can represent data in tabular or graphical form, and a set of controls is provided for filtering data and for connecting one data display to another.  Any data type can be connected to any other data type if they have a field in common.  This is referred to as signaling.
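The "field in common" rule can be illustrated with a minimal Python sketch.  The view names and field lists here are hypothetical stand-ins, not taken from the project's JSON configuration files, and the helper functions are illustrative only.

```python
# Hypothetical view definitions; the real ones live in the project's
# JSON configuration files and may be structured differently.
VIEWS = {
    "site_crashes": {"fields": ["signature", "url", "total_count"]},
    "site_crash_url_summary": {"fields": ["url", "total_count", "platform"]},
    "unit_test_assertions": {"fields": ["assertion", "location"]},
}


def shared_fields(parent, child, views=VIEWS):
    """Return the fields two views have in common, in the parent's order."""
    child_fields = set(views[child]["fields"])
    return [f for f in views[parent]["fields"] if f in child_fields]


def can_signal(parent, child, views=VIEWS):
    """Two views can be connected only if they share at least one field."""
    return bool(shared_fields(parent, child, views))
```

Under these assumed definitions, "site_crashes" could signal "site_crash_url_summary" through either shared field, but could not signal "unit_test_assertions".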
== Data View ==
The fundamental unit of data display in the bughunter UI is referred to as a "data view".  The default representation of a data view is tabular but can also be graphical.  An example data view is depicted below:
[[File:DataViewPanel.png|700px|center]]
The tabular data shown above is represented in a graphical form called a "Platform Tree" displayed below.  A data view can have any number of tabular or graphical representations.
[[File:DataViewVis.png|700px|center]]
== Data View Controls ==
A set of controls/filters is available for each data view.  Controls on the control panel, depicted below, apply filters to the query associated with this view.  The date range is an exception to this rule: the date range set in the control panel is applied both to the data view it is attached to and to any views that receive signals from that view.  Every data view is constrained to a particular date range, and the date range of a parent view is always sent along with any signal.  This makes it easy to examine a date range with a collection of connected data views.
 
[[File:DataViewControls.png|700px|center]]
== Data View Navigation ==
A hierarchical menu is available on each data view.  This menu can be used to jump to any available data type.  A menu item can be either a single data view or a collection of data views.  When a collection is selected, the page is cleared of all views and a set of data views that operate together is loaded.
[[File:DataViewNav.png|700px|center]]
== Data View Signaling ==
Signals can be sent from one data view to another when they have a field in common.  They are sent by clicking on a link in a tabular data representation or clicking on a selectable region of a graphical representation.  A data view can have any number of child data views that it sends signals to.  In the case of data coming from a relational database, where a SQL query corresponds to the data view, the signal is typically rendered as a constraint in the WHERE clause of that query.  The concept is not limited to relational databases; the same constraint can be applied in the application when querying another webservice or a NoSQL database.  This signaling allows a user to drill down on a complex set of data.  Data views can be displayed as multiple panes in a single browser window or spread out across multiple browser windows to make use of whatever screen capacity is available.
A collection of connected data views is displayed below.  The first view in the collection, "Site Crashes", can send signals to the two child views "Site Related Crashes" and "Site Crash URL Summary".  The signal sent/received is displayed in the middle control panel above the table display.  This allows a user to analyze relationships between these three different data types within a single browser window.
[[File:DataViewSignals.png|700px|center]]
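As a rough illustration of how a signal might become a WHERE clause constraint, the Python sketch below appends a signal and the parent view's date range to a base query as bound parameters.  The function name, the `date` column, the parameter naming scheme, and the pyformat-style placeholders are all assumptions made for this example; they do not describe the actual bughunter or datasource code.

```python
from datetime import date


def apply_signal(base_sql, signal, date_range):
    """Append a signal and the parent's date range as bound parameters.

    base_sql   -- a SELECT statement with no WHERE clause (assumption)
    signal     -- dict mapping a shared field name to the clicked value
    date_range -- (start, end) pair inherited from the parent view
    """
    start, end = date_range
    # The 'date' column name is hypothetical.
    clauses = ["date BETWEEN %(date_start)s AND %(date_end)s"]
    params = {"date_start": start.isoformat(), "date_end": end.isoformat()}
    for field, value in sorted(signal.items()):
        # Field names come from configuration, never from user input,
        # but reject anything that is not a plain identifier anyway.
        if not field.isidentifier():
            raise ValueError("invalid field name: %r" % (field,))
        clauses.append("%s = %%(sig_%s)s" % (field, field))
        params["sig_" + field] = value
    return base_sql + " WHERE " + " AND ".join(clauses), params
```

Passing the value as a bound parameter rather than interpolating it into the SQL string matters here because crash signatures and URLs are arbitrary text.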


= Architecture =
= Architecture =


=== Webservice ===
=== Webservice ===
The bughunter webservice serves data in JSON.  It accepts a set of named parameters, provided in an HTTP POST, that correspond to different data views.  An HTTP POST was used instead of a GET because of the potentially large size of crash signatures and crash URLs.
The bughunter webservice serves data in JSON.  It accepts a set of named parameters, provided in an HTTP POST, that correspond to different data views.  An HTTP POST was used instead of a GET because of the potentially large size of the crash signatures and crash URLs that need to be passed as parameters, either asynchronously or on page load, depending on the user action.
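A client-side request of this kind could look roughly like the following Python sketch, which uses only the standard library.  The endpoint URL and the parameter names (`view`, `signature`) are hypothetical; the real parameter names are defined by the webservice and are not reproduced here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_view_request(base_url, view_name, params):
    """Build (but do not send) a form-encoded POST for one data view."""
    # 'view' is a hypothetical parameter name used for illustration.
    body = urlencode(dict(params, view=view_name)).encode("utf-8")
    return Request(
        base_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def parse_view_response(raw_bytes):
    """Decode the JSON payload returned by the webservice."""
    return json.loads(raw_bytes.decode("utf-8"))
```

Because the parameters travel in the request body, a multi-kilobyte signature or URL does not run into the URL length limits that a GET query string would.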


The complete source for the webservice can be found in python/sisyphus/webapp/bughunter/views.py.  This file contains two webservices: the admin service and the data view webservice.  The admin service manages reporting status for the VM cluster and the data view webservice provides a dataservice and UI for the sisyphus database.   
The complete source for the webservice can be found in python/sisyphus/webapp/bughunter/views.py.  This file contains two webservices: the admin service and the data view webservice.  The admin service manages reporting status for the VM cluster and the data view webservice provides a dataservice and UI for the sisyphus database.   
Line 30: Line 59:


==== Data Sources ====
==== Data Sources ====
A Python module called datasource.py was used for all SQL/database interactions (https://github.com/jeads/datasource).  Datasource provides an interface to MySQL that allows SQL to be stored in a JSON file with an associated name and host_type (master, read_only, etc.).  To send signals between data views, portions of the SQL have to be generated dynamically.  This is managed by the datasource module, which keeps SQL munging out of the webservice and provides a single location where all static SQL can be found (python/sisyphus/webapp/procs/bughunter.json).  This allows SQL statements to be treated as "stored procedures": every statement is assigned a name and is suitable for re-use by other scripts.
A Python module called datasource was used for all SQL/database interactions (https://github.com/jeads/datasource).  Datasource provides an interface to MySQL that allows SQL to be stored in a JSON file with an associated name and host_type (master, read_only, etc.).  To send signals between data views, portions of the SQL have to be generated dynamically.  This is managed by the datasource module, which keeps SQL munging out of the webservice and provides a single location where all static SQL can be found (python/sisyphus/webapp/procs/bughunter.json).  This allows SQL statements to be treated as "stored procedures": every statement is assigned a name and is suitable for re-use by other scripts.
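The idea of named SQL statements stored in JSON can be sketched in a few lines of Python.  This is not the datasource module's API or its actual file layout: the JSON structure, statement names, table names, and helper functions below are assumptions chosen only to show the name plus host_type lookup.

```python
import json

# Hypothetical layout; the real bughunter.json may be organized differently.
PROCS_JSON = """
{
  "get_site_crashes": {
    "sql": "SELECT signature, url FROM crashes WHERE date >= %s",
    "host_type": "read_only"
  },
  "set_crash_note": {
    "sql": "UPDATE crashes SET note = %s WHERE id = %s",
    "host_type": "master"
  }
}
"""


def load_procs(text):
    """Parse the JSON document that holds all named SQL statements."""
    return json.loads(text)


def get_proc(procs, name):
    """Look up a statement by name; return its SQL and target host type."""
    try:
        proc = procs[name]
    except KeyError:
        raise KeyError("no SQL statement named %r" % (name,)) from None
    return proc["sql"], proc["host_type"]
```

A caller would then pick a connection based on the returned host type and execute the SQL with bound parameters, so that no SQL text needs to appear in the webservice code itself.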


==== JSON Config Files ====
==== JSON Config Files ====
Line 199: Line 228:


=== User Interface ===
=== User Interface ===
The JavaScript that implements the user interface is built using a page/component/collection pattern.  This proved very useful in separating out the required functionality.  Below is a brief definition of what each part of the pattern means in bughunter.
==== Class Definitions ====
'''Page:''' Manages the DOM ready event and implements any top level initialization that's required for the page.  An instance of the page class is the only global variable that other components can access, if they're playing nice.  The page class instance is responsible for instantiating components and storing them in attributes.  The page class also holds any data structures that need to be globally accessible to component classes.
'''Component:''' Contains the public interface of the component.  A component can encapsulate any functional subset/unit provided in a page.  The component will typically have an instance of a View and Model class.  The component class is also responsible for any required event binding.
'''View:''' A component's view class manages interfacing with the DOM. Any CSS class names or HTML id's are defined as attributes of the view.  Any HTML element modification is controlled with this class.
'''Model:''' A component's model manages any asynchronous data retrieval and large data structure manipulation.
'''Collection:''' A class for managing a collection of Components or classes of any type.  A collection can also have a model/view if appropriate.
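The division of responsibility between these classes can be sketched as follows.  The sketch is written in Python rather than the project's JavaScript, purely to show the structure; the class bodies and the "site_crashes" example are hypothetical and far simpler than the real components.

```python
class Model:
    """Owns data retrieval and data structure manipulation."""

    def __init__(self, rows):
        self._rows = rows

    def get_rows(self):
        return list(self._rows)


class View:
    """Owns all rendering; stands in for DOM interaction."""

    def render(self, rows):
        return "\n".join(", ".join(str(v) for v in row) for row in rows)


class Component:
    """Public interface of one functional unit; wires a model to a view."""

    def __init__(self, model, view):
        self.model = model
        self.view = view

    def display(self):
        return self.view.render(self.model.get_rows())


class Collection:
    """Manages a named set of components."""

    def __init__(self):
        self._items = {}

    def add(self, name, component):
        self._items[name] = component

    def get(self, name):
        return self._items[name]


class Page:
    """Top-level object: instantiates components and stores them."""

    def __init__(self):
        self.views = Collection()
        self.views.add("site_crashes", Component(Model([("sig_a", 3)]), View()))
```

Other code reaches a component only through the page instance, for example `Page().views.get("site_crashes").display()`, which mirrors the rule that the page is the single global entry point.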


==== Class Structure ====
==== Class Structure ====
This is not a complete file or class listing.  It is intended to give a top level description of the design pattern used in the bughunter JavaScript and of the basic functional responsibilities of the pages/components/collections.  See the README for more details.


=== Database ===
BughunterPage.js
The sisyphus schema can be found here.
    BughunterPage Class - Manages the DOM ready event, component initialization, and
                          retrieval of the views.json structure that is used by different
                          components.
Bases.js
    Design Pattern Base Classes - Contains the base classes for Page, Component, Model, View etc...
 
BHViewComponent.js
    BHViewComponent Class - Encapsulates the behavior of a single data view using a model/view and 
                            provides a public interface for data view functionality.  Manages
                            event binding and registration.
    BHViewView Class - Encapsulates all DOM interaction required by a data view.
    BHViewModel Class - Encapsulates asynchronous server communication and data structure
                        manipulation/retrieval.


= Non-Goals =
BHViewCollection.js
    BHViewCollection Class - Manages operations on a collection of data views using a model/view
                            including instantiating view collections. 
                         
    BHViewCollectionView Class - Encapsulates all DOM interaction required by the collection.
    BHViewCollectionModel Class - Provides an interface to the datastructures holding all data
                                  views and their associated parent/child relationships.


''Anything of note that is specifically not going to be accomplished in this project. The "what not".''
DataAdapterCollection.js
    DataAdapterCollection Class - Collection of BHViewAdapter class instances.
    BHViewAdapter Class - Base class for all BHViewAdapters.  Manages shared view
                          idiosyncratic behavior like what fields go in the
                          control panel and how to populate/retrieve them for
                          signaling behavior.
    CrashesAdapter Class - Derived class of BHViewAdapter.  Encapsulates unique
                          behavior for crash data views.
    UrlAdapter Class - Derived class of BHViewAdapter. Encapsulates unique behavior
                      for views containing URL summaries.


= Design and Approach =
ConnectionsComponent.js
    ConnectionsComponent Class - Provides a public interface for opening new views via events.
    ConnectionsView Class - Encapsulates all DOM interactions required by the
                            Open New View modal window.


''High-level design ideas and concepts. The "how" in a general sense.''
VisualizationCollection.js
    VisualizationCollection Class - Holds a collection of classes that can
                                    represent data views graphically.
    Visualization Class - Base class for managing shared functionality between
                          data view graphics rendering classes.
    PlatformTree Class - Derived class of Visualization.  Renders tabular
                        data for Crashes, Assertions, and Valgrinds as a
                        circular tree.
 
=== Database ===
The sisyphus schema can be found at [[Media:Sisyphus_schema.pdf]].  This PDF is not up to date but is useful for getting an idea of what data is available.


= Implementation =
= Implementation =
== Client ==
The following JavaScript packages were used as core infrastructure pieces in the bughunter client architecture.
* [http://jquery.com/ jQuery] - For DOM interactions
* [http://moo4q.com/ moo4q] - For OOP in jQuery.  All bughunter classes are built using this strategy.
* [http://datatables.net/ datatables.js] - This jQuery plugin was used for all tabular display of data.  It's pretty awesome.
* [http://documentcloud.github.com/underscore/ underscore.js] - This javascript module was used for some algorithms/datastructures and maintaining function context in event binding... among other things.
* [http://thejit.org/ jit] - This data visualization javascript module was used for the Platform Tree representation.  It absolutely rocks for representing hierarchical/graph type data.
* [http://people.mozilla.com/~mcote/bughunter/BughunterFunctionalSpecification.pdf UI specification] - This was the original functional spec that was developed at the beginning of this project.  It's mildly entertaining to see how it deviates from the final product.


''Technical notes, plans, and designs detailing how the project will be realized. The specifics of "how".''
== Webservice ==
* [http://nginx.org/ nginx] - Used as the web server.
* [http://www.fastcgi.com fastcgi] - Used for running django.
* [http://www.djangoproject.com/ django] - Used as the web framework.
* [https://github.com/jeads/datasource datasource] - Used for encapsulation and dynamic generation of SQL with MySQL.


* [http://people.mozilla.com/~mcote/bughunter/BughunterFunctionalSpecification.pdf UI specification], by jeads
== Database ==
* MySQL