Breakpad/Status Meetings/2010-May-19


1.7

  • bugs
  • Where are we with regard to the freeze?

Other issues

  • Requirements interviews continue
    • Lots of small, pragmatic UI feedback
    • Post explosive bugs to newsgroup
    • Faceted search (drill down through correlations)
    • Map/Reduce UI
  • Laura blogged
  • Below is a very rough example of performing a MapReduce task in JavaScript. The example data and code are based on Test Pilot rather than Socorro, but they should still give a hint of what I'm aiming to deliver.
  • Mostly a question for Aravind: would you rather have a plain HTTP request, which you can map via Apache mod_rewrite, to retrieve jsonz data, or is a method inside our Pythonic middleware better? The HTTP call would look something like this, with a little munging of the crash report ooid needed to turn it into the right format for a rowkey:
 curl -vH 'Accept: application/octet-stream' http://server:8080/crash_reports/010051300376215-0dc1-496b-9a8a-cd9ef2100513/processed_data:json
  • CrashKill yesterday
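
The ooid munging mentioned in the question for Aravind could look roughly like the sketch below. The rowkey layout (first character of the ooid, then its last six characters, then the full ooid) is inferred from the sample URL in the curl command and is not confirmed anywhere in these notes. The function names and the sample ooid are hypothetical.

```javascript
// Hypothetical helper: build a rowkey from a crash report ooid.
// ASSUMPTION: layout inferred from the sample URL only:
// first character of the ooid + last six characters + the full ooid.
function ooidToRowkey(ooid) {
    if (typeof ooid !== "string" || ooid.length < 7) {
        throw new Error("unexpected ooid: " + ooid);
    }
    return ooid.charAt(0) + ooid.slice(-6) + ooid;
}

// Hypothetical helper: build the request URL for the processed JSON.
function processedJsonUrl(baseUrl, ooid) {
    return baseUrl + "/crash_reports/" + ooidToRowkey(ooid) + "/processed_data:json";
}

// Assumed ooid, chosen so that the result matches the URL in the curl example.
console.log(processedJsonUrl("http://server:8080", "00376215-0dc1-496b-9a8a-cd9ef2100513"));
// prints http://server:8080/crash_reports/010051300376215-0dc1-496b-9a8a-cd9ef2100513/processed_data:json
```

If the munging stays this small, it could live in either place: as a rewrite rule in front of the HTTP endpoint or as a one-line helper in the middleware.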

JavaScript MapReduce example

I took some sample experiment submissions that Jono provided in JSON format and imported them into a test instance of Riak. Each JSON document starts with a list of the extensions the user had installed:

{"extensions":["fc917be449a36b9926b27668619b7e7a782ffafc","3c1ee0043e63b5174ff5f0982b8b56c52da941c3","dd2365e874e1b48259716ffea01a49419cdf49ce","4425721a49121f727e681a8182734f6a33e8f509","ebc83bda10442ffde107d205cc9bb43e047d5177","2b6cdd66493118cc98740764731948539b420bd5","b8b4f2634d1ab7b46142703dd5cb3f0f400048be","ab51f63e4c2a774d3cb5eaf7d3af085eb13b891f","216ee7f7f4a5b8175374cd62150664efe2433a31","eea7b54ebe099dc58d87c57cf1fd7fdd5505ffc6","99eef587368411375fd49646f731257ef6d4d109"],"location":"en-US","version":"3.5.3","operatingSystem":"Darwin Intel Mac OS X 10.5","contents":[.... 

The MapReduce job below returns the min, max, and average number of extensions across all the experiment submissions. It could easily be tweaked to group those stats by operatingSystem, to provide quartiles, or to compute anything else we can think of. We could also have a job that does less math and just emits a CSV-style result, which could be slurped into Stata, R, or Excel for analysis there.

// Map phase 
function(value, keyData, arg){ 
    try { // some of the sample data I was given was not valid JSON hence the try/catch 
        // Parse the JSON string into an object named "data" 
        var data = Riak.mapValuesJson(value)[0]; 
        var stat = data.extensions.length; 
        return [{"count":1,"total":stat,"max":stat, "min":stat,"avg":stat}]; 
    } catch (e) { return []; } 
} 

// Reduce phase 
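function(values, arg){ 
    // reduce() without an initial value throws on an empty array, so bail out early 
    if (values.length === 0) { return []; } 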
    return [ 
        values.reduce( 
            function(acc, item){ 
                acc.total += item.total; 
                acc.count += item.count; 
                acc.max = (acc.max < item.max) ? item.max : acc.max; 
                acc.min = (acc.min > item.min) ? item.min : acc.min; 
                acc.avg = acc.total / acc.count; 
                return acc; 
            } 
        ) 
    ]; 
}
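
As a rough illustration of the "grouped by operatingSystem" tweak mentioned above, here is a sketch that runs under plain Node.js rather than inside Riak. It assumes JSON.parse as a stand-in for Riak.mapValuesJson(value)[0]. The function names mapByOs and reduceByOs and the sample submissions are made up for the example. It is not the job that was run against the Test Pilot data.

```javascript
// Sketch only: runs under Node.js, outside Riak. JSON.parse stands in for
// Riak.mapValuesJson(value)[0], and the sample submissions are made up.

// Map phase: emit one partial-stats record keyed by operatingSystem.
function mapByOs(value) {
    try {
        var data = JSON.parse(value);
        var stat = data.extensions.length;
        var partial = {};
        partial[data.operatingSystem] = {count: 1, total: stat, max: stat, min: stat, avg: stat};
        return [partial];
    } catch (e) { return []; } // skip submissions that are not usable JSON
}

// Reduce phase: merge partials per operating system. The output has the same
// shape as the input, so the function can be re-applied to its own results.
function reduceByOs(values) {
    var merged = {};
    values.forEach(function (partial) {
        Object.keys(partial).forEach(function (os) {
            var item = partial[os];
            var acc = merged[os];
            if (!acc) {
                // Copy so the input records are not mutated.
                merged[os] = {count: item.count, total: item.total,
                              max: item.max, min: item.min, avg: item.avg};
                return;
            }
            acc.count += item.count;
            acc.total += item.total;
            acc.max = Math.max(acc.max, item.max);
            acc.min = Math.min(acc.min, item.min);
            acc.avg = acc.total / acc.count;
        });
    });
    return [merged];
}

// Made-up sample data: three valid submissions and one invalid one.
var submissions = [
    '{"extensions":["a","b","c"],"operatingSystem":"Darwin"}',
    '{"extensions":["a"],"operatingSystem":"Darwin"}',
    '{"extensions":["a","b"],"operatingSystem":"WINNT"}',
    'not valid JSON'
];
var mapped = submissions.reduce(function (out, v) { return out.concat(mapByOs(v)); }, []);
var byOs = reduceByOs(mapped)[0];
console.log(JSON.stringify(byOs));
```

Keeping the reduce output in the same shape as the map output matters because a reduce phase may be applied again to the results of an earlier reduce pass.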