CrashKill/Correlations

From MozillaWiki

About correlations and their importance in looking at crash data

Intro

The basic set of data in every crash report is listed below. Looking at how often any one of these items appears in a sample of crash reports, and at the pattern or context in which it appears, helps to diagnose the problem and to rule out things that are not important for reproducing the crash.

Crash Data

  crash signature
  url  --- due to privacy concerns, we treat the specific site the user was visiting as confidential
  uuid_url  -- link to the crash report details
  client_crash_date  --    date_processed
       correlation of date to sets of crashes is interesting to us for events like
       https://bug519039.bugzilla.mozilla.org/attachment.cgi?id=403459
       and https://bugzilla.mozilla.org/show_bug.cgi?id=519039
  last_crash
       helps us understand a bit about reliability for each user
  product
  version
  build
  branch
        correlating these helps us understand which releases have the problem
        and which don't, given enough testing use
  os_name
  os_version
  cpu_name
          ditto for correlating which kinds of systems and OSes might come into play
  address
          crashing address helps to correlate some of the signatures and stack traces
  bug_list
           ties crash signatures to the bugs where we track the problem
  user_comment
           all kinds of user feedback we get when users submit crashes

Then we have the list of modules that were loaded into memory, and the stack trace. (The stack traces could be analyzed like signals to better correlate crashy code.)
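
The idea of counting how often each value of a field appears in a sample of crash reports can be sketched in a few lines of Python. This is only an illustration: the `field_frequencies` helper, the dictionary layout, and the sample values are hypothetical and are not taken from the crash reporting system itself.

```python
from collections import Counter

def field_frequencies(reports, field):
    """Return {value: (count, share)} for one field across a sample of reports."""
    if not reports:
        return {}
    # Reports that lack the field are grouped under a placeholder value.
    counts = Counter(report.get(field, "(missing)") for report in reports)
    total = len(reports)
    return {value: (n, n / total) for value, n in counts.items()}

# Hypothetical sample; real reports carry many more fields.
sample = [
    {"signature": "js_GetGCThingTraceKind", "os_name": "Mac OS X", "version": "3.5.5"},
    {"signature": "js_GetGCThingTraceKind", "os_name": "Mac OS X", "version": "3.5.5"},
    {"signature": "js_GetGCThingTraceKind", "os_name": "Windows NT", "version": "3.5.4"},
    {"signature": "js_GetGCThingTraceKind", "os_name": "Mac OS X", "version": "3.5.5"},
]

for value, (count, share) in field_frequencies(sample, "os_name").items():
    print(f"{value}: {count}/{len(sample)} ({share:.0%})")
```

A value that dominates one signature's sample but not the overall crash population is a candidate for closer inspection; a value that is evenly spread can usually be ruled out.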

Modules, Addon, and Plugin Correlations

dbaron's amazing work has produced these correlation reports:

http://people.mozilla.com/crash_analysis/

To use it, pick a date:

http://people.mozilla.com/crash_analysis/20091119/

Then open one of the files, depending on which kind of correlation you are looking for and which version of Firefox you are looking at.

For bug https://bugzilla.mozilla.org/show_bug.cgi?id=42771 we don't know what to suspect, so we might have to look at several files. The thing we will search for is the stack signature.

Open the 20091119_Firefox-interesting-addons.txt file and search for js_GetGCThingTraceKind to see:

 js_GetGCThingTraceKind|EXC_ARITHMETIC / EXC_I386_DIV (13 crashes)
     8% (1/13) vs.   0% (9/17895) {336dc353-5272-420c-84e7-ba1f3c9c2aeb}
     8% (1/13) vs.   0% (36/17895) lazarus@interclue.com (Lazarus: Form Recovery, https://addons.mozilla.org/addon/6984)

...

Then see that there are no good addon correlations: the addons shown each appear in only 1 of the 13 crashes for this signature.

The left side is how often this addon appears in crash reports with the given stack signature.

The right side is how often this addon appears in crash reports across all signatures, for the product and time span of the report.

So in this case, Lazarus: Form Recovery was found running in 1 of the 13 crash reports for the js_GetGCThingTraceKind signature, and in 36 of the 17,895 total crash reports for all signatures. The same reading applies to extension id 336dc353-5272-420c-84e7-ba1f3c9c2aeb (1 of 13 versus 9 of 17,895). Often a Google search can identify which product an id corresponds to; we should have an online addon source search to help with this too.
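
To make the two sides concrete, here is a minimal Python sketch that parses one entry line and recomputes both shares from the raw counts (the percentages printed in the report are rounded). The regular expression assumes entry lines are laid out exactly as in the excerpt above; `parse_entry` is a hypothetical helper, not part of the report tooling.

```python
import re

# Layout assumed from the excerpt:
#   "<pct>% (<n>/<signature total>) vs. <pct>% (<n>/<overall total>) <item>"
ENTRY_RE = re.compile(
    r"^\s*\d+%\s+\((\d+)/(\d+)\)\s+vs\.\s+\d+%\s+\((\d+)/(\d+)\)\s+(.*)$"
)

def parse_entry(line):
    """Parse one correlation entry; return None if the line does not match."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    in_sig, sig_total, overall, all_total = (int(g) for g in m.groups()[:4])
    return {
        "item": m.group(5).strip(),
        "in_signature": (in_sig, sig_total),   # left side
        "overall": (overall, all_total),       # right side
        "signature_share": in_sig / sig_total if sig_total else 0.0,
        "overall_share": overall / all_total if all_total else 0.0,
    }

line = ("     8% (1/13) vs.   0% (36/17895) lazarus@interclue.com "
        "(Lazarus: Form Recovery, https://addons.mozilla.org/addon/6984)")
entry = parse_entry(line)
print(f"{entry['signature_share']:.1%} vs. {entry['overall_share']:.1%}")
# prints "7.7% vs. 0.2%"
```

The left-hand share looks much larger than the right-hand one, but it rests on a single crash report, which is why this entry is not treated as a good correlation.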

Similarly, you can then do the same checks for plugins, modules, and number of cores in the other reports in that directory, and you can check to see if correlations are the same or different for various product releases.
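
Since the same search is repeated for each report file, a small helper can pull out the block for one signature. The sketch below (Python) assumes every report uses the layout of the addon excerpt above: a header line containing the signature, followed by more deeply indented entry lines. That layout is an assumption for the other report types, and `find_signature_block` is a hypothetical name.

```python
def find_signature_block(report_text, signature):
    """Return (header, entries) for the first block whose header contains
    `signature`, or None if no such block is found."""
    lines = report_text.splitlines()
    for i, line in enumerate(lines):
        # Assumption: header lines never contain "vs.", entry lines always do.
        if signature in line and "vs." not in line:
            header_indent = len(line) - len(line.lstrip())
            entries = []
            for follow in lines[i + 1:]:
                indent = len(follow) - len(follow.lstrip())
                # A blank line or a less-indented line ends the block.
                if not follow.strip() or indent <= header_indent:
                    break
                entries.append(follow.strip())
            return line.strip(), entries
    return None

# First block copied from the excerpt; the second block is made up purely
# to show where the first one ends.
sample_text = """\
 js_GetGCThingTraceKind|EXC_ARITHMETIC / EXC_I386_DIV (13 crashes)
     8% (1/13) vs.   0% (9/17895) {336dc353-5272-420c-84e7-ba1f3c9c2aeb}
     8% (1/13) vs.   0% (36/17895) lazarus@interclue.com (Lazarus: Form Recovery, https://addons.mozilla.org/addon/6984)

 hypothetical_other_signature (2 crashes)
    50% (1/2) vs.   0% (9/17895) {336dc353-5272-420c-84e7-ba1f3c9c2aeb}
"""

header, entries = find_signature_block(sample_text, "js_GetGCThingTraceKind")
print(header)
for e in entries:
    print("  ", e)
```

Running the same helper over the text of each report for different releases gives blocks that can be compared side by side.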