Socorro/Hang Processing Proposal
The current system of processing plugin hang reports in crash-stats is not producing especially useful data, and with the introduction of Flash protected mode it is almost useless. This is a proposal by bsmedberg to radically change how hangs are processed in Socorro/crash-stats.
Current Procedure
The current procedure when a plugin hang occurs involves submitting two linked reports:
- One report with processtype="" (browser) and hangid="UUID"
- One report with processtype="plugin" and hangid="UUID"
These reports are submitted separately and are linked only by the client-generated hang UUID. They are processed separately, have separate signatures, and are cross-linked only minimally. Correlating things across both reports cannot be done directly in the SQL database and typically requires external processing.
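As an illustration of the external processing this requires, here is a minimal sketch that pairs browser and plugin reports by their shared hangid. It is not Socorro code: the function name and the "uuid" key are hypothetical, and reports are assumed to be plain dictionaries carrying the processtype and hangid fields described above.

```python
from collections import defaultdict


def pair_hang_reports(reports):
    """Group reports by hangid; return (pairs, unpaired_hang_ids).

    Assumption: each report is a dict with "processtype" and "hangid" keys.
    """
    by_hang = defaultdict(dict)
    for report in reports:
        hang_id = report.get("hangid")
        if not hang_id:
            continue  # not part of a hang pair
        # processtype "" is the browser side, "plugin" is the plugin side
        side = "plugin" if report.get("processtype") == "plugin" else "browser"
        by_hang[hang_id][side] = report

    pairs = {}
    unpaired = []
    for hang_id, sides in by_hang.items():
        if "browser" in sides and "plugin" in sides:
            pairs[hang_id] = (sides["browser"], sides["plugin"])
        else:
            unpaired.append(hang_id)  # only one half was submitted
    return pairs, unpaired
```

Any hang for which only one half arrived ends up in the unpaired list, which is one reason the two-report scheme is awkward to analyze.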
Submitting All Minidumps In One Report
Instead of submitting separate reports, I would like to submit all of the information in one hang report. This may include two or more minidumps as well as a single metadata blob.
- Each hang report will generate a single signature (exact algorithm TBD, but probably focusing on the plugin-side stack at first).
- Each hang report will generate a single report ID in both SQL and hbase.
- Each of the minidumps will be stored in the same hbase row in a separate column:
  browser_dump:dump
  plugin_dump:dump
  flash_dump1:dump (optional)
  flash_dump2:dump (optional)
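The single-row layout could be modeled roughly as follows. This is a sketch only: it uses a plain dictionary in place of a real hbase client, the function and key names are hypothetical, and treating the browser and plugin dumps as required is an assumption drawn from the fact that only the Flash dumps are marked optional.

```python
# Column names taken from the proposal; the required/optional split is an assumption.
REQUIRED_COLUMNS = ("browser_dump:dump", "plugin_dump:dump")
OPTIONAL_COLUMNS = ("flash_dump1:dump", "flash_dump2:dump")
ALL_COLUMNS = REQUIRED_COLUMNS + OPTIONAL_COLUMNS


def build_hang_row(report_id, metadata, dumps):
    """Build one row for a hang report.

    dumps maps a dump name (e.g. "browser_dump") to its raw minidump bytes.
    """
    known_names = {column.split(":")[0] for column in ALL_COLUMNS}
    unknown = set(dumps) - known_names
    if unknown:
        raise ValueError(f"unknown minidump names: {sorted(unknown)}")

    row = {"report_id": report_id, "metadata": metadata, "columns": {}}
    for column in ALL_COLUMNS:
        name = column.split(":")[0]
        if name in dumps:
            row["columns"][column] = dumps[name]
        elif column in REQUIRED_COLUMNS:
            raise ValueError(f"missing required minidump: {name}")
    return row
```

For example, a report carrying browser, plugin, and one Flash minidump yields a row with three populated columns under a single report ID.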
Migration
Since the existing data is generally poor, we shouldn't worry about it too much; it's mainly used for large aggregate counting.
- Add processor support for submitting and retrieving all minidumps in a single report.
- Stop processing separate browser-side hang reports
- (optional) Do hbase magic to migrate existing hang pairs for non-release builds into the single hbase row using the child report ID
- (optional) Remove existing plugin-browser reports
- Add Firefox support for submitting both the browser and plugin-container (p-c) minidumps in a single report.
- Add Firefox support for submitting minidumps for the Flash processes.
  NOTE: this means that most hang reports will end up with *four* minidumps needing to be processed and stored, instead of the current two.
- Add crash-stats frontend support for displaying multiple minidump stacks.
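The optional migration of existing hang pairs might look something like the sketch below. It assumes that "child report ID" means the ID of the plugin-process report, and it uses plain dictionaries with hypothetical "uuid" and "dump" keys rather than real hbase access.

```python
def merge_hang_pair(browser_report, plugin_report):
    """Combine an existing browser/plugin hang pair into one row.

    Assumption: the merged row is keyed by the plugin (child) report's ID.
    """
    if browser_report["hangid"] != plugin_report["hangid"]:
        raise ValueError("reports do not share a hangid")
    return {
        "report_id": plugin_report["uuid"],  # child report ID (assumed)
        "hangid": plugin_report["hangid"],
        "columns": {
            "browser_dump:dump": browser_report["dump"],
            "plugin_dump:dump": plugin_report["dump"],
        },
    }
```

Existing pairs have no Flash minidumps, so a migrated row carries only the two columns shown.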