CrashKill/Topcrash


The stability group keeps the list of topcrash bugs up to date manually, adding or removing the topcrash keyword on open (unresolved) bugs according to the criteria below and the Top Crashers lists from Mozilla Crash Stats.

Top crash identification criteria

  • Firefox:
    • Top 20 desktop browser crashes on the latest release (once it is over 10M ADI).
      • The 20-30 mark is where the numbers start to drop below 2000 crashes per week.
      • Also, in the past, many of the crashes in the 20-50 range have been repeats of other signatures in the top 50. It's not an exact science, but we think it's important to pick some bar.
      • Anything appearing in the 20-30 range that is marked as a start-up crash is also tagged as a top crash.
    • Top 20 desktop browser crashes on Betas
      • This should be pretty much the same as release. The discrepancies we see are mostly third-party issues, which are important to call out as blocking candidates.
    • Top 10 desktop browser crashes on Nightly, if they happen for enough different installations.
      • Judging this may take some experience and a feel for which issues are important.
    • Top 10 content process crashes on Beta and Release
    • Top 5 gpu process crashes on Beta and Release
    • Top 5 rdd process crashes on Beta and Release
    • Top 5 socket and utility process crashes on Beta and Release but only if they affect 5+ installations.
    • Top 5 desktop browser crashes on the Linux-, Mac-, and Win10-specific lists on Beta and Release
      • If there are fewer than 5 crashes per week on a signature, that bug probably still doesn't qualify. The same goes for crashes happening on only 2 or 3 installations.
      • If volume is very similar to the top 5, other bugs might still be included.
      • High-volume Win7 and Win8 crashes should also be considered if they affect a significant number of users.
    • Hangs of various kinds are not always actionable, so they probably shouldn't be flagged as top crashers unless their impact is significant.
  • Fenix:
    • Top 10 AArch64 and ARM crashes for Nightly, Beta and Release
    • Top 5 AMD64 and x86 crashes for Beta and Release
  • Thunderbird:
    • Top 25 for Release
    • Top 3 for Beta, Nightly
      • Focus only on the worst of the worst, explosive crashes, and regressions, because Beta and Nightly rankings don't correlate well with final releases (probably because the respective user populations and environments are significantly different). Assigning topcrash status on those channels therefore normally doesn't help significantly reduce release topcrashes.
  • Everything:
    • Bugs that spearhead investigation or fixes across a large collection of crashes
      • Judging this needs engineering expertise: if fixing a bug would clean out a number of crashes (with differing signatures) whose combined volume is similar to that of signatures matching other topcrash criteria, that bug itself qualifies for topcrash as well.
    • Crashes in actions that users rarely take, even if they fall somewhat outside the usual topcrash ranges
      • This needs a feel for the product and expertise as well. Examples include printing crashes in the top 50 on desktop release and similar cases.
  • Each release channel (i.e., Release, Beta, Nightly) should be considered separately. Combining crash reports from multiple channels (e.g., Beta and Release) might hide Beta-only top crashes.
  • Crash signatures that should not be automatically considered top crashers even if high volume:
    • All signatures starting with `EMPTY: ` or `OOM | large | EMPTY: ` (examples: `EMPTY: no crashing thread identified; HeaderMismatch` and `OOM | large | EMPTY: no crashing thread identified; StreamSizeMismatch`)
    • `OOM | small`
    • `IPCError-browser | ShutDownKill`
    • Signatures starting with `java.lang.OutOfMemoryError` on Android
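
As a rough illustration of how the per-channel rule and the signature exclusions above could be applied together, here is a minimal Python sketch. It is not a tool the stability group uses. The input format (a list of `(channel, signature)` pairs), the function names `is_excluded` and `topcrash_candidates`, and the `limits` mapping are all hypothetical. The sketch also assumes that excluded signatures still occupy their rank in the list and are dropped afterwards, which the criteria do not spell out. Judgment-based criteria (number of installations, start-up crashes, hangs) are not modelled.

```python
from collections import Counter, defaultdict

# Taken from the exclusion list above.
EXCLUDED_PREFIXES = ("EMPTY: ", "OOM | large | EMPTY: ")
EXCLUDED_EXACT = {"OOM | small", "IPCError-browser | ShutDownKill"}
ANDROID_EXCLUDED_PREFIXES = ("java.lang.OutOfMemoryError",)


def is_excluded(signature, android=False):
    """Return True if the signature should not be auto-flagged as topcrash."""
    if signature in EXCLUDED_EXACT or signature.startswith(EXCLUDED_PREFIXES):
        return True
    return android and signature.startswith(ANDROID_EXCLUDED_PREFIXES)


def topcrash_candidates(reports, limits, android=False):
    """Rank signatures separately per channel and keep the top N of each.

    reports: iterable of (channel, signature) pairs, one per crash report
             (hypothetical format).
    limits:  mapping of channel name to N, e.g. {"release": 20, "nightly": 10}.
             Channels missing from the mapping are skipped.
    """
    # Count per channel so that one channel's volume cannot hide another's.
    per_channel = defaultdict(Counter)
    for channel, signature in reports:
        per_channel[channel][signature] += 1

    candidates = {}
    for channel, counts in per_channel.items():
        if channel not in limits:
            continue
        # Assumption: take the top N by volume first, then drop exclusions.
        top = [sig for sig, _ in counts.most_common(limits[channel])]
        candidates[channel] = [s for s in top if not is_excluded(s, android)]
    return candidates
```

For Firefox desktop browser crashes, `limits` would be something like `{"release": 20, "beta": 20, "nightly": 10}`, mirroring the criteria above. The result is only a list of candidates for a human to review, not a final decision.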