CrashKill/Plan/Priorities

From MozillaWiki

This page covers priorities the CrashKill team forwards to the Socorro team.

"Published" Lists of Priorities

2013 and later

Priorities are now tracked in a dynamic etherpad and no longer on this page.

Q4/2012

The list is in priority order.

  • React to Firefox OS requirements if needed - currently nothing known, bugs to be filed as necessary
  • Support new hang report format
  • Correlation reports need to become better (bug 642325 - 23/ASSIGNED, bug 650904 - Future/ASSIGNED).

Stretch goals / future items to keep in view:

  • New report: Rank compare - bug 640237 (Future/ASSIGNED, HTML/DB done)
  • New report: Explosiveness - bug 629062 (Future/ASSIGNED, HTML/DB done)
  • Based on ES deployment, shrink the Search bugs list - possible high-value targets:
  • New report: Plugin crashes with Flash version - bug 640241 (Future/ASSIGNED, HTML done)
  • New report: Devices - bug 687115 (untargeted)
  • New report: Components - bug 697581 (untargeted)
  • DLL Directory (bug 577613 - codename "Dragnet", code complete, awaiting hardware in bug 669398)
  • More items from "UItweaks" list - bug 612897 (Future), others as possible, easier ones and ones that affect common workflows (as discussed on all-hands) first
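
As a rough illustration of what a rank-compare report (bug 640237) might compute, the sketch below compares the topcrasher rank of each signature between two releases. This is not taken from Socorro: the function name, the input shape (a mapping from signature to crash count), and the tie-breaking rule are all assumptions made for the example.

```python
def rank_compare(old_counts, new_counts):
    """Compare signature ranks between two releases (hypothetical helper).

    Each argument maps a crash signature to its crash count.
    Returns (signature, old_rank, new_rank) tuples sorted by new rank;
    a rank is None when the signature does not occur in that release.
    """
    def ranks(counts):
        # Highest count gets rank 1; ties are broken alphabetically.
        ordered = sorted(counts, key=lambda sig: (-counts[sig], sig))
        return {sig: pos for pos, sig in enumerate(ordered, start=1)}

    old_ranks, new_ranks = ranks(old_counts), ranks(new_counts)
    signatures = set(old_ranks) | set(new_ranks)
    rows = [(sig, old_ranks.get(sig), new_ranks.get(sig)) for sig in signatures]
    # Signatures missing from the new release sort last.
    return sorted(rows, key=lambda row: (row[2] is None, row[2] or 0, row[0]))
```

A row with an old rank of None points at a newly introduced crash, and a row with a new rank of None points at one that no longer occurs.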

Q3/2012

The list is in priority order.

  • React to Kilimanjaro requirements (most likely new products to be supported, like WebappRT and/or B2G) - bugs to be filed as necessary
  • Ramp up support for Rapid Beta, which itself is targeted for early Q3 - bugs to be filed as necessary, though bug 672606 (untargeted) might easily be the key
  • Correlation reports need to become better (bug 642325 - Future/ASSIGNED, bug 650904 - Future/ASSIGNED).
  • New report: Rank compare - bug 640237 (Future/ASSIGNED, HTML/DB done)
  • New report: Explosiveness - bug 629062 (Future/ASSIGNED, HTML/DB done)

Stretch goals / future items to keep in view:

  • Based on ES deployment, shrink the Search bugs list - possible high-value targets:
  • New report: Plugin crashes with Flash version - bug 640241 (16/ASSIGNED, HTML done)
  • New report: Devices - bug 687115 (untargeted)
  • New report: Components - bug 697581 (untargeted)
  • DLL Directory (bug 577613 - codename "Dragnet", code complete, awaiting hardware in bug 669398)
  • More items from "UItweaks" list - bug 612897 (Future), others as possible, easier ones and ones that affect common workflows (as discussed on all-hands) first

Q2/2012

The list is roughly in priority order.

  • React to Kilimanjaro requirements (most likely new products to be supported, like WebappRT and/or B2G) - bugs to be filed as necessary
  • Ramp up support for Rapid Beta, which itself is targeted for early Q3 - bugs to be filed as necessary, though bug 672606 (untargeted) might easily be the key
  • Integration of "external" Nightly crash trends report - bug 640238 (8/FIXED)
  • Add a URL list for people with permissions - bug 550538 (10/FIXED)
  • Correlation reports need to become better (bug 642325 - Future/ASSIGNED, bug 650904 - Future/ASSIGNED).
  • Based on ES deployment, shrink the Search bugs list - possible high-value targets:
  • Integration of "external" reports:
    • Rank compare - bug 640237 (Future/ASSIGNED, HTML/DB done)
    • Flash version - bug 640241 (16/ASSIGNED, HTML done)
  • Explosive crash detection (bug 629062 - Future/ASSIGNED, HTML/DB done) - implement the prototype algorithm in Socorro itself.
  • DLL Directory (bug 577613 - codename "Dragnet", code complete, awaiting hardware in bug 669398)
  • More items from "UItweaks" list - bug 612897 (Future), others as possible, easier ones and ones that affect common workflows (as discussed on all-hands) first

Q1/2012

Priorities are mainly to get the ones from previous quarters actually done.

  • Integration of "external" reports:
  • Correlation reports need to become better (bug 642325 - 6, bug 650904 - 6).
  • Explosive crash detection (bug 629062 - 5) - implement the prototype algorithm in Socorro itself.
  • Signature summary improvements
  • Get access for scripts to query the Socorro database (certain privs/hosts only) - bug 529946 (5)
  • DLL Directory (bug 577613 - codename "Dragnet", code complete, awaiting hardware in bug 669398)
  • More items from "UItweaks" list - bug 612897 (5), others as possible, easier ones and ones that affect common workflows (as discussed on all-hands) first

Q4/2011

For this quarter, the priorities are mainly to get the ones from previous quarters actually done.

  • bug 642336 (2.3.3/FIXED) - summarized overview for signature
    • should also provide entry point to getting a URL list - bug 550538 (untargeted)
  • Hang pairs (bug 637661 - 2.3.2/FIXED) - try to get this for non-throttled builds, match when match is available on throttled builds.
  • Integration of "external" reports: bug 640237 (rank compare) - 2.4, bug 640238 (nightly crash trends) - 2.4, bug 640241 (Flash version) - 2.4, bug 640242 (crashes-by-build list) - 2.4, bug 641487 (Per-OS top crashers) - untargeted
  • DLL Directory (bug 577613 - codename "Dragnet", code complete, awaiting hardware in bug 669398)
  • Correlation reports need to become better (bug 642325 - untargeted, bug 650904 - 2.4).
  • Content crashes (bug 578687 - 2.4/FIXED, some parts FIXED in 2.1)
    • We need "content" as a category in the topcrashers report - bug 688533 (2.3.2/FIXED)
    • The summarized overview from bug 642336 should include how many of those crashes belong to each process type - "content" being one of those.
  • Explosive crash detection (bug 629062 - 2.4) - implement the prototype algorithm in Socorro itself.
  • More items from "UItweaks" list - bug 626522 (2.3.2/FIXED), bug 612897 (untargeted), bug 554374 (untargeted), others as possible, easier ones and ones that affect common workflows (as discussed on all-hands) first

Q3/2011

  • Reports that are based on channel+build_id instead of only version - bug 540687 (2.2/FIXED, topcrashers), bug 657400 (2.2/FIXED, crashes per user)
  • bug 642336 (2.3) - summarized overview for signature
    • added in August: should also provide entry point to getting a URL list - bug 550538 (untargeted)
  • Correlation reports need to become better (bug 642325 - untargeted, bug 650904 - 2.3).
  • Content crashes (bug 578687 - 2.3, some parts FIXED in 2.1)
    • Fennec already sends a significant portion of its crashes with a "content" process type. Not shown in "Browser" default view of topcrashers.
    • We need "content" as a category in the topcrashers report.
    • The summarized overview from bug 642336 should include how many of those crashes belong to each process type - "content" being one of those.
    • Crashes per user don't include content crashes, which might be one reason the Fennec graphs make it look very stable.
  • Explosive crash detection (bug 629062 - 2.3) - implement the prototype algorithm in Socorro itself.
  • More items from "UItweaks" list - bug 626522 (3.0), bug 612897 (untargeted), bug 657509 (2.2/FIXED), bug 554374 (untargeted), others as possible, easier ones and ones that affect common workflows (as discussed on all-hands) first

New reports - Open from Q2 list:

  • hang pairs (bug 637661 - 2.3) - try to get this for non-throttled builds, match when match is available on throttled builds.
  • Integration of "external" reports: bug 640237 (rank compare) - 2.3, bug 640238 (nightly crash trends) - 2.3, bug 640241 (Flash version) - 2.3, bug 640242 (crashes-by-build list) - 2.3, bug 641487 (Per-OS top crashers) - 2.3
  • DLL Directory (bug 577613 - codename "Dragnet", WIP)

Q2/2011

Socorro (stating targeted version and resolution where appropriate):

CrashKill:

  • Content crashes (bug 578687 - 2.2) - what do we need there? (Socorro team might want to work on backend side dependencies already, though)
    • Fennec already sends a significant portion of its crashes with a "content" type. Not shown in "Browser" default view of topcrashers.
    • We need it at least as a category in the topcrashers report
    • We probably want some indicator in overviews of what crashes are content
    • Include in bug 642336-type summarized overview
  • Explosive crash detection (bug 629062 - 2.1) - figure out if current experimental algorithm gives us what we want before handing over to Socorro team for implementation.
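
This page does not describe the experimental explosiveness algorithm. Purely as an illustration of the kind of check such a report could perform, the sketch below flags a signature when its latest daily count is far above its recent average. The function name and both thresholds (`factor`, `min_count`) are arbitrary assumptions for the example, not the prototype tracked in bug 629062.

```python
from statistics import mean


def is_explosive(daily_counts, factor=3.0, min_count=10):
    """Return True if the last day's count spikes above the earlier average.

    daily_counts: crash counts per day for one signature, oldest first.
    factor and min_count are illustrative thresholds, not tuned values.
    """
    if len(daily_counts) < 2:
        return False  # No history to compare against.
    *history, latest = daily_counts
    baseline = mean(history)
    # Require an absolute minimum so rare signatures do not trigger on noise,
    # and floor the baseline at 1 so brand-new signatures can still be flagged.
    return latest >= min_count and latest > factor * max(baseline, 1.0)
```

A real implementation would probably also need to normalize by the number of active users per build, since raw counts rise whenever a release gains users.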

KaiRo's Thoughts on Priorities

General

Notes on what we, or I, should look into when doing priority planning. Things might emerge from here into more specific lists.

  • Keep an eye on "integrity" list - potential problems with the data we analyze sound worrisome.
  • "UItweaks" list - even if this looks long, most of those should be *very* fast to fix and even though small, they improve efficiency of working with Socorro.
  • New reports - we need to be careful to go step by step on those and not require the team to do all at once, but there are a few potentially high-impact ones. Integration of currently external reports should be on top of the list, as well as "explosiveness" reports. See CrashKill/Plan/ReportTriage/2011-03-17#P1 for a pre-Q2/2011 analysis of those.
  • Current and near-term-future challenges - e10s, AppData analysis, new development channels.
  • Search needs to become more flexible. Need to identify specific items that are most pressing for users, some come out of comments I gathered.
  • Configuration of skiplist etc. needs to get simpler (bug 528390).

Open Questions

Open questions from assembling Q3/2011 goals:

  • Search? (i.e. once ElasticSearch is in, what do we still need there?)
  • AppNotes (or custom fields) analysis? (Does ES alone give us enough power to search through it? Does the summarized overview need something from this? bug 641461, bug 641467 are related to this)
  • Freeform notes on signatures? (Would it be helpful if someone who can log into Socorro can leave a note on a signature, which can be read by other users?)

Long-term Investigation

These are things the CrashKill team needs to figure out and that might have an impact on priorities and goals.

  • How can we give feedback to users? Did PostCrash go far enough?

User Comments

  • wsmwk
    • most important:
      • bug 421119 function for socorro to compare stacks of two or more crash reports
      • bug 518823 indicate bug's status for bugzilla keyword topcrash
      • bug 578376 multiple crashes from a single person should have less weight than many crashes from different people
      • bug 411354 Add ability to search by build ID
    • also important:
      • bug 527304 provide smart analysis ala talkback
      • bug 512910 Make it easier to analyze crashes that share a signature
      • bug 528390 better workflow for updating skiplist
  • Smokey Ardisson
  • johnjbarton
    • Of course *all* of my crashes involve Firebug. The number one question I have when I visit crash-stats site is:
      • How many other users who have this crash are also running Firebug?
        • If the answer is 95%, then I better spend some time on it because no one else will. If the answer is 5%, I'm having lunch.
        • Regarding the point about "correlations" page, I've never found that info to be useful, sorry. I have looked at it, but I don't recall ever finding something useful. bug 642325
  • Jeff Muizelaar
    • It would be nice if it was possible to get more summary information about a crash. For example: What build ids does this crash all occur with? What operating system versions does this all occur with? etc. bug 642336
  • Josh Matthews
    • I, like Jeff, would appreciate summaries of the data available - most recent 10 unique build ids, list of unique OS versions, range of uptimes, etc. bug 642336
    • I would also be really interested in data about spikes - seeing a graph of the number of crashes for a particular signature over time would be useful to track trends. bug 640247, bug 629062
  • Honza Bambas
    • definitely voting for search also by other frames than just the top frame (the signature) bug 480503
    • (search by) regular expressions would also be useful, but not necessary (for me) bug 641483
    • when searching for regression ranges I sort the results by buildID; if there are several result pages, going to another page forgets the sort-by option. It would be great to show only the buildIDs in which the crash matching the query occurs; including the branch (e.g. tracemonkey) in the list would also be helpful
    • direct access to the repository revision would also be helpful; this is mostly a matter of changing the web UI, since the information is already there but takes more work to access
    • a link to the nightly build directory on the ftp server would also be good, but for me not necessary (I can always build from the given rev myself); if needed I look for a nightly by the .txt file that contains the buildID, which is really not convenient
  • dmandelin
    • When did this crash start happening, or start happening often? The graph view we have is very helpful, but it only shows 4 weeks at a time, which is not always enough. bug 640247 (bug 629054, bug 629062)
    • Search for crashes by crash address. Sometimes we add diagnostics that will make Firefox crash at certain addresses. We can't currently search for this directly in Socorro. There's a bug on file for this. bug 549443
    • Group crashes by various dimensions. Within a search, it would be nice to group the crashes by things such as "top 3 stack frames" (i.e., separate out crashes by the length-3 tail of the stack) or crash address. There's a bug on file for this one, too. bug ???
  • bjacob
    • bug 641461 Make it easy to add new crashreport fields, or confirm AppNotes in its role as universal custom field (maybe just a documentation issue)
    • bug 641467 Make 'advanced search' be able to search in any field of crash reports
    • bug 641482 'Advanced search' should allow using boolean charts
    • bug 641483 Allow regular expressions in 'advanced search'
    • bug 641484 Document/clarify what 'Crashes Per Active Daily User' means
    • bug 641487 Per-OS top crashers lists
    • figuring what the top crashers are, checking that they have developers assigned on them
  • Ehsan Akhgari
  • Brian Smith
    • I would like a link to the zip file of the build and the necessary symbol files (if any) that corresponds to the crash, next to the link to the minidump.
    • Also, if it is a Windows crash, then I would like the minidump to have the "dmp" extension automatically. bug 541936
    • Basically, anything that can be done to get from the crash to debugging the crash in fewer steps would be great.
  • Mark Banner
    • Crash-stats explaining itself a bit more. bug ???
    • comparison of rankings between releases - have new crashes been introduced, have crashes been fixed? bug 640237
    • for these releases show me the crashes per ADU for the first y weeks of each of those releases normalised to the release date (with pretty graph and table). bug ???
  • LegNeato
    • Longer term, I am concerned with electrolysis and how we will report on and find problems between components. bug 578687
  • juanb
    • There are several questions that come to mind when looking at the site.
      • What are the top browser crashes?
      • What are the top plugin crashes?
      • What are the top add-on crashes?
      • What are the trends for individual signatures over time?
      • What are the top crashers/number of users on a platform?
      • What are the top crashers per locale?
    • I keep wondering more things, but I don't write them down. I'll write a few more.
  • Delfin Rojas
    • Is there a way to upload my [add-on's] symbols to Mozilla so the crash reports include my source? bug 419879
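
Several of the comments above ask for a summarized overview per signature (bug 642336): the most recent unique build IDs, the unique OS versions, and the range of uptimes. A minimal sketch of such a summary follows. The report field names (`build_id`, `os_version`, `uptime`) and the function name are hypothetical, and the code assumes build IDs are timestamp-like strings so that sorting them as text puts the newest first.

```python
def summarize_signature(reports, max_builds=10):
    """Summarize crash reports that share one signature (hypothetical helper).

    reports: iterable of dicts with "build_id", "os_version" and "uptime" keys.
    Returns the newest unique build IDs, the unique OS versions, and the
    (min, max) uptime range, or None for the range when there are no reports.
    """
    reports = list(reports)
    # Timestamp-like build IDs sort chronologically as strings (assumption).
    build_ids = sorted({r["build_id"] for r in reports}, reverse=True)
    os_versions = sorted({r["os_version"] for r in reports})
    uptimes = [r["uptime"] for r in reports]
    return {
        "recent_build_ids": build_ids[:max_builds],
        "os_versions": os_versions,
        "uptime_range": (min(uptimes), max(uptimes)) if uptimes else None,
    }
```

The same grouping idea extends to other dimensions mentioned above, such as process type or crash address, by collecting one more set per field.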