Socorro:PRD Interviews

From MozillaWiki

dbaron

  • Use cases
    • Top crashers -> details of particular crash
    • Admin UI to add versions
    • Constrain search by signature
    • CSV files
    • Rate of increase of a crash? (What URIs?)
    • Top URLs for a given crash
    • View user comments for a particular crash
  • Wishlist
    • Constrain top crash lists
    • When did a crash start? (search on both time and buildid)
    • Ad hoc queries
    • What percentage of crashes are caused by Flash? By extension X?
    • Faceted search
    • Map/Reduce
    • Explosive crashes - post to dev.tree-management (notification)
    • More correlations
    • Data in the minidump that is not available in the UI (some for privacy reasons)
    • Stackwalking code could generate better stacks if it had copies of the DLLs? (ask ted)
      • Better symbol coverage

damon

  • Use cases
    • Look at top crashers for new bugs (ones without bugzilla ids), fast rising crashes
  • Feedback/wishlist
    • Front page should always show current shipped version
    • Sudden crash patterns - email dev.tree-management on a sudden spike in a shipped release
    • Subscribe to explosive bugs (RSS feed?)
    • Definition of explosive/critical bugs:
      • (initial) growth of more than 25 positions in the ranking
      • upwards change in rank and no related bugzilla id
      • time since startup < 1 minute
      • highlight these crashes in red or something
    • View all crashes - needs filter by hang/crash
    • Search:
      • Search by time since startup
      • Search for bugs with no bugzilla bug
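
The criteria damon lists for explosive/critical crashes could be checked mechanically. The following is a minimal sketch, not Socorro code: the SignatureStats record, its field names, and the choice to treat each criterion as an independent trigger are all assumptions made for illustration. Only the thresholds (25 positions, 1 minute) come from the notes above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

RANK_JUMP_THRESHOLD = 25       # "growth of more than 25 positions in the ranking"
STARTUP_WINDOW_SECONDS = 60    # "time since startup < 1 minute"


@dataclass
class SignatureStats:
    """Hypothetical per-signature summary; rank 1 is the top crasher."""
    signature: str
    previous_rank: int
    current_rank: int
    bug_ids: List[int] = field(default_factory=list)
    median_uptime_seconds: Optional[float] = None


def explosive_reasons(stats: SignatureStats) -> List[str]:
    """Return the criteria this signature meets (empty list if none)."""
    reasons = []
    # A smaller rank number is a higher position, so a rise is previous - current.
    positions_gained = stats.previous_rank - stats.current_rank
    if positions_gained > RANK_JUMP_THRESHOLD:
        reasons.append("rank_jump")
    if positions_gained > 0 and not stats.bug_ids:
        reasons.append("rising_without_bug")
    if (stats.median_uptime_seconds is not None
            and stats.median_uptime_seconds < STARTUP_WINDOW_SECONDS):
        reasons.append("startup_crash")
    return reasons
```

A non-empty result could drive both the red highlight in the UI and the notification or feed entry; whether one criterion is enough, or several should be required, is left open by the interview.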

jonas

  • Use cases
    • Top crashers: for each bug, what is the cause?
      • pull down minidump
      • open in debugger
      • classification
      • file bugs
    • Graphs (build time more useful than clock time)
      • Does a new version improve crashiness?
    • For each crash/bug, how are we doing?
      • Is the bug assigned?
      • Last commit date / last active bz date
      • Which component is it assigned to?
      • Which group? (third party/Mozilla)
    • Other scripts
      • Uses dbaron's correlation reports
      • Uses jst's script to pull down minidumps
  • Feedback/Wishlist
    • Find all crashes for a given signature where there is an email supplied
    • No correlation reports for startup crashes (no addon info for these) - need to google DLLs. Internal mapping of DLLs to addons would be useful.
    • Whiteboard notations in bz that socorro could pull in
    • Search all 3rdparty crashes / search mozilla only ("find all the bugs I can act on")
    • Group crashes by DLLs (one DLL may be responsible for several crash sigs)
    • Go from crashes to DLLs, DLLs to crashes, filter by DLL
    • Find new crashes easily
    • 3 categories of new crash
      • trunk/mozilla-central - broken checkins (low #s)
      • new dot release (early life crashes)
      • existing release starts crashing: typically a 3rd party problem, or sometimes a crash that jumps from low volume to high
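
The "crashes to DLLs, DLLs to crashes" navigation jonas asks for amounts to a pair of inverted indexes. A rough sketch, assuming each report is available as a (signature, loaded DLL names) pair; that input shape and the function name are hypothetical, not an existing Socorro interface.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_dll_indexes(
    reports: Iterable[Tuple[str, Iterable[str]]],
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Index crash reports both ways: DLL -> signatures and signature -> DLLs."""
    signatures_by_dll: Dict[str, Set[str]] = defaultdict(set)
    dlls_by_signature: Dict[str, Set[str]] = defaultdict(set)
    for signature, dlls in reports:
        for dll in dlls:
            name = dll.lower()  # Windows module names are case-insensitive
            signatures_by_dll[name].add(signature)
            dlls_by_signature[signature].add(name)
    return signatures_by_dll, dlls_by_signature
```

A DLL that merely appears in a report's module list is not necessarily responsible for the crash, so an index like this is a starting point for grouping and filtering rather than an attribution.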

chofmann

  • Use cases
    • release.next
      • look at top 300 crash reports
      • look at sigs without bugs
      • what do we need to do to get a good bug on file?
    • CSV
      • Identify correlations where we don't have a correlation report e.g. OS where we might most easily reproduce
      • What OS
      • What other versions does this crash appear in
      • If pre-existing, has frequency changed? ("volume regressions")
    • dbaron's correlations to plugins and addons
    • sanitize URLs and add to bug (if on public website)
    • look at time since startup - not many URIs, URIs not useful
    • within 30s is a startup bug, but anything in first few minutes is interesting
  • Wishlist
    • breakdown by:
      • OS
      • Fx version
      • Time from startup
      • look at individual crash reports
      • what other reports are like this?
    • correlations:
      • integrate addons and plugins
      • cron job to grab interesting data
    • search UI
      • search by each individual field in crash report - "like bugzilla search" - use a crash report template
    • search by DLL (also check out /CrashKill/dll-dictionary - can we crowdsource this?)
    • interested in map/reduce UI
    • DLL check, similar to plugincheck?
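
The breakdown chofmann describes (OS, Fx version, time from startup) can be sketched as a simple count over reports. The 30-second startup cutoff is from the notes above; the 5-minute bound for "first few minutes" and the dictionary keys are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

STARTUP_SECONDS = 30       # "within 30s is a startup bug"
EARLY_SECONDS = 5 * 60     # assumed reading of "first few minutes"


def uptime_bucket(seconds_since_startup: float) -> str:
    """Classify a crash by how long the application had been running."""
    if seconds_since_startup <= STARTUP_SECONDS:
        return "startup"
    if seconds_since_startup <= EARLY_SECONDS:
        return "early"
    return "later"


def breakdown(reports: Iterable[Dict]) -> Counter:
    """Count reports per (OS, version, uptime bucket).

    Each report is assumed to be a dict with hypothetical keys
    "os", "version" and "uptime" (seconds).
    """
    counts: Counter = Counter()
    for report in reports:
        key: Tuple[str, str, str] = (
            report["os"], report["version"], uptime_bucket(report["uptime"])
        )
        counts[key] += 1
    return counts
```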

jesse

  • Use cases
    • bugs with stack traces
    • search for specific sigs, trying to find
      • frequency
      • correlations
    • Uses dbaron's text files
    • Always uses advanced search. Always uses "contains" not "is exactly" - this should be the default.
  • Feedback/wishlist
    • List of reports (for one signature)
      • Data not presented well. Most columns have identical data. Display could be more compact
      • When many results are returned (more than one page) - does sort sort all results, or just those on the current page?
      • Everything seems to be 0 or 100%?
      • Broken in Safari
    • Sort crashes by component
    • Suggest ways to narrow search (lt: faceted search)
    • Restrict search by caller (next thing on stack, nth thing on stack)
    • Personal adhoc skiplists
    • Write own Map/Reduce jobs - share them, elevate them
    • Change search length
    • List of reports for a sig is not useful
      • Should summarize the reports
      • Should have stats: 90% OS X, correlations, etc, how common?
      • Need to click through to an individual crash to get anything useful
      • Sort by stack trace, sort by caller
      • Highlight unexpected correlations
    • Developer discussions/comments per crash signature - this would be more useful than bugzilla, sometimes crashes do not map to an actionable bug report
    • Subscribe to a feed of new crashes in my component
    • Add names of extensions (AMO integration)
    • Give feedback to users, how to avoid crashes (bug 411425, bug 336872)
    • Query for exploitable crashes - jumps to random memory addresses (lt: not sure of what we would be matching against here) - instruction pointer and what it tried to access - would require changes to client and MDSW
    • Wanted bugs
    • Faceted search
    • Crashes by caller
    • Sort by Flash version

jst

  • Uses
    • Runs a script to extract data
      • Find new crashes in 3.6 not seen before, put in flat file
      • sigs, # crashes per release
      • ignore crashes with no sig, ones that happen in plugins
      • configurable thresholds
      • runs this nightly off desktop in the office
    • Does not use web ui now, did during crashkill push
    • Investigate reported crashes
      • Correlation data is critical
      • All correlations: addon versions, plugins, DLLs
    • Top crashers -> correlations
    • Loves graphs of frequency of a crash by date/build
  • Feedback/wishlist
    • Collect DLLs - if something is in 95% of windows 7 installs, call it official
      • Frequency count these
      • Then in correlations, mark as "Windows 7 DLL" (also for other windows versions)
    • Wrote script to pull 5 minidumps per sig, eval stacks - what does the rest of the stack look like? (callers)
    • Let anybody write their own queries
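
jst's idea of frequency-counting DLLs and calling the near-universal ones "official" might look like the sketch below. It approximates "installs" by crash reports, since that is the data at hand, and assumes each report is an (OS name, loaded DLL names) pair; both are assumptions, and only the 95% threshold comes from the notes.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Set, Tuple

OFFICIAL_SHARE = 0.95  # "in 95% of windows 7 installs"


def official_dlls(
    reports: Iterable[Tuple[str, Iterable[str]]],
    threshold: float = OFFICIAL_SHARE,
) -> Dict[str, Set[str]]:
    """For each OS, return the DLLs present in at least `threshold` of its reports."""
    totals: Counter = Counter()
    seen: Dict[str, Counter] = defaultdict(Counter)
    for os_name, dlls in reports:
        totals[os_name] += 1
        # Count each DLL at most once per report, ignoring case.
        for dll in {d.lower() for d in dlls}:
            seen[os_name][dll] += 1
    return {
        os_name: {
            dll for dll, count in seen[os_name].items()
            if count / totals[os_name] >= threshold
        }
        for os_name in totals
    }
```

The resulting per-OS sets could then be used to label entries in a correlation report (for example as "Windows 7 DLL") so that they stop drawing attention away from third-party modules.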

dmandelin

  • Uses
    • Not to *find* top/new crashes, but to fix them
    • Interested in top JS crashes, can mostly id these by signature (JS in there somewhere)
    • chofmann would find and file bugs, dmandelin would look at crashes to fix
    • 99% of use is trying to figure out cause of crashes
      • First thing: correlations: "it's Flash"
    • Another use: look at various reports with same sig, look for patterns:
      • Do they all have the same address? If yes, this is a strong hint.
      • Operating systems?
      • Uptime?
    • Next: look at individual stack traces
      • what method?
      • what LOC?
    • Then:
      • download a raw dump or two
      • can be tricky: dumps can be plugged into Visual Studio, but a disassembler is really needed to make sense of it all
    • Next: crashes by build (graph)
      • What is the first build where this crash appeared? (or spiked in volume)
    • Combine and summarize all this data
      • Pre-graphs, ran python scripts to pull down dumps and summarize
    • A special case: "hard bugs"
      • Add a patch (debug code) to make build crash in a different way
    • Search - advanced
      • Based on sig + Fx version + 1-2 weeks back
  • Feedback/wishlist
    • Graph by build, all builds side by side
    • When did it start (build time), when did it change freq (build time)
    • Crash spikes must be reported relative to ADUs otherwise meaningless
    • Search results page
      • Group by /sort by (tag)
      • Categorize - yes, faceted search would solve this
      • Aggregation is key to understanding the data
    • Easy switch between views without having to go back to front page/dropdowns
    • Full text search
    • Run own queries (yes, would like access to Map/Reduce UI)
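
Two of dmandelin's points lend themselves to a short sketch: normalizing crash counts by ADUs, and finding the first build in which a crash appeared. The function names are hypothetical, and the second function assumes build IDs are fixed-width timestamp strings that sort chronologically as plain strings.

```python
from typing import Dict, Optional


def crashes_per_100_adu(crashes: int, adu: int) -> Optional[float]:
    """Crash count normalized by active daily users; None if ADU is unknown."""
    if adu <= 0:
        return None
    return 100.0 * crashes / adu


def first_build_with_crash(counts_by_build: Dict[str, int]) -> Optional[str]:
    """Earliest build ID with at least one report of the crash.

    Assumes build IDs are fixed-width timestamps, so string order
    matches chronological order.
    """
    builds = sorted(build for build, count in counts_by_build.items() if count > 0)
    return builds[0] if builds else None
```

As an illustration with made-up numbers: 500 crashes on a day with 1,000,000 ADUs is a higher rate than 600 crashes on a day with 2,000,000 ADUs, even though the raw count went up. That is why a spike measured on raw counts alone can be misleading.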

tomcat

  • Uses
    • File bugs, crashes for each build, STR (steps to reproduce)
    • Rarely uses search
    • Try to develop test cases for each crash
    • Correlation reports - what addons are involved?
  • Feedback/wishlist
    • Which plugins? What versions are they? Are those versions up to date? (Tie in to plugincheck)
    • Explosive crashes
    • Are there bugs for the top 25 crashes?
    • Expose comments field
    • Current way of showing trends is good
    • Emailing users (bug 411425) - should do this, but how to do it right (talk to beltzner)
    • DLLs are interesting but may be too hard

marcia

  • Uses
    • dbaron's correlations
    • investigate new crashes - top crashes, mostly on trunk
    • file bugs for new crashes
    • dig for problematic addons
      • try to drill down by addon
      • would love search by addon
    • look for crashes in JS stack
      • find correlated URIs - these are usually easy to repro
    • find by URI
    • find by component
    • Individual stacks
      • On one OS: crashes, URLs, try to repro
  • Feedback/wishlist
    • Search is hard to use with signatures
    • Advanced search should be for longer window
    • Identify plugin names and versions
    • Pattern matching in stack and signature search
    • Match by component

juan

  • Uses
    • New to Socorro, works on OOPP
  • Feedback
    • Homepage releases should be per-user (as with other prefs)
    • Current throttle percentage should be shown (wontfix)
    • Need to be able to distinguish between builds as we approach GA - show buildid
    • Top changers - what do the numbers mean?
    • Compare top crashers for two most recent builds
    • Need visibility of extensions and plugins in human readable format
    • Aggregate crashes per ADU should be prominently displayed since it's a KPI

wsmwk

  • Uses
    • STR
      • Stack
      • Extension
      • Contact reporter by email - comments usually not enough - response rate here is 10-20%. Sends between 1-15 emails per signature, but selective - useful comments, written in English.
    • JS: lots of TB problems - JS state info is sometimes missing (ask ted for more info?)
      • Time from startup is interesting
      • Probability of it being an extension?
    • File bugs on crashes without bugs
    • Why are there hardly any Mac crashes for TB?
    • With TB, little correlation between dev/beta crash rate and shipped crash rate
      • Different type of users, low volume
  • Feedback/wishlist
    • Correlations (and between releases)
    • Explosive crashes
    • Compare side by side stacks between two different crash types
    • Search - keyword + sig
      • should be contains by default
      • longer search range
      • longer "contains" in stack

ludovic

  • Uses
    • After a new release - what bugs have we missed?
    • New crashes without bugs
    • Crashes with increased volume
  • Feedback/wishlist
    • Remember TB cannot be throttled (too low volume)
    • When did crash first appear/spike?
    • What crashes are related to 3rdparty extensions/etc
    • Wants bug 411425