Breakpad/Status Meetings/2010-Jan-27

From MozillaWiki

Agenda

1.4 Release

  • status update - open 1.4 bugs
  • testing today
  • deployment - Thursday 1/28
    • Do we require downtime?
    • Update SocorroUpgrade instructions. I'll be pulling IT request info from this wiki page.
    • griswolf to add daily url config to SocorroUpgrade

Out of process plugins

Metrics and HFS

  • Talk with Daniel about
    • status of VM
      • IT bug [1] filed to clone one of our machines to serve this purpose. It will not be able to handle much data storage/processing, but it will be useful for testing integration development.
    • status of stage
      • We have a small staging cluster that could be pointed at from Socorro staging today, and the larger cluster will be ready for staging testing by the end of the week.
    • status of prod
      • Once we have performed integration testing on the larger cluster, we'll wipe it and reassign it as the production cluster.
    • dark launch 1 collector instead
    • ozten and davedash will work on hadoop collector change
  • Any blockers, next steps?
    • IT request? for khan -> cm-hadoop01
      • I wouldn't have expected this to be necessary, but if Khan can't connect, IT should set up a route for connection to all the hadoop machines at least on port 9090.
    • We need C++ minidump stackwalk help badly. I e-mailed Ted and Morgamic, but we can put notes up here or elsewhere if useful.
    • Collector HFS integration suggestion - Add a throttling config for the percentage of crashes to write to HFS
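
A minimal sketch of the throttling suggestion above, assuming a single percentage setting that the collector checks per crash. The names `HFS_WRITE_PERCENT` and `should_write_to_hfs` are hypothetical and do not come from the Socorro code base; the real collector config may look quite different.

```python
import random

# Hypothetical config value: percentage (0-100) of incoming crashes
# that the collector would also write to HFS.
HFS_WRITE_PERCENT = 10


def should_write_to_hfs(percent, rng=random.random):
    """Return True for roughly `percent` percent of calls.

    `rng` is any callable returning a float in [0, 1); it is injectable
    so the decision can be tested deterministically.
    """
    if percent <= 0:
        return False
    if percent >= 100:
        return True
    return rng() * 100 < percent


if __name__ == "__main__":
    sampled = sum(should_write_to_hfs(HFS_WRITE_PERCENT) for _ in range(10000))
    print("crashes routed to HFS out of 10000:", sampled)
```

Setting the value to 100 would correspond to the "100% crashes in HFS" goal, while a low value would allow a gradual dark launch.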

Next Actions

  • Daniel to talk to Ted about Breakpad C++ help / performance
    • griswolf wants to work on C++
  • Daniel to talk to Aravind about getting Fx 3.6 crashes into HFS outside of the Socorro work to retain 100% 3.6 crashes

Metrics Q1 goals

  1. 100% crashes in HFS
  2. Socorro reading crashes from HFS

For 1.5, we will fast-track staging this code.

1.5 planning

  • triage with chofmann and damon TBD
  • UI/Hadoop big in this release
  • release date 2/23

CrashKill

  • no meeting 1/26
  • 526046 Generate better signatures for crashes related to unhandled Obj-C exceptions

Capacity for 3.6

  • Aravind, chofmann, and team need to meet ASAP
  • peak is 16k crashes per hour
  • Next 3.6 release will migrate 3.5 users, coming in 1 month or less
    • chofmann - do something in the next two weeks
  • between noon and 6pm we hit a backlog
  • Next step
    • Add more processors (why didn't we do this already?)
      • We've had up to 9 in the past
    • Add more hardware
      • DB server doesn't have max RAM; buy 16 GB more
    • throttle back further?

If disk becomes an issue

  • lars mentioned that client deferred storage is available in Firefox 3.5.4+ but hasn't been used yet


other stuff

  •  ?