Breakpad/Status Meetings/2011-Jan-26

Migration

  • Wow. Every person on this call (and some that are not) did a stellar, stellar job.
  • Follow up items?
    • We need a data ageing policy for the current HBase and PostgreSQL data.
    • Monitor performance needs radical improvement
    • Ongoing improvements in monitoring the system
    • Connection cleanup for PostgreSQL connections ... 1.7.7 or 1.7.8?
    • Discuss steps required for "full throttle" processing.
HBase config tuning
Visited StumbleUpon yesterday and spent a few hours going over our PHX cluster. Came away with a series of useful recommendations for changes to try in order to improve the cluster's performance and robustness. The big question for Socorro is how to test these changes. We can copy the data we have and load it onto the secondary cluster with the config changes applied, but some of the changes will only be measurable with new traffic flowing in. Could we fork collectors over to the other cluster, spin up some collectors on my spare boxes, or do something else? The cluster is currently running fine, but some of these changes are likely to be important for continued stability over the next couple of months.
Archived data removal
We want to delete the data we archived back in June, and to look at implementing deletion of raw dump data that has not been accessed in the last 6 months. When can we finalize a plan for these two things? Bonus: we can actually test the effects on the secondary cluster.
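To make the "not accessed in the last 6 months" rule concrete for planning, here is a minimal sketch of the selection step only. Everything in it is an assumption for illustration: the `select_expired` helper, the mapping of crash ID to last-access time, and the choice of 183 days as "6 months" are hypothetical, and it does not touch HBase or reflect how Socorro actually stores access times.

```python
from datetime import datetime, timedelta

# Assumption: "6 months" is approximated as 183 days.
RETENTION = timedelta(days=183)


def select_expired(last_access_by_id, now, retention=RETENTION):
    """Return the IDs whose last access is strictly older than the cutoff.

    last_access_by_id is a hypothetical dict of crash ID -> datetime of
    last access; the real source of this information is not specified here.
    """
    cutoff = now - retention
    return sorted(cid for cid, ts in last_access_by_id.items() if ts < cutoff)


if __name__ == "__main__":
    now = datetime(2011, 1, 26)
    sample = {
        "crash-a": datetime(2010, 6, 1),    # older than the cutoff
        "crash-b": datetime(2010, 12, 15),  # recently accessed, kept
        "crash-c": datetime(2010, 7, 20),   # older than the cutoff
    }
    for crash_id in select_expired(sample, now):
        print("eligible for deletion:", crash_id)
```

Keeping the selection separate from the actual delete would let us dry-run the list against the secondary cluster before removing anything.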

Staging

  • The following boxes can be repurposed for staging. How do we do it?
    • pm-app-collector01
    • pm-app-collector02
    • pm-app-collector03
    • pm-app-collector04
    • pm-app-collector05
    • pm-app-collector06
    • dm-breakpad-stage01
    • pm-collector-stage1 (VM, I vote to get rid of it)
    • dm-socorro-stage01 (could go to prod graphs-server)
    • cm-breakpad02
    • cm-breakpad03
    • cm-breakpad04
    • dm-bp-mware01
    • tm-breakpad01-master01
    • dm-breakpad-devdb
  • Metrics needs hardware. How soon can I reclaim some of the SJC production machines for other purposes? We will still earmark at least 10 to 15 machines for large scale Socorro staging.

1.7.7

  • Theme: ship UI improvements for Platform
  • bugs
  • Freeze: 2/17???
  • Ship: 2/24???
  • These dates can be a week later
  • Main features:
    • Explosive bug tracking (all)
    • Work on duplicate crashes (all)
    • Post migration tidying up (all)
    • Better safety for postcrashemail (rhelmer)

1.7.8

  • Just FYI
  • Ship by end of March
  • Theme: architectural improvements
  • Main features:
    • Monitor improvements (lars)
    • MDSW binary symbols (ted)
    • On crash emails (rhelmer)

Other issues