
The content of this page is a work in progress intended for review.

Please help improve the draft!

Ask questions or make suggestions in the discussion
or add your suggestions directly to this page.


Related Quarterly Goals

  • Metrics Q2 goals (for background):
    • Replace NFS in production
    • Have cluster doing background processing of 100% of crash reports
    • Provide replacement for Postgres big table
    • [stretch] Developer API, likely to slide to Q3
  • Client team goal: gather more information from crashes (bug 528657)


1.7: HBase, part I

  • Get individual crash reports into HBase
  • Begin rewriting the Python middleware to support UI -> HBase (transparent to the UI at this stage)
  • OOPP hang reports supported
  • End of NFS

1.8: HBase, part II

  • Daemonize processor/MDSW and run on HBase worker nodes (architecture diagram coming)

1.9: Middleware API

  • Create an API to HBase/SOLR to replace most PostgreSQL queries in the webapp

2.0: new UI

  • Rewrite the web UI to use the new middleware
  • (stretch goal, may slip to 2.1) Implement general-purpose full-text search. It should be possible to search on any data associated with a crash, e.g. any part of the stack trace and/or module list, or any combination of field values

2.01: Cleanup

  • Post 2.0, do a cleanup release to take care of housekeeping
  • Perform a team survey of the unit testing landscape
    • Define unit testing needs
      • Hadoop
      • Python
      • PHP
    • Define integration testing needs
    • Define acceptance testing needs
  • Define unit testing strategy
    • Assign people to champion each area of testing
  • Perform a team survey of the documentation landscape
  • Better app monitors / business level monitoring
  • Subversion

2.(x+1): Trend reports, part 1: explosive bugs

  • Explosive Bugs Analysis
    • Automated detection of explosive bugs
    • First stage is bug 519423
    • PRD is needed here [chofmann/laura]
    • UX is needed here
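
As a starting point for the PRD discussion, automated detection of an explosive bug could be sketched as a spike test on per-signature daily crash counts. Everything here is an assumption for illustration: the function names, the input shape (a list of daily counts, oldest first), and the thresholds are hypothetical and are not taken from bug 519423.

```python
from statistics import mean, pstdev


def is_explosive(daily_counts, threshold_sigmas=3.0, min_count=50):
    """Return True if the latest day's count spikes above the prior baseline.

    daily_counts: crash counts for one signature, oldest first (assumed shape).
    threshold_sigmas, min_count: hypothetical tuning knobs.
    """
    if len(daily_counts) < 2:
        return False  # no history to compare against
    *history, latest = daily_counts
    # An absolute floor keeps rare signatures from triggering on noise.
    if latest < min_count:
        return False
    baseline = mean(history)
    spread = pstdev(history)
    return latest > baseline + threshold_sigmas * spread


def find_explosive(counts_by_signature, **kwargs):
    """Return the sorted list of signatures whose latest count is a spike."""
    return sorted(
        sig for sig, counts in counts_by_signature.items()
        if is_explosive(counts, **kwargs)
    )
```

A real detector would probably also need to normalize by the number of active users per build, which this sketch ignores.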

2.(x+2): Trend reports, part 2: better correlations

  • Other cloud based correlation reports:
    • Between one report and other related reports: what are the logical correlations? (PRD needed)
    • Correlation between any single piece of data and another (e.g. plugins, time, etc.)
      • Replace current correlations HACK with cloud version bug 554373
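
One way to picture a correlation between a single piece of data and a crash signature is to compare how often a value (here, a loaded module) appears in crashes with that signature against how often it appears in all crashes. This is only a sketch under assumed inputs: the function name and the `(signature, modules)` tuple shape are hypothetical, and it does not describe the existing correlations code or the planned cloud version.

```python
from collections import Counter


def module_correlations(crashes, signature):
    """Compare module frequency within one signature against all crashes.

    crashes: iterable of (signature, modules) pairs (assumed shape).
    Returns (module, rate_in_signature, rate_overall) rows, sorted so the
    modules most over-represented in the signature come first.
    """
    total = 0
    in_signature = 0
    overall = Counter()
    for_signature = Counter()
    for sig, modules in crashes:
        unique = set(modules)  # count each module once per crash
        total += 1
        overall.update(unique)
        if sig == signature:
            in_signature += 1
            for_signature.update(unique)
    if in_signature == 0:
        return []
    rows = [
        (module, count / in_signature, overall[module] / total)
        for module, count in for_signature.items()
    ]
    # Largest difference first; module name breaks ties for stable output.
    rows.sort(key=lambda row: (-(row[1] - row[2]), row[0]))
    return rows
```

The same comparison works for any field (plugin, OS version, hour of day) by swapping what goes into the per-crash set.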


  • Draft goal: smarter analysis


These improvements will be made over the course of Q2, likely continuing into Q3.

Better release process

Testing and QA

  • Better code review practices: commits to mailing list
  • Add QA to release cycle
  • See Test Plan for UI testing
  • More unit tests, more integration tests
  • Validate data sources against each other (e.g. bug 552539, bug 553144); also look back at similar fixed bugs for test cases
  • Run tests automatically on check-in (Hudson?)
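
Validating data sources against each other could start as a simple comparison of per-key counts from two sources, for example crash counts per signature as reported by two different stores. The sketch below assumes each source has already been reduced to a dictionary of counts; the function name and the `tolerance` parameter are hypothetical.

```python
def find_mismatches(counts_a, counts_b, tolerance=0):
    """Return {key: (count_a, count_b)} for keys whose counts disagree.

    counts_a, counts_b: dicts mapping a key (e.g. signature) to a count.
    tolerance: largest absolute difference still treated as agreement.
    A key missing from one source is treated as a count of 0 there.
    """
    mismatches = {}
    for key in sorted(set(counts_a) | set(counts_b)):
        a = counts_a.get(key, 0)
        b = counts_b.get(key, 0)
        if abs(a - b) > tolerance:
            mismatches[key] = (a, b)
    return mismatches
```

A check like this can run as an automated test, failing the build whenever the returned dictionary is not empty.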


  • Write scripts for app-level monitoring for IT to hook up to Nagios
  • Implement "business logic" monitors: check things like hourly volume via webapp, db, etc.
  • Expand application health [dashboard]
    • Some existing bugs on this. What granularity? What is "normal"?
  • [deinspanjer] HBase monitoring to be expanded
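
A "business logic" monitor such as an hourly volume check could be sketched as below. The return codes follow the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); the function name, the thresholds, and the way the count is obtained are assumptions, since what counts as "normal" is still an open question.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_hourly_volume(count, warn_below, crit_below):
    """Map the last hour's crash report count to a Nagios status.

    count: reports received in the last hour, or None if unavailable.
    warn_below, crit_below: hypothetical thresholds (crit_below <= warn_below).
    Returns (exit_code, message).
    """
    if count is None:
        return UNKNOWN, "UNKNOWN - no volume data available"
    if count < crit_below:
        return CRITICAL, "CRITICAL - %d reports in the last hour (below %d)" % (count, crit_below)
    if count < warn_below:
        return WARNING, "WARNING - %d reports in the last hour (below %d)" % (count, warn_below)
    return OK, "OK - %d reports in the last hour" % count
```

A wrapper script would fetch the count from the webapp or database, print the message, and pass the code to `sys.exit` so Nagios can pick up the status.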


  • Staging closer to production/more realistic
  • Perf/load test before deployment
  • Better access to staging for testing
    • Best:
      • database write access
      • ability to run scripts
    • Acceptable:
      • log viewing
      • database browsing
      • view config files
      • view automated test output (Hudson?)
    • Install/write some admin tools to accomplish this (may also be useful in production)