Socorro/Pre-PHX Smoketest Schedule

Revision as of 22:24, 10 January 2011

bug 619817

  • What we are going to test and how in terms of load
    • what:
      • at what point do collectors fall over?
        • start with 40 test nodes submitting 1k crashes each (40k total) against a single Socorro collector
          • assumption (and hope): this load will overwhelm one collector
          • scale back nodes/crashes until we reach a load the collector handles stably
          • check Ganglia to see where our bottlenecks are
      • crashes are collected without error
      • all submitted crashes are collected and processed
        • check the collector's Apache logs (syslog is not reliable)
        • check processor and collector logs for errors
        • confirm that all crashes are stored in HBase
    • how:
      • (struck out) grinder (bug 619815) + 20 VMs (bug 619814)
    • when:
      • waiting on deps in tracking bug 619811
      • tentative start date: Monday, Jan 10, 2011
        • minimum 2-3 days testing; as much as we can get
  • What component failure tests we will run
    • disable individual components to observe failure and recovery during the test
      • HBase
      • PostgreSQL
      • monitor
      • processor
      • others?
  • Outstanding issues
    • how many crashes to use?
      • pool of 500 random real-world crashes?
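
A minimal sketch of two steps from the plan above, under stated assumptions: backing off from the 40-node starting load, and checking that every submitted crash ended up in storage. The page does not say how load is reduced, so halving the node count each round is an assumption. The function names and the crash-ID lists are hypothetical and are not part of Socorro or the actual test harness; how the stored IDs are read back from HBase is left out.

```python
def back_off_schedule(nodes=40, crashes_per_node=1000):
    """Return (nodes, total_crashes) load levels, largest first.

    Assumption: each round halves the number of test nodes (integer
    division) and keeps crashes per node fixed. The plan only says to
    back down until a stable level is found.
    """
    schedule = []
    while nodes >= 1:
        schedule.append((nodes, nodes * crashes_per_node))
        nodes //= 2
    return schedule


def reconcile(submitted_ids, stored_ids):
    """Compare crash IDs sent by the test nodes with IDs found in storage.

    Both arguments are plain iterables of ID strings (hypothetical inputs).
    """
    submitted = set(submitted_ids)
    stored = set(stored_ids)
    return {
        "submitted": len(submitted),
        "stored": len(submitted & stored),
        # Crashes that were sent but never reached storage.
        "missing": sorted(submitted - stored),
        # IDs in storage that this test run did not send.
        "unexpected": sorted(stored - submitted),
    }


if __name__ == "__main__":
    for node_count, total in back_off_schedule():
        print(f"{node_count} nodes -> {total} crashes")

    # Made-up IDs, for illustration only.
    report = reconcile(["a", "b", "c"], ["a", "c", "d"])
    print(report)
```

A non-empty "missing" list would mean the "all submitted crashes are collected and processed" check failed at that load level; the collector and processor logs are then the place to look for the cause.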