Socorro/Pre-PHX Smoketest Schedule
Revision as of 18:20, 12 January 2011
- What we are going to test and how in terms of load
- what:
- at what point do collectors fall over?
- start with 40 test nodes at 1k crashes each, versus one socorro collector
- assuming/hoping this will overwhelm one collector
- back down nodes/crashes until we find a stable place
- check ganglia to see where our bottlenecks are
- crashes are collected without error
- all submitted crashes are collected and processed
- check apache logs for collector (syslog not reliable)
- check processor and collector logs for errors
- confirm that all crashes are stored in hbase
- how:
- grinder (bug 619815) + 20 VMs (bug 619814)
- Lars added stats and iteration to submitter.py for initial smoke-test bug 622311
- 40 seamicro nodes standing by to test, using socorro-loadtest.sh
- pool of 240k crashes, taken over 10 days from MPT prod (Jan 1st through 10th)
- when:
- waiting on deps in tracking bug 619811
- tentative start date - Monday Jan 10 2011
- minimum 2-3 days testing; as much as we can get
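The back-down procedure in the load plan above (start at 40 nodes with 1k crashes each, reduce until stable) could be sketched roughly as follows. This is only an illustration, not the actual test harness: `run_trial` is a hypothetical callback standing in for whatever drives submitter.py and gathers crash IDs from the collector side, "stable" is assumed to mean that no submitted crash goes missing, and halving the node count each round is an arbitrary choice.

```python
def missing_crashes(submitted_ids, collected_ids):
    """Return submitted crash IDs that never showed up on the collector side."""
    return set(submitted_ids) - set(collected_ids)


def find_stable_load(run_trial, nodes=40, crashes_per_node=1000, min_nodes=1):
    """Back down the node count until a trial loses no crashes.

    run_trial(nodes, crashes_per_node) is a hypothetical callback that runs
    one load test and returns (submitted_ids, collected_ids).
    Returns the first (largest) stable node count, or None if none was found.
    """
    while nodes >= min_nodes:
        submitted, collected = run_trial(nodes, crashes_per_node)
        if not missing_crashes(submitted, collected):
            return nodes
        nodes //= 2  # assumption: halve the load each round
    return None


def fake_trial(nodes, crashes_per_node, capacity=12000):
    """Stand-in trial: pretend the collector keeps at most `capacity` crashes."""
    submitted = [f"crash-{n}-{i}"
                 for n in range(nodes)
                 for i in range(crashes_per_node)]
    return submitted, submitted[:capacity]


if __name__ == "__main__":
    print(find_stable_load(fake_trial))
```

The `capacity` figure in `fake_trial` is made up so the sketch can run on its own. In a real run the collected IDs would come from the apache logs and hbase checks listed above.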
- what component failure tests we will run
- disable individual components to see test failure/recovery
- hbase
- postgresql
- monitor
- processor
- others?
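One possible shape for the component failure tests above is a loop that disables a component, checks that the failure is visible, re-enables it, and times the recovery. This is a sketch under assumptions: `disable`, `enable`, and `is_healthy` are hypothetical callbacks (the real mechanism for stopping hbase, postgresql, and so on is not described here), and the timeout and polling values are placeholders.

```python
import time

# Components named in the list above; "others?" is still open.
COMPONENTS = ["hbase", "postgresql", "monitor", "processor"]


def run_failure_test(component, disable, enable, is_healthy,
                     timeout=60.0, poll_interval=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Disable one component, re-enable it, and time the recovery.

    disable/enable/is_healthy are hypothetical callbacks supplied by the caller.
    Returns a dict with the component name, whether the outage was observed,
    and seconds to recover after re-enabling (None on timeout).
    """
    disable(component)
    try:
        failure_seen = not is_healthy()  # did the outage show up at all?
    finally:
        enable(component)  # always restore, even if the check raised

    start = clock()
    while clock() - start < timeout:
        if is_healthy():
            return {"component": component,
                    "failure_seen": failure_seen,
                    "recovery_s": clock() - start}
        sleep(poll_interval)
    return {"component": component,
            "failure_seen": failure_seen,
            "recovery_s": None}


if __name__ == "__main__":
    down = set()  # toy state: which components are currently disabled
    for name in COMPONENTS:
        result = run_failure_test(name, down.add, down.discard,
                                  lambda: not down, sleep=lambda s: None)
        print(result["component"], result["failure_seen"])
```

Running each component separately keeps a slow recovery in one from masking problems in the next.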