Buildbot/IT Unittest Support Document
Revision as of 20:17, 5 September 2007
Machines
master:
- qm-rhel02
slaves:
- qm-centos5-01
- qm-xserve01
- qm-winxp01
- qm-win2k3-01
list of steps to try
1. first check
- check waterfall at: http://qm-rhel02.mozilla.org:2005/ (mpt-vpn)
- see if slave is connected.
- if so, click the machine name link, try a "Force Build"
- fill out the name and reason fields, click the button
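The first check can be partly scripted. The following is a minimal Python sketch, not part of the original procedure: it fetches the waterfall page and reports which slave names do not appear in it at all. The function names are hypothetical, and it assumes the waterfall HTML mentions each slave by hostname. A name appearing on the page does not prove the slave is connected, so treat this only as a quick pre-check before reading the waterfall itself.

```python
import urllib.request

# URL and host names are taken from this document; the page is only reachable over mpt-vpn.
WATERFALL_URL = "http://qm-rhel02.mozilla.org:2005/"
SLAVES = ["qm-centos5-01", "qm-xserve01", "qm-winxp01", "qm-win2k3-01"]


def fetch_waterfall(url=WATERFALL_URL, timeout=30):
    """Return the waterfall page as text (raises an OSError subclass if unreachable)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def slaves_not_mentioned(page_text, slaves=SLAVES):
    """Return the slave names that do not occur anywhere in the page text."""
    # Assumption: the waterfall shows each slave's hostname somewhere in its HTML.
    return [name for name in slaves if name not in page_text]
```

A typical call would be `slaves_not_mentioned(fetch_waterfall())`; an empty list means every slave name was found on the page.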
2. second check - restart slave
- log in to the machine using the provided credentials
2a. Windows Remote Desktop (qm-winxp01, qm-win2k3-01):
- press Ctrl-C in the command window and answer Yes to terminate the buildbot process
- check the Task Manager to see if there are any stuck sh.exe and make.exe processes
- if so, reboot the machine using whatever means necessary (stuck sh.exe and make.exe processes can make shutting down tricky, so kill them first if you can)
- when machine is rebooted, log back in and open a command prompt
- restart buildbot:
  - cd c:\
  - buildbot start slave (command does not return)
2b. Mac OS X VNC (qm-xserve01):
- in the Terminal window, type "buildbot stop slave"
- note: message about not returning is expected on Mac
- if necessary, reboot the machine (only if it appears to be responding strangely; I think we've only rebooted it once or twice since inception)
- in a Terminal window:
  - cd /builds/
  - buildbot start slave
2c. Linux VNC (qm-centos5-01):
- in the Terminal, type "buildbot stop slave"
- reboot if necessary (never needed to yet)
- restart buildbot:
  - cd /build/slave (should already be there)
  - buildbot start slave
2d. Verify that the slave is connected
- check the waterfall at: http://qm-rhel02.mozilla.org:2005/
- the slave sometimes takes a couple of minutes to reconnect
- once it has reconnected, and if a build is needed, click the machine name link and force a build as above (fill in the name and reason fields, click the button)
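For the Linux and Mac slaves, the stop/start cycle in 2b and 2c can be sketched as a small Python helper run locally on the slave. This is an illustration under stated assumptions, not an existing tool: the function names are hypothetical, and the working directories are assumed to be /build (the directory used in the Linux clobbering steps; step 2c itself names /build/slave) and /builds. The Windows slaves are left out because there "buildbot start slave" does not return.

```python
import subprocess

# Directory to run "buildbot start slave" from, per host.
# Assumption: /build for Linux and /builds for Mac; adjust if the slave lives elsewhere.
SLAVE_PARENT_DIR = {
    "qm-centos5-01": "/build",
    "qm-xserve01": "/builds",
}


def restart_commands(host):
    """Return the (argv, cwd) pairs for a stop/start cycle on a Unix slave."""
    cwd = SLAVE_PARENT_DIR[host]  # KeyError for hosts this sketch does not cover
    return [
        (["buildbot", "stop", "slave"], cwd),
        (["buildbot", "start", "slave"], cwd),
    ]


def restart_slave(host, run=subprocess.call):
    """Run the stop/start cycle on the local machine and return the exit codes."""
    return [run(argv, cwd=cwd) for argv, cwd in restart_commands(host)]
```

Passing a different `run` callable makes it possible to print or log the commands instead of executing them, which is a cheap way to dry-run the helper before using it on a slave.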
3. clobbering manually
- Sometimes a machine will need to be "clobbered" (have its objdir directory removed)
3a. Windows Remote Desktop (qm-win2k3-01, qm-winxp01)
- log in to the machine
- stop the slave (as above, Ctrl-C in the command window)
- check the Task Manager to make sure there are no errant sh.exe or make.exe processes
- if there are, kill them (End Process on sh.exe) or reboot the machine
- from the command line:
  - cd C:\slave\trunk_2k3\mozilla (C:\slave\trunk\mozilla on winxp)
  - rd /s /q objdir
- restart the slave:
  - cd \
  - buildbot start slave (command does not return)
3b. Linux VNC (qm-centos5-01)
- log in to the machine
- stop the slave (as above, buildbot stop slave)
- cd /build/slave/trunk_linux/mozilla
- rm -rf objdir
- restart the slave:
  - cd /build
  - buildbot start slave
3c. Mac OS X VNC (qm-xserve01)
- log in to the machine
- stop the slave (as above, buildbot stop slave)
- cd /builds/slave/trunk_osx/mozilla
- rm -rf objdir
- restart the slave:
  - cd /builds
  - buildbot start slave
  - a message that buildbot didn't return is expected
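The clobber step itself is the same on every platform: remove the objdir directory under the slave's mozilla directory. The following Python sketch is a portable stand-in for "rm -rf objdir" and "rd /s /q objdir". The function name is hypothetical, and it assumes the slave has already been stopped as described above.

```python
import os
import shutil


def clobber(mozilla_dir):
    """Delete the objdir directory under mozilla_dir; return True if one was removed."""
    objdir = os.path.join(mozilla_dir, "objdir")
    if not os.path.isdir(objdir):
        return False  # nothing to clobber
    # Only ever removes the objdir subdirectory, never mozilla_dir itself.
    shutil.rmtree(objdir)
    return True
```

On the Linux slave this would be called as `clobber("/build/slave/trunk_linux/mozilla")`, using the path given in 3b.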
4. Restarting the Farm
- In the worst case, the entire buildbot farm needs to be restarted
- shut down each slave as per the instructions above:
  - Ctrl-C in the command window on Windows
  - buildbot stop slave in the Terminal on Mac and Linux
- shut down the master on qm-rhel02:
  - cd /build
  - buildbot stop master
- reboot qm-rhel02 and slave machines if necessary (stuck processes, strange behavior)
- restart the master on qm-rhel02:
  - cd /build
  - buildbot start master
- restart the slaves as above:
  - qm-centos5-01, qm-xserve01, qm-winxp01, qm-win2k3-01
- verify waterfall at http://qm-rhel02.mozilla.org:2005/ is visible and slaves are connected
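The order of operations matters here: the slaves stop before the master, and the master is back up before the slaves restart. As a rough illustration, the Python sketch below lays the sequence out as an ordered checklist. The function name and the action labels are hypothetical, and the optional reboot step is left out because it depends on what the machines are doing.

```python
# Host names are taken from the Machines list in this document.
MASTER = "qm-rhel02"
SLAVES = ["qm-centos5-01", "qm-xserve01", "qm-winxp01", "qm-win2k3-01"]


def farm_restart_plan(master=MASTER, slaves=SLAVES):
    """Return the farm restart as an ordered list of (host, action) steps."""
    plan = [(slave, "stop slave") for slave in slaves]  # slaves go down first
    plan.append((master, "stop master"))
    plan.append((master, "start master"))  # master must be up before slaves return
    plan += [(slave, "start slave") for slave in slaves]
    plan.append((master, "verify waterfall"))
    return plan
```

Printing each `(host, action)` pair from `farm_restart_plan()` gives a checklist that can be ticked off while working through the machines by hand.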