Buildbot/IT Unittest Support Document
Revision as of 02:24, 13 October 2007
Machines
master:
- qm-rhel02
slaves:
- qm-centos5-01
- qm-xserve01
- qm-winxp01
- qm-win2k3-01
list of steps to try
If #1 doesn't work, try #2. If that doesn't work, try #3. If #3 doesn't work, contact bhearsum or robcee.
1. first check
- check waterfall at: http://qm-rhel02.mozilla.org:2005/ (mpt-vpn)
- see if the slave is connected
- if so, click the machine name link and try a "Force Build"
  - fill out the name and reason fields, then click the button
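The reachability part of this first check can be scripted. The following is a minimal sketch, assuming curl is installed on the machine you run it from and that the waterfall host is reachable from there. The function name waterfall_reachable is made up for illustration. It only confirms that the waterfall page answers; whether a given slave is connected still has to be read off the page.

```shell
# waterfall_reachable URL
# Returns 0 if the page at URL answers with a successful HTTP status
# within 10 seconds, non-zero otherwise. (Hypothetical helper name.)
waterfall_reachable() {
    # -f: treat HTTP error statuses as failure
    # -sS: no progress meter, but still print error messages
    # -o /dev/null: discard the page body
    # --max-time 10: give up after 10 seconds in total
    curl -fsS -o /dev/null --max-time 10 "$1"
}
```

Example: waterfall_reachable http://qm-rhel02.mozilla.org:2005/ && echo "waterfall is up"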
2. second check - restart slave
- log in to the machine using the provided credentials
2a. Windows Remote Desktop (qm-winxp01, qm-win2k3-01):
- in the command window, press Ctrl-C and answer Yes to terminate the buildbot process
- check the Task Manager for stuck sh.exe and make.exe processes
- if there are any, reboot the machine by whatever means necessary (stuck sh.exe and make.exe processes can make shutting down tricky, so kill them first if you can)
- once the machine has rebooted, log back in and open a command prompt
- restart buildbot:
  - cd c:\
  - buildbot start slave (command does not return)
2b. Mac OS X VNC (qm-xserve01):
- in the Terminal window, type "buildbot stop slave"
- note: a message about buildbot not returning is expected on Mac
- reboot the machine if necessary, i.e. if it appears to be responding strangely (as far as we know, it has only been rebooted once or twice since it was set up)
- in a Terminal window:
  - cd /builds/
  - buildbot start slave
2c. Linux VNC (qm-centos5-01):
- in the Terminal, type "buildbot stop slave"
- reboot if necessary (this has not been needed so far)
- restart buildbot:
  - cd /build/slave (should already be there)
  - DISPLAY=:2 buildbot start slave
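The Linux stop/start sequence above can be wrapped in one function. This is a sketch only: the function name restart_linux_slave is hypothetical, and the directory to start from is passed in as an argument rather than hard-coded. It uses only the "buildbot stop slave" and "buildbot start slave" commands and the DISPLAY=:2 setting from the steps above.

```shell
# restart_linux_slave BASEDIR
# Stops the slave, then starts it again with DISPLAY=:2 (step 2c).
# BASEDIR is the directory the buildbot commands are run from; it is
# an argument because the right value is an assumption here.
# The body runs in a subshell, so the caller's working directory
# is left unchanged.
restart_linux_slave() (
    cd "$1" || exit 1
    # A failed stop (slave already down) should not block the start.
    buildbot stop slave || echo "stop failed; slave may already be down" >&2
    DISPLAY=:2 buildbot start slave
)
```

The return status is that of the start command, so a caller can follow up with a check of the waterfall only when the start succeeded.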
2d. Verify that the slave is connected
- check the waterfall at: http://qm-rhel02.mozilla.org:2005/
- the slave sometimes takes a couple of minutes to reconnect
- once it has reconnected, and if necessary, click the machine name link and force a build as above (fill out the name and reason fields, click the button)
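Because the slave can take a couple of minutes to reconnect, a check is better retried than run once. The following generic retry helper is a sketch; wait_until is a made-up name, and the command it retries is whatever connection check you have available (this document does not define one).

```shell
# wait_until ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND up to ATTEMPTS times, sleeping DELAY seconds between
# tries. Returns 0 as soon as COMMAND succeeds, 1 if it never does.
wait_until() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

For example, "wait_until 6 30 slave_is_connected qm-centos5-01" would try every 30 seconds for about two and a half minutes, where slave_is_connected is a placeholder for your own check and is not a buildbot command.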
3. clobbering manually
- Sometimes a machine will need to be "clobbered" (have its objdir directory removed)
3a. Windows Remote Desktop (qm-win2k3-01, qm-winxp01)
- log in to the machine
- stop the slave (as above, ctrl-C in the command window)
- check the Task Manager to make sure there are no errant sh.exe or make.exe processes
  - if there are, kill them (End Process on sh.exe) or reboot the machine
- from the command line:
  - cd C:\slave\trunk_2k3\mozilla (C:\slave\trunk\mozilla on winxp)
  - rmdir /s /q objdir
- restart the slave
  - cd \
  - buildbot start slave (command does not return)
3b. Linux VNC (qm-centos5-01)
- log in to the machine
- stop the slave (as above, buildbot stop slave)
- cd /build/slave/trunk_linux/mozilla
- rm -rf objdir
- restart the slave
  - cd /build
  - buildbot start slave
3c. Mac OS X VNC (qm-xserve01)
- log in to the machine
- stop the slave (as above, buildbot stop slave)
- cd /builds/slave/trunk_osx/mozilla
- rm -rf objdir
- restart the slave
  - cd /builds
  - buildbot start slave
  - a message that buildbot didn't return is expected
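On Linux and Mac the clobber itself is a single rm -rf, which is unforgiving if the path is mistyped. The sketch below wraps it in a guard; clobber_objdir is a hypothetical name, and the source directory is passed as an argument (for example /build/slave/trunk_linux/mozilla or /builds/slave/trunk_osx/mozilla from the steps above). The slave should still be stopped first and restarted afterwards, as described.

```shell
# clobber_objdir SRCDIR
# Removes SRCDIR/objdir and nothing else.
# Refuses to run if SRCDIR is empty or is not an existing directory,
# so an empty argument cannot expand to "rm -rf /objdir".
clobber_objdir() {
    srcdir="$1"
    if [ -z "$srcdir" ] || [ ! -d "$srcdir" ]; then
        echo "clobber_objdir: not a directory: '$srcdir'" >&2
        return 1
    fi
    rm -rf -- "$srcdir/objdir"
}
```

Example: clobber_objdir /build/slave/trunk_linux/mozilla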
4. Restarting the Farm
- In the worst case, the entire buildbot farm needs to be restarted
- shut down each slave as per the instructions above
  - Ctrl-C in the command window on Windows
  - buildbot stop slave in the Terminal on Mac and Linux
- shut down the master on qm-rhel02
  - cd /build
  - buildbot stop master
- reboot qm-rhel02 and the slave machines if necessary (stuck processes, strange behavior)
- restart the master on qm-rhel02
  - cd /build
  - buildbot start master
- restart the slaves as above
  - qm-centos5-01, qm-xserve01, qm-winxp01, qm-win2k3-01
- verify that the waterfall at http://qm-rhel02.mozilla.org:2005/ is visible and that the slaves are connected