Buildbot/IT Unittest Support Document: Difference between revisions

From MozillaWiki
(Page is now obsolete)
 
== Machines ==
'''This document is obsolete -- please refer to [[ReleaseEngineering:ITSupport]] for your IT Support needs'''
 
master:
* qm-rhel02
 
slaves:
* qm-centos5-01
* qm-xserve01
* qm-winxp01
* qm-win2k3-01
 
== List of steps to try ==
 
1. first check
* check waterfall at: http://qm-rhel02:2005/ (mpt-vpn)
* see if slave is connected.
* if so, click the machine name link and try a "Force Build"
** fill out the name and reason fields, click the button
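
The reachability part of this first check could be scripted. The following is a minimal sketch in Python, not part of buildbot: the function name <code>waterfall_reachable</code> and its <code>opener</code> parameter are hypothetical, and the script only confirms that the waterfall page answers. Whether a given slave is connected still has to be read from the page itself.

```python
import urllib.error
import urllib.request

# URL taken from this page; it is only reachable from inside the VPN.
WATERFALL_URL = "http://qm-rhel02:2005/"


def waterfall_reachable(url=WATERFALL_URL, timeout=10, opener=urllib.request.urlopen):
    """Return True if the waterfall page answers with HTTP 200.

    `opener` is a hypothetical hook so the check can be exercised
    without network access; it defaults to urllib.request.urlopen.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, HTTP error, etc.
        return False
```

Calling <code>waterfall_reachable()</code> with no arguments queries the URL above and returns False rather than raising if the master cannot be reached.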
 
2. second check - restart slave
* login to machine using provided credentials
 
2a. Windows Remote Desktop (qm-winxp01, qm-win2k3-01):
* in the command window, press Ctrl-C and answer Yes to terminate the buildbot process
* check the Task Manager for stuck sh.exe and make.exe processes
** if there are any, kill them if you can, then reboot the machine by whatever means necessary (stuck sh.exe and make.exe processes can make shutting down tricky)
** when machine is rebooted, log back in and open a command prompt
** restart buildbot:
*** cd c:\
*** buildbot start slave (command does not return)
 
2b. Mac OS X VNC (qm-xserve01):
* in the Terminal window, type "buildbot stop slave"
** note: a message about the command not returning is expected on Mac
* if necessary, reboot the machine (for example, if it appears to be responding strangely; I think we've only rebooted it once or twice since inception)
* in a Terminal window:
** cd /builds/
** buildbot start slave
 
2c. Linux VNC (qm-centos5-01):
* in the Terminal, type "buildbot stop slave"
* reboot if necessary (this has not been needed so far)
* restart buildbot:
** cd /build (should already be there)
** buildbot start slave
 
2d. Verify that the slave is connected
* check the waterfall at: http://qm-rhel02:2005/
* the slave sometimes takes a couple of minutes to reconnect
* once it has reconnected, and if a build is needed, click the machine name link and force a build as above (fill out the name and reason fields, click the button)
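
Because the slave can take a couple of minutes to reconnect, the verification step amounts to polling with a timeout. The sketch below illustrates that pattern in Python. The helper name <code>wait_until</code> is hypothetical, and the <code>check</code> callable is supplied by the caller (for example, a person or script that inspects the waterfall); nothing here talks to buildbot directly.

```python
import time


def wait_until(check, timeout=300.0, interval=15.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns True or timeout expires.

    Returns True as soon as check() succeeds, False once `timeout`
    seconds have passed without success. `sleep` and `clock` are
    injectable only so the helper can be tested without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The default of five minutes with a check every fifteen seconds is an assumption chosen to cover the "couple of minutes" mentioned above; adjust as needed.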
 
3. clobbering manually
* Sometimes a machine will need to be "clobbered" (have its objdir directory removed)
 
3a. Windows Remote Desktop (qm-win2k3-01, qm-winxp01)
* login to the machine.
* stop the slave (as above, ctrl-C in the command window)
* check the Task Manager to make sure there are no errant sh.exe or make.exe processes
** if there are, kill them (End Process on sh.exe and make.exe) or reboot the machine
* from the Command Line:
** cd C:\slave\trunk_2k3\mozilla (C:\slave\trunk\mozilla on winxp)
** rd /s /q objdir
* restart the slave
** cd \
** buildbot start slave (command does not return)
 
3b. Linux VNC (qm-centos5-01)
* login to the machine
* stop the slave (as above, buildbot stop slave)
* remove the objdir directory, from the Terminal:
** cd /build/slave/trunk_linux/mozilla
** rm -rf objdir
* restart the slave
** cd /build
** buildbot start slave
 
3c. Mac OS X VNC (qm-xserve01)
* login to the machine
* stop the slave (as above, buildbot stop slave)
* remove the objdir directory, from the Terminal:
** cd /builds/slave/trunk_osx/mozilla
** rm -rf objdir
* restart the slave
** cd /builds
** buildbot start slave
*** a message that buildbot did not return is expected
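
The clobber itself is the same operation on every platform: delete the objdir directory under the mozilla source directory. A minimal Python sketch of that step follows. The names <code>MOZILLA_DIRS</code> and <code>clobber</code> are hypothetical; the paths are the ones listed in 3a to 3c. The slave must still be stopped beforehand and restarted afterwards as described above.

```python
import shutil
from pathlib import Path

# Source directories listed in steps 3a-3c of this page.
MOZILLA_DIRS = {
    "qm-win2k3-01": r"C:\slave\trunk_2k3\mozilla",
    "qm-winxp01": r"C:\slave\trunk\mozilla",
    "qm-centos5-01": "/build/slave/trunk_linux/mozilla",
    "qm-xserve01": "/builds/slave/trunk_osx/mozilla",
}


def clobber(mozilla_dir):
    """Remove the objdir directory under mozilla_dir, if it exists.

    Returns True if objdir was removed, False if there was nothing
    to remove. Equivalent to `rm -rf objdir` / `rd /s /q objdir`
    run from inside mozilla_dir.
    """
    objdir = Path(mozilla_dir) / "objdir"
    if not objdir.is_dir():
        return False
    shutil.rmtree(objdir)
    return True
```

On qm-centos5-01, for example, this would be called as <code>clobber(MOZILLA_DIRS["qm-centos5-01"])</code> while the slave is stopped.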
 
4. Restarting the Farm
* In the worst case, the entire buildbot farm needs to be restarted
* shut down each slave as per the instructions above
** Ctrl-C in Command window on Windows
** buildbot stop slave in Terminal on Mac and Linux
* shut down the master on qm-rhel02
** cd /build
** buildbot stop master
* reboot qm-rhel02 and slave machines if necessary (stuck processes, strange behavior)
* restart master on qm-rhel02
** cd /build
** buildbot start master
* restart slaves as above
** qm-centos5-01, qm-xserve01, qm-winxp01, qm-win2k3-01
* verify waterfall at http://qm-rhel02:2005/ is visible and slaves are connected
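
The order of operations matters here: slaves go down before the master, and the master comes back before the slaves. The Python sketch below writes that ordering out as a checklist. It runs nothing on any machine, and the function name <code>farm_restart_plan</code> is hypothetical. The optional reboot of machines with stuck processes is left out and would fall between stopping and restarting the master.

```python
MASTER = "qm-rhel02"
SLAVES = ["qm-centos5-01", "qm-xserve01", "qm-winxp01", "qm-win2k3-01"]
WINDOWS_SLAVES = {"qm-winxp01", "qm-win2k3-01"}


def farm_restart_plan(master=MASTER, slaves=SLAVES):
    """Return (host, action) pairs in the order given in step 4."""
    plan = []
    # 1. Stop every slave first.
    for host in slaves:
        if host in WINDOWS_SLAVES:
            plan.append((host, "Ctrl-C in the command window"))
        else:
            plan.append((host, "buildbot stop slave"))
    # 2. Stop, then start, the master.
    plan.append((master, "cd /build && buildbot stop master"))
    plan.append((master, "cd /build && buildbot start master"))
    # 3. Bring the slaves back only after the master is up.
    for host in slaves:
        plan.append((host, "restart the slave as in step 2"))
    # 4. Final verification.
    plan.append((master, "verify waterfall at http://qm-rhel02:2005/"))
    return plan


if __name__ == "__main__":
    for number, (host, action) in enumerate(farm_restart_plan(), start=1):
        print(f"{number:2d}. {host}: {action}")
```

Run as a script, it prints a numbered checklist that can be followed by hand.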

Latest revision as of 18:44, 27 October 2008
