Buildbot/IT Unittest Support Document: Difference between revisions

From MozillaWiki

Revision as of 02:24, 13 October 2007

Machines

master:

  • qm-rhel02

slaves:

  • qm-centos5-01
  • qm-xserve01
  • qm-winxp01
  • qm-win2k3-01

list of steps to try

If #1 doesn't work, try #2. If that doesn't work, try #3. If #3 doesn't work, contact bhearsum or robcee.

1. first check

  • check waterfall at: http://qm-rhel02.mozilla.org:2005/ (mpt-vpn)
  • see if slave is connected.
  • if so, click the machine name link, try a "Force Build"
    • fill out the name and reason fields, click the button
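The reachability part of the first check can be scripted. The sketch below is only an illustration, assuming Python 3 on a machine that is already on mpt-vpn; the helper name waterfall_reachable is hypothetical. It confirms that the waterfall page answers over HTTP and nothing more, so whether a particular slave is connected still has to be read off the page by eye.

```python
import urllib.error
import urllib.request

WATERFALL_URL = "http://qm-rhel02.mozilla.org:2005/"  # URL from this document


def waterfall_reachable(url=WATERFALL_URL, timeout=10, opener=urllib.request.urlopen):
    """Return True if the waterfall page answers with HTTP 200.

    `opener` is injectable (hypothetical parameter) so the function can be
    exercised without a network connection.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts and HTTP errors.
        return False


if __name__ == "__main__":
    print("waterfall reachable:", waterfall_reachable())
```

If this reports False, the problem is more likely the VPN connection or the master on qm-rhel02 than an individual slave.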

2. second check - restart slave

  • login to machine using provided credentials

2a. Windows Remote Desktop (qm-winxp01, qm-win2k3-01):

  • press Ctrl-C in the buildbot command window and answer Yes to terminate the buildbot process
  • check the Task Manager to see if there are any stuck sh.exe and make.exe processes
    • if so, reboot the machine using whatever means necessary (the stuck sh.exe and make.exe processes can make shutting down tricky; kill them if you can)
    • when machine is rebooted, log back in and open a command prompt
    • restart buildbot:
      • cd c:\
      • buildbot start slave (command does not return)
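The Task Manager check for stuck processes can also be done from a command prompt. The sketch below assumes Python 3 and that the tasklist command is available on the Windows slaves; it parses the text printed by "tasklist /FO CSV /NH" and lists any sh.exe or make.exe entries. The function name find_stuck and the sample rows are made up for illustration.

```python
import csv
import io

STUCK_NAMES = {"sh.exe", "make.exe"}  # process names called out in this document


def find_stuck(tasklist_csv):
    """Return (image name, PID) pairs for sh.exe / make.exe.

    Expects the output of `tasklist /FO CSV /NH`, where the first two
    columns are the image name and the PID.
    """
    found = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        if len(row) >= 2 and row[0].lower() in STUCK_NAMES:
            found.append((row[0], int(row[1])))
    return found


# Made-up sample rows, not captured from a real slave.
sample = (
    '"System Idle Process","0","Console","0","16 K"\n'
    '"sh.exe","1234","Console","0","2,104 K"\n'
    '"make.exe","2210","Console","0","1,880 K"\n'
    '"python.exe","3120","Console","0","9,412 K"\n'
)
print(find_stuck(sample))  # → [('sh.exe', 1234), ('make.exe', 2210)]
```

Each PID it reports can then be ended from Task Manager, or with "taskkill /F /PID" followed by the PID, before falling back to a reboot.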

2b. Mac OS X VNC (qm-xserve01):

  • in the Terminal window, type "buildbot stop slave"
    • note: message about not returning is expected on Mac
  • if necessary, reboot the machine (for example, if it appears to be responding strangely; I think we've only rebooted it once or twice since inception)
  • in a Terminal window:
    • cd /builds/
    • buildbot start slave

2c. Linux VNC (qm-centos5-01):

  • in the Terminal, type "buildbot stop slave"
  • reboot if necessary (never needed to yet)
  • restart buildbot:
    • cd /build (should already be there)
    • DISPLAY=:2 buildbot start slave
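Steps 2a to 2c differ only in how the slave is stopped, which directory to start from, and the start command. The table below collects those three values per machine as a minimal sketch in Python 3; the names SLAVES and restart_plan are hypothetical, and the Linux working directory is assumed to be /build, as used in step 3b.

```python
# Stop step, working directory and start command per slave (steps 2a-2c).
# Assumption: the Linux working directory is /build, as used in step 3b.
WINDOWS_STOP = "press Ctrl-C in the buildbot command window, answer Yes"

SLAVES = {
    "qm-winxp01": (WINDOWS_STOP, "c:\\", "buildbot start slave"),
    "qm-win2k3-01": (WINDOWS_STOP, "c:\\", "buildbot start slave"),
    "qm-xserve01": ("buildbot stop slave", "/builds", "buildbot start slave"),
    "qm-centos5-01": ("buildbot stop slave", "/build", "DISPLAY=:2 buildbot start slave"),
}


def restart_plan(host):
    """Return the ordered steps for restarting the slave on `host`."""
    stop, cwd, start = SLAVES[host]
    return [stop, "cd " + cwd, start]


for line in restart_plan("qm-centos5-01"):
    print(line)
```

The function only prints the steps; running them still happens by hand in the Remote Desktop or VNC session.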

2d. Verify that the slave is connected

  • check the waterfall at: http://qm-rhel02.mozilla.org:2005/
  • the slave sometimes takes a couple of minutes to reconnect
  • once it has reconnected, if necessary, click the machine name link and force a build as above (fill out the name and reason fields, click the button)
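Because a slave can take a couple of minutes to reconnect, a single look at the waterfall right after a restart can be misleading. The sketch below shows one way to poll with a deadline, assuming Python 3. The helper wait_for is hypothetical, and so are the default timeout of 300 seconds and the interval of 15 seconds; the check passed in is whatever test you trust, this document does not define one.

```python
import time


def wait_for(check, timeout=300.0, interval=15.0,
             clock=time.monotonic, sleep=time.sleep):
    """Call check() until it returns True or `timeout` seconds have passed.

    `clock` and `sleep` are injectable so the loop can be tested without
    actually waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller would pass in a function that returns True once the slave shows as connected, and move on to step 3 only if wait_for returns False.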

3. clobbering manually

  • Sometimes a machine will need to be "clobbered" (have its objdir directory removed)

3a. Windows Remote Desktop (qm-win2k3-01, qm-winxp01)

  • login to the machine.
  • stop the slave (as above, ctrl-C in the command window)
  • check the Task Manager to make sure there are no errant sh.exe or make.exe processes.
    • If there are, kill them (End process on sh.exe) or reboot the machine
  • from the Command Line:
    • cd C:\slave\trunk_2k3\mozilla (C:\slave\trunk\mozilla on winxp)
    • rmdir /s /q objdir
  • restart the slave
    • cd \
    • buildbot start slave (command does not return)

3b. Linux VNC (qm-centos5-01)

  • login to the machine
  • stop the slave (as above, buildbot stop slave)
  • remove the objdir:
    • cd /build/slave/trunk_linux/mozilla
    • rm -rf objdir
  • restart the slave
    • cd /build
    • DISPLAY=:2 buildbot start slave (same command as in step 2c)

3c. Mac OS X VNC (qm-xserve01)

  • login to the machine
  • stop the slave (as above, buildbot stop slave)
  • remove the objdir:
    • cd /builds/slave/trunk_osx/mozilla
    • rm -rf objdir
  • restart the slave
    • cd /builds
    • buildbot start slave
      • message that buildbot didn't return is expected
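The clobber itself is the same on every platform: remove the objdir tree while the slave is stopped. The sketch below is a guarded version in Python 3, using the objdir locations from steps 3a to 3c. The names OBJDIRS and clobber are hypothetical, and the refusal to delete anything not named "objdir" is an added safety check, not something the steps above require.

```python
import os
import shutil

# objdir locations listed in steps 3a-3c of this document.
OBJDIRS = {
    "qm-win2k3-01": r"C:\slave\trunk_2k3\mozilla\objdir",
    "qm-winxp01": r"C:\slave\trunk\mozilla\objdir",
    "qm-centos5-01": "/build/slave/trunk_linux/mozilla/objdir",
    "qm-xserve01": "/builds/slave/trunk_osx/mozilla/objdir",
}


def clobber(objdir):
    """Remove an objdir tree, refusing any path whose last part is not 'objdir'.

    Returns True if a directory was removed, False if it was already gone.
    Stop the slave before calling this.
    """
    if os.path.basename(os.path.normpath(objdir)) != "objdir":
        raise ValueError("refusing to remove %r" % objdir)
    if not os.path.isdir(objdir):
        return False
    shutil.rmtree(objdir)
    return True
```

On a given slave you would call clobber(OBJDIRS[hostname]) while logged in to that machine, then restart the slave as described above.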

4. Restarting the Farm

  • In the worst case, the entire buildbot farm needs to be restarted
  • shut down each slave as per the instructions above
    • Ctrl-C in Command window on Windows
    • buildbot stop slave in Terminal on Mac and Linux
  • shut down master on qm-rhel02
    • cd /build
    • buildbot stop master
  • reboot qm-rhel02 and slave machines if necessary (stuck processes, strange behavior)
  • restart master on qm-rhel02
    • cd /build
    • buildbot start master
  • restart slaves as above
    • qm-centos5-01, qm-xserve01, qm-winxp01, qm-win2k3-01
  • verify waterfall at http://qm-rhel02.mozilla.org:2005/ is visible and slaves are connected
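The order matters in a farm restart: slaves stop first, then the master, and they come back in the reverse order. The sketch below spells that ordering out as data, assuming Python 3; the function name farm_restart_plan is hypothetical, and the optional reboots are left out because they depend on what you find on each machine.

```python
SLAVES = ["qm-centos5-01", "qm-xserve01", "qm-winxp01", "qm-win2k3-01"]
MASTER = "qm-rhel02"


def farm_restart_plan(slaves=SLAVES, master=MASTER):
    """Return (host, action) pairs in the order given in step 4."""
    plan = [(host, "stop slave") for host in slaves]
    plan.append((master, "stop master"))
    plan.append((master, "start master"))
    plan += [(host, "start slave") for host in slaves]
    return plan


for host, action in farm_restart_plan():
    print("%-14s %s" % (host, action))
```

After the last step, check the waterfall as usual to confirm that all four slaves have reconnected.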