Buildbot/IT Unittest Support Document

From MozillaWiki

Revision as of 17:16, 5 June 2008

This document is being phased out -- please refer to ReleaseEngineering:ITSupport for your IT Support needs

Machines

Production

Name                 Platform             CVS Branch for Config   Tree          Support Tier*
qm-rhel02 (master)   linux                -                       Firefox       1
qm-centos5-01        linux                Trunk                   Firefox       1
qm-centos5-02        linux                Trunk                   Firefox       1
qm-centos5-03        linux                Trunk                   Firefox       1
qm-xserve01          Mac OS X 10.4        Trunk                   Firefox       1
qm-win2k3-01 (master) Windows Server 2003 Trunk                   Firefox       1
qm-leak-centos5-01   linux                Trunk                   MozillaTest   X
qm-leak-tiger-01     Mac OS X 10.4        -                       MozillaTest   X
qm-leak-winxp01      Windows XP           -                       MozillaTest   X
qm-leak-w23k-01      Windows Server 2003  -                       MozillaTest   X

* See "tiers explained" for the meaning of each support tier.

Staging

master:

  • qm-unittest02

slaves:

  • qm-stage-centos5-01
  • qm-stage-osx-01
  • qm-stage-winxp01
  • qm-winxp01
  • qm-stage-win2k3-01

List of steps to try

If #1 doesn't work, try #2. If that doesn't work, try #3. If #3 doesn't work, contact bhearsum or robcee.

NOTE: If you connect to any of the Windows boxes with an RDP client set to 16-bit color mode, the reftests in /mozilla/modules/libpr0n/test/reftest/ will start failing until the display is reset to 24-bit mode (e.g., by disconnecting and/or rebooting). The tests require 24-bit color, but Windows resets the display to the color depth of the last client that connected. See bug 414720 for more history.
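
One way to reduce the risk described in the note is to connect from a saved .rdp file that pins the color depth rather than relying on the client's last-used setting. The following is a minimal sketch, assuming Microsoft's Remote Desktop client, whose .rdp files accept `full address` and `session bpp` entries. The function name `rdp_file_contents` is hypothetical, and the server may still cap the depth through its own policy.

```python
def rdp_file_contents(host, bpp=24):
    """Return the text of a minimal .rdp file that requests a fixed color depth.

    Assumption: the client honors the "session bpp" entry; this sketch does not
    verify what depth the server actually grants.
    """
    lines = [
        "full address:s:%s" % host,   # machine to connect to
        "session bpp:i:%d" % bpp,     # requested color depth in bits per pixel
    ]
    # .rdp files are plain text; use Windows line endings.
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Print the file body for one of the Windows slaves named in this document.
    print(rdp_file_contents("qm-win2k3-01"))
```

Saving the printed text as, for example, qm-win2k3-01.rdp and opening it with the Remote Desktop client should request a 24-bit session.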


1. First check

  • check waterfall at: http://qm-rhel02.mozilla.org:2005/ (mpt-vpn)
  • see if slave is connected.
  • if so, click the machine name link, try a "Force Build"
    • fill out the name and reason fields, click the button

2. Second check - restart slave

  • login to machine using provided credentials

2a. Windows Remote Desktop (qm-winxp01, qm-win2k3-01):

  • in the command window, press Ctrl-C and answer Yes to terminate the buildbot process
  • check the Task Manager for stuck sh.exe and make.exe processes
    • if there are any, reboot the machine by whatever means necessary (stuck sh.exe and make.exe processes can make shutting down tricky; kill them if you can)
    • when machine is rebooted, log back in and open a command prompt
    • restart buildbot:
      • cd c:\
      • buildbot start slave (command does not return)
      • minimize the cmd.exe window
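
The Task Manager check for stuck processes can also be done from a script. This is a sketch only, assuming Python 3 is available on the Windows slave and that the built-in `tasklist` command is present; the helper names `stuck_processes` and `run_tasklist` are hypothetical. It only lists candidates, and deciding whether to kill them or reboot remains a manual call.

```python
import csv
import io
import subprocess

# Process names this document calls out as getting stuck after buildbot stops.
STUCK_NAMES = {"sh.exe", "make.exe"}


def stuck_processes(tasklist_csv):
    """Return (image name, PID) pairs for sh.exe/make.exe rows.

    Expects the headerless CSV produced by `tasklist /FO CSV /NH`, where the
    first column is the image name and the second is the PID.
    """
    found = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        if len(row) >= 2 and row[0].lower() in STUCK_NAMES:
            found.append((row[0], int(row[1])))
    return found


def run_tasklist():
    """Windows only: capture the current process list as headerless CSV."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Run `stuck_processes(run_tasklist())` after stopping buildbot; any entries returned at that point are leftovers from the interrupted build.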

2b. Mac OS X VNC (qm-xserve01):

  • in the Terminal window, type "buildbot stop slave"
    • note: a message about the command not returning is expected on Mac
  • if the machine appears to be responding strangely, reboot it (as far as we know, it has only been rebooted once or twice since inception)
  • in a Terminal window,
    • cd /builds/
    • buildbot start slave

2c. Linux VNC (qm-centos5-01):

  • in the Terminal, type "buildbot stop slave" (pwd should be /builds/)
  • reboot if necessary (never needed to yet)
  • restart buildbot:
    • cd /builds/ (should already be there)
    • buildbot start slave

2d. Verify that the slave is connected

  • check the waterfall at: http://qm-rhel02.mozilla.org:2005/
  • the slave sometimes takes a couple of minutes to reconnect
  • once it has reconnected, and if necessary, click the machine name link and force a build as above (fill out the name and reason fields, click the button)
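
Because the slave can take a couple of minutes to reconnect, it helps to poll rather than check once. The sketch below shows the waiting logic only. `check_connected` is a hypothetical caller-supplied function, since this document does not describe how the waterfall page marks a connected slave, and the five-minute timeout and 15-second interval are assumed values.

```python
import time


def wait_for_slave(check_connected, timeout=300.0, interval=15.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll check_connected() until it returns True or `timeout` seconds pass.

    Returns True if the slave was seen connected, False if the wait timed out.
    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        if check_connected():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

If the function returns False, move on to the next step in the list rather than waiting longer.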

3. Clobbering manually

  • Sometimes a machine will need to be "clobbered" (have its build directory removed inside the slave dir)

3a. Windows Remote Desktop (qm-win2k3-01, qm-winxp01)

  • login to the machine.
  • stop the slave (as above, ctrl-C in the command window)
  • check the Task Manager to make sure there are no errant sh.exe or make.exe processes.
    • If there are, kill them (End Process on sh.exe) or reboot the machine
  • from the command line:
    • cd C:\slave
    • on qm-winxp01: rmdir /s /q trunk
    • on qm-win2k3-01: rmdir /s /q trunk_2k3
  • restart the slave
    • cd \
    • buildbot start slave (command does not return)
    • minimize cmd.exe window

3b. Linux VNC (qm-centos5-01)

  • login to the machine
  • stop the slave (as above, buildbot stop slave)
  • remove the object directory:
    • cd /builds/slave/trunk_centos5_2/mozilla
    • rm -rf objdir
  • restart the slave
    • cd /builds
    • verify Xvfb is running in the other Xterm
      • if not, enter "Xvfb -screen 0 1280x1024x24 :2 &" in the second Xterm
    • ignore any metacity already running on display :1
    • if no metacity is running on display :2, run:
      • DISPLAY=:2 metacity &
    • DISPLAY=:2 buildbot start slave
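
The restart decisions in 3b depend on what is already running, which can be summarized as a small function. This is an illustrative sketch: the command strings are taken from the steps above, while the function name `linux_restart_commands` and its two boolean inputs are hypothetical. How those two facts are determined (for example, by looking at the Xterms) is left to the operator.

```python
XVFB_CMD = "Xvfb -screen 0 1280x1024x24 :2 &"
METACITY_CMD = "DISPLAY=:2 metacity &"
START_CMD = "DISPLAY=:2 buildbot start slave"


def linux_restart_commands(xvfb_running, metacity_on_display2):
    """Return the commands to enter, in order, to restart the Linux slave.

    Xvfb and metacity are started only when they are not already running on
    display :2; a metacity on display :1 is ignored, as the steps say.
    """
    commands = ["cd /builds"]
    if not xvfb_running:
        commands.append(XVFB_CMD)
    if not metacity_on_display2:
        commands.append(METACITY_CMD)
    commands.append(START_CMD)
    return commands


if __name__ == "__main__":
    # Worst case: neither Xvfb nor metacity survived, e.g. after a reboot.
    for command in linux_restart_commands(False, False):
        print(command)
```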

3c. Mac OS X VNC (qm-xserve01)

  • login to the machine
  • stop the slave (as above, buildbot stop slave)
  • remove the object directory:
    • cd /builds/slave/trunk_osx/mozilla
    • rm -rf objdir
  • restart the slave
    • cd /builds
    • buildbot start slave
      • message that buildbot didn't return is expected
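
The clobber targets in 3a through 3c differ per machine, so a lookup table plus a guarded delete is one way to avoid removing the wrong directory. The sketch below assumes Python 3 on the slave; the paths come from the steps above, while `CLOBBER_TARGETS` and `clobber` are hypothetical names. It defaults to a dry run and does not stop or restart the slave, which still has to be done by hand.

```python
import shutil
from pathlib import Path

# Directory to remove on each machine, as listed in sections 3a-3c.
CLOBBER_TARGETS = {
    "qm-winxp01": r"C:\slave\trunk",
    "qm-win2k3-01": r"C:\slave\trunk_2k3",
    "qm-centos5-01": "/builds/slave/trunk_centos5_2/mozilla/objdir",
    "qm-xserve01": "/builds/slave/trunk_osx/mozilla/objdir",
}


def clobber(path, dry_run=True):
    """Remove a build directory tree.

    Returns True if the directory exists and was (or, in a dry run, would be)
    removed, and False if there is nothing to remove. Stop the slave first.
    """
    target = Path(path)
    if not target.is_dir():
        return False
    if not dry_run:
        shutil.rmtree(target)
    return True
```

For example, `clobber(CLOBBER_TARGETS["qm-xserve01"])` reports whether the object directory exists, and passing `dry_run=False` actually deletes it.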

4. Restarting the Farm

  • In the worst case, the entire buildbot farm needs to be restarted
  • shut down each slave as per the instructions above
    • Ctrl-C in Command window on Windows
    • buildbot stop slave in Terminal on Mac and Linux
  • shut down the master on qm-rhel02
    • cd /build
    • buildbot stop master
  • reboot qm-rhel02 and slave machines if necessary (stuck processes, strange behavior)
  • restart master on qm-rhel02
    • cd /build
    • buildbot start master
  • restart slaves as above
    • qm-centos5-01, qm-xserve01, qm-winxp01, qm-win2k3-01
  • verify waterfall at http://qm-rhel02.mozilla.org:2005/ is visible and slaves are connected
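
The ordering in step 4 matters: slaves go down first, then the master is stopped and started, then the slaves come back. The sketch below writes that sequence out as data so it can be printed as a checklist. Host names and commands come from this document; `farm_restart_plan` and the platform labels are hypothetical, and the optional reboots and the per-platform start details from sections 2 and 3 are deliberately left out.

```python
MASTER = "qm-rhel02"

# Production slaves named in step 4, with an illustrative platform label.
SLAVES = {
    "qm-centos5-01": "linux",
    "qm-xserve01": "mac",
    "qm-winxp01": "windows",
    "qm-win2k3-01": "windows",
}


def farm_restart_plan():
    """Return (host, action) pairs in the order step 4 describes."""
    plan = []
    # 1. Stop every slave.
    for host, platform in SLAVES.items():
        if platform == "windows":
            plan.append((host, "Ctrl-C in the command window"))
        else:
            plan.append((host, "buildbot stop slave"))
    # 2. Stop, then start, the master (reboot in between only if needed).
    plan.append((MASTER, "cd /build && buildbot stop master"))
    plan.append((MASTER, "cd /build && buildbot start master"))
    # 3. Start every slave again.
    for host in SLAVES:
        plan.append((host, "buildbot start slave"))
    return plan


if __name__ == "__main__":
    for number, (host, action) in enumerate(farm_restart_plan(), start=1):
        print("%2d. %-14s %s" % (number, host, action))
```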