Buildbot/IT Talos Support Document
Machines
master:
- qm-buildbot01
slaves:
- qm-pxp01-05
- qm-mini*
see also: Buildbot/Talos/Machines
list of steps to try
1. first check
- check waterfall at: http://qm-buildbot01.mozilla.org:2004/ (mpt-vpn)
- see if slave is connected.
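The reachability half of the first check can be scripted. The sketch below only tests whether the waterfall page answers over HTTP. It does not parse the page, so whether a given slave is connected still has to be read off the waterfall by eye. The helper name `waterfall_up` and the 10-second timeout are illustrative choices, not part of the existing setup, and the script assumes it runs from a host that can reach the URL (the step above marks it mpt-vpn).

```shell
#!/bin/sh
# Waterfall URL taken from the first-check step above.
WATERFALL_URL="http://qm-buildbot01.mozilla.org:2004/"

# waterfall_up URL
# Hypothetical helper: exits 0 if the page answers with a non-error
# HTTP status within 10 seconds, non-zero otherwise.
waterfall_up() {
    curl --silent --fail --max-time 10 --output /dev/null "$1"
}
```

Possible use: `waterfall_up "$WATERFALL_URL" || echo "waterfall not reachable"`. A failure here says nothing about the slaves themselves; it only means the master's status page could not be fetched.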
2. second check - restart slave
- login to machine using provided credentials
- VNC (qm-pxp01-05, qm-mini*):
- close running instances of firefox or dialog windows (make sure to check the taskbar)
- on windows, Ctrl-C in the command window, answer Yes to terminate the buildbot process
- cd c:\
- on qm-pxp*, 'buildbot start slave' (command does not return)
- on qm-mini*, 'buildbot start talos-slave' (command does not return)
- on linux/mac
- login via ssh
- 'buildbot stop talos-slave' (ignore 'never saw slave...' message on mac)
- 'buildbot start talos-slave' (ignore 'never saw slave...' message on mac)
- verify slave reappears on buildbot waterfall page
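The linux/mac stop and start steps above could be wrapped in a small helper, sketched here. The function name `restart_slave` is hypothetical. It assumes `buildbot` is on the PATH and that it is run from the directory containing the slave directory, which this page does not specify.

```shell
#!/bin/sh
# restart_slave [SLAVE_DIR]
# Hypothetical helper for linux/mac slaves. SLAVE_DIR defaults to
# talos-slave, the name used in the steps above.
restart_slave() {
    slave_dir="${1:-talos-slave}"
    # A failed stop (for example, the slave was not running) is reported
    # but does not prevent the start from being attempted.
    buildbot stop "$slave_dir" || echo "stop reported a problem; continuing" >&2
    # The exit status of the start command becomes the function's status.
    buildbot start "$slave_dir"
}
```

The helper does not replace the last step: the slave still has to be confirmed on the waterfall page, and the 'never saw slave...' message on mac can be ignored as noted above.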
Note: builds are triggered by finished builds on the Tinderbox (Firefox for trunk, Mozilla1.8 for branch). Depending on when the master was started, it may then take up to 10 minutes to recognize a change. If the master has been restarted, the first completed Tinderbox builds are often missed, so it can take 30-40 minutes or more to verify that systems are working as expected.
3. Restarting the Master
- In the worst case, the entire buildbot farm needs to be restarted
- shut down each slave as per the instructions above
- Ctrl-C in Command window on Windows
- 'buildbot stop talos-slave' in Terminal on Mac and Linux
- shut down master on qm-buildbot01
- cd /build
- buildbot stop perfmaster
- reboot qm-buildbot01 and slave machines if necessary (stuck processes, strange behavior)
- restart master on qm-buildbot01
- cd /build
- buildbot start perfmaster
- restart slaves as above
- verify waterfall at http://qm-buildbot01.mozilla.org:2004/ is visible and slaves are connected
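The master stop and start steps on qm-buildbot01 can be sketched as one helper. It covers only the two perfmaster commands run from /build. Shutting down the slaves first, any reboot, and restarting the slaves afterwards remain manual steps as listed above. The function name `restart_master` and its optional directory argument are hypothetical.

```shell
#!/bin/sh
# restart_master [BASE_DIR]
# Hypothetical helper for the master host. BASE_DIR defaults to /build,
# the directory used in the steps above.
restart_master() {
    base_dir="${1:-/build}"
    # Subshell keeps the caller's working directory unchanged.
    (
        cd "$base_dir" || exit 1
        # A failed stop (master already down) is reported, not fatal.
        buildbot stop perfmaster || echo "stop reported a problem; continuing" >&2
        buildbot start perfmaster
    )
}
```

If a reboot is needed, run only the stop command by hand, reboot, and then start the master, rather than using the helper in one go.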