ReleaseEngineering/How To/Request a slave
- 1 What type of slave do I need?
- 2 What happens next?
- 3 Accessing your slave
- 4 Returning your slave
- 5 Tips & Tricks
Developers can borrow build or test slaves to investigate failures that they cannot reproduce locally. Slaves can be requested by filing a bug in the Loan Requests component in Bugzilla.
What type of slave do I need?
If you're looking at a result on Treeherder, the pop-up frame at the bottom that contains the job summary and logs also contains the name of the slave that ran the job, e.g.:
Machine name: tst-linux32-spot-1024
To make requesting the correct type of slave easier, here is a mapping of platforms/slave types to the Bugzilla request template for each type of slave:
- Build/Try slaves
- Test slaves
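As a rough aid for choosing between the two templates, the slave type can often be guessed from the prefix of the machine name shown on Treeherder. The following is a minimal sketch, not an official tool: only the `tst-linux32-spot-1024` example comes from this page, while the `bld-` and `try-` prefixes and the `guess_template` helper are assumptions made for illustration.

```python
# Hypothetical prefix table: only the "tst-" example appears on this page;
# the "bld-" and "try-" prefixes are assumptions for illustration.
PREFIX_TO_TEMPLATE = {
    "tst-": "Test slaves",
    "bld-": "Build/Try slaves",
    "try-": "Build/Try slaves",
}


def guess_template(machine_name):
    """Return the loan request template for a machine name, or None if unknown."""
    name = machine_name.strip().lower()
    for prefix, template in PREFIX_TO_TEMPLATE.items():
        if name.startswith(prefix):
            return template
    return None


if __name__ == "__main__":
    print(guess_template("tst-linux32-spot-1024"))
```

If the helper returns None, the name does not match any assumed prefix, so ask in #releng rather than guessing.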
What happens next?
Once the person on buildduty notices your slave request bug, they will perform the necessary actions to procure you a slave of the chosen type. If your request bug has sat without action for more than a day, please ping the person on buildduty in the #releng channel.
When the slave is ready for you, the original request bug will be assigned back to *YOU*, the developer, for the duration of your use of the slave. The loan request bug will block a bug for the specific slave you have borrowed.
Accessing your slave
When the slave is ready for you, the person on buildduty will contact you out-of-band (usually via email) to give you login credentials and any specific notes for the loan.
Accessing any slave requires access to the Mozilla VPN, which in turn requires Mozilla LDAP credentials. Buildduty will work with IT to get your VPN access set up before handing over your requested slave.
Details for accessing the VPN can be found here: https://mana.mozilla.org/wiki/pages/viewpage.action?pageId=30769829
ssh vs. VNC
All slaves *should* be accessible via ssh and VNC. However, VNC access is required to run tests in the same graphical context as run by buildbot (the releng continuous integration framework). Kicking off tests via ssh will give unpredictable (read: wrong) results.
Best practice is to start tests via VNC, and then examine logs, etc. via ssh.
Macs can be tricky to VNC into. Read the "Tips & Tricks" section below.
Caveat: if you are specifically concerned about performance numbers (e.g. talos), you should avoid touching the slave at all once the test has begun. For example, don't stay connected via ssh running tail on a log while the test is running.
How do I recreate this failure?
These days, most jobs are run via mozharness scripts which live in-tree alongside their configs. This makes it easier to replicate a given build or test failure by hand.
If you're trying to debug a failure that isn't in mozharness, the log from Treeherder will include all of the steps and environment variables you need to set up to replicate the output of a given task.
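Copying environment variables out of a log by hand is error-prone, so a small script can help. The following is a minimal sketch, assuming the log prints its environment variables one per line as NAME=value (the real layout may differ). The helper names `parse_env_lines` and `run_with_env` are hypothetical, and the sample values are made up.

```python
import os
import re
import subprocess

# Assumed log layout: one "NAME=value" pair per line, optionally indented.
ENV_LINE = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def parse_env_lines(log_text):
    """Collect NAME=value pairs found in a block of log text."""
    env = {}
    for line in log_text.splitlines():
        match = ENV_LINE.match(line)
        if match:
            env[match.group(1)] = match.group(2)
    return env


def run_with_env(command, env_overrides):
    """Run a command with the logged variables layered over the current environment."""
    env = dict(os.environ)
    env.update(env_overrides)
    return subprocess.call(command, env=env)


if __name__ == "__main__":
    # Made-up sample; paste the environment section of your own log instead.
    sample = "environment:\n  MOZ_NO_REMOTE=1\n  DISPLAY=:0\n"
    print(parse_env_lines(sample))
```

Pass the parsed dictionary to `run_with_env` together with the command line copied from the same log step. Layering the values over the current environment keeps variables such as PATH available unless the log overrides them.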
Can I install X? - root access
Yes, you can.
If there's a debugging tool or something that makes your life easier while working with the slave, feel free to install it. In most cases, you should have root/Administrator/sudo access to install things as required. We re-image all loaned slaves when they come back from developers, so changes will be lost and your machine will return to a known state.
To get root access, make sure that you ssh in directly as root rather than trying to switch users.
Caveat #1: If you're trying to debug a failure that is happening in buildbot, try to replicate the failure *before* installing extra software or upgrading packages. Every package or configuration option you change makes your results less applicable to the current production environment.
Caveat #2: If you're working with a slave and *do* need to install software, drivers, service packs, etc. in order to make a build or test work, please keep detailed notes. If these are packages that will need to be deployed to all slaves of that type, your detailed notes will make life much easier for releng and relops.
Returning your slave
When you are finished with the slave, please comment to that effect in the request bug and resolve it as FIXED. This is the cue to releng that we can recover that slave.
Pinging the person on buildduty in #releng is appreciated as well.
Tips & Tricks
Slave type-specific notes
Connecting via adb
- Once you have the IP/name of the device (e.g. panda-0314), you can telnet to it via [sut]:
telnet panda-0314 20701
- In that telnet session on the device, type 'adb ip'.
- On your local computer, type 'adb connect panda-0314'.
- Now you can use adb to install, logcat, or do other fun stuff.
- Note: if you restart the device, you need to access it via sutagent and run 'adb ip' again.
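The local half of these steps can be scripted once 'adb ip' has been run on the device. The following is a minimal sketch, assuming adb is on your PATH and only one device is connected. The helper names `local_adb_commands` and `run_all` are hypothetical, and the optional APK path is whatever build you want to install.

```python
import subprocess


def local_adb_commands(device, apk=None):
    """Build the adb commands to run on your local computer, in order."""
    commands = [["adb", "connect", device]]
    if apk is not None:
        # Hypothetical: install a build you want to test on the device.
        commands.append(["adb", "install", apk])
    # "-d" dumps the current log and exits instead of streaming forever.
    commands.append(["adb", "logcat", "-d"])
    return commands


def run_all(commands):
    """Run each command in turn, stopping at the first failure."""
    for command in commands:
        subprocess.check_call(command)


if __name__ == "__main__":
    # Print the commands only; call run_all() to actually execute them.
    for command in local_adb_commands("panda-0314"):
        print(" ".join(command))
```

Building the command list separately from running it makes it easy to review what will be executed before touching the device.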
Mac OS X
Connecting to the Mac machines via VNC can be rather difficult. Try Armen's Lion instructions, and if that doesn't work, ask in #releng for help.
Linux test machine
You'll need a newer compiler. If you choose gcc-4.8, you'll also need either a newer gdb or to compile with an older version of the debug information format (for example, by passing gcc's -gdwarf-2 flag).