ReleaseEngineering/How To/Set Up a New AWS Slave Class



work in progress

We currently run most of our Linux-based tests on AWS. AWS offers many instance types, and we use spot instances because they are cheaper than on-demand ones. The default instance type for most of our Linux-based tests is m1.medium. It is low cost but not powerful enough for some of our tests, like crashtests and reftests on Android emulators, or media or gaia-ui tests on B2G. So how do you add a new slave class so that different tests for the same platform (e.g. Android 2.3 opt) can run on different instance types?

This example focuses on adding the tst-emulator64-spot slave type, because this is the one I recently added in bug 1031083: implement c3.xlarge slave class for Linux64 test spot instances. The repos you'll need to work with are puppet, cloud-tools and buildbot-configs. You'll also need to work with the tools repo if you need to enable new buildbot masters as part of this change.

Create new AMI

  • To do: describe how to create an AMI from scratch; I just reused an existing AMI

The configs for each AMI are stored in cloud-tools. For the tst-emulator64-spot AMI I created, I reused the existing tst-linux64 AMI ID and copied it into the new configs I created. However, I changed the instance type from m1.medium to c3.xlarge and also had to tweak the subnets, because this instance type is not available in some AWS regions.
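As a rough illustration only, the per-region configs under cloud-tools/configs/ are JSON files shaped roughly like the sketch below (written here as a Python dict so the changed fields can be annotated with comments); every key name and value is an assumption for illustration, not a copy of the real tst-emulator64 config:

  # Hedged sketch of a cloud-tools instance config -- key names and values
  # are assumptions for illustration, not the real configs/tst-emulator64.
  tst_emulator64 = {
      "us-east-1": {
          "type": "tst-emulator64",
          "ami": "ami-XXXXXXXX",              # reuse the existing tst-linux64 AMI id
          "instance_type": "c3.xlarge",       # was m1.medium for tst-linux64
          "subnet_ids": ["subnet-XXXXXXXX"],  # only subnets where c3.xlarge is offered
          "security_group_ids": ["sg-XXXXXXXX"],
      },
      # ...repeat for the other regions that offer c3.xlarge
  }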

Update Cloud Tools with new platform

I also updated cloud-tools so this new AMI would be listed as a platform. You can see the changes required here and here.

Create golden AMI image

This config was then added to puppet so that the AMI would be created automatically the next time the cron job ran. We also had to use invtool to add A and PTR records to DNS for the golden image. The cron job that creates the golden images is stored in Puppet. You can log in to the aws-manager machine and run the cron jobs manually to speed things up.
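For context, the A and PTR records for the golden image are just a forward and reverse pair along these lines; the hostname and address below are placeholders for illustration, not the real records:

  ; illustrative placeholders only -- not the real hostname or IP
  tst-emulator64-ec2-golden.test.releng.use1.mozilla.com.  IN A    10.0.0.1
  1.0.0.10.in-addr.arpa.                                   IN PTR  tst-emulator64-ec2-golden.test.releng.use1.mozilla.com.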

Loan yourself a slave and test on dev-master

Once you have the AMI image up and running, you can loan yourself a slave of that type and try running tests on it from your dev-master. You'll have to change the config in the loan document to match your new slave type, i.e. instead of cloud-tools/configs/tst-linux$arch it should be cloud-tools/configs/tst-emulator64 to follow my example.

Write patches to enable the new platform in buildbot and Puppet

Example of a change to add a new slave class to puppet. In this example I added two new puppet slave classes even though they map to the same AWS instance type. This is because of duplicate builder issues in buildbot: the platform name (ubuntu64_vm_armv6_large or ubuntu64_vm_large) is included in the builder directory, so if crashtests ran on both pools under a single platform name you would hit a duplicate builder issue.
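To make the duplicate-builder point concrete, here is a tiny illustrative Python sketch (not the real buildbot naming code): the builder directory is derived from the platform name, so giving each pool its own platform name keeps the generated builders distinct.

  # Illustrative only -- not the real buildbot-configs naming code.
  def builddir(branch, slave_platform, suite):
      # the platform name ends up in the builder directory, so it must be
      # unique for every pool that runs the same suite on the same branch
      return "%s_%s_test-%s" % (branch, slave_platform, suite)

  # two platform names -> two distinct builder directories, no collision
  print(builddir("ash", "ubuntu64_vm_armv6_large", "crashtest"))
  print(builddir("ash", "ubuntu64_vm_large", "crashtest"))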

Here's an example of the buildbot changes required to enable this new slave platform on ash.
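Very roughly, and with dictionary names that are assumptions rather than copies of the real mozilla-tests/config.py, the buildbot-configs side amounts to registering the new slave platform and pointing only ash at it while testing:

  # Sketch only -- these dictionary names are assumptions, not the real
  # buildbot-configs/mozilla-tests/config.py structure.
  PLATFORMS = {"android": {"slave_platforms": ["ubuntu64_vm_mobile"]}}
  BRANCHES = {"ash": {"platforms": {"android": {}}}}

  # register the new slave platform for the android test platform
  PLATFORMS["android"]["slave_platforms"].append("ubuntu64_vm_armv6_large")
  # restrict it to ash while testing
  BRANCHES["ash"]["platforms"]["android"]["slave_platforms"] = ["ubuntu64_vm_armv6_large"]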

This also requires adding the names of the new slaves to slavealloc; there is an example here. To add slaves to slavealloc you can use the dbimport tool as described here.
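Slave names typically follow the pool name plus a zero-padded index; assuming that pattern (the exact names and count are not taken from slavealloc), generating the list to import is trivial:

  # Assumed naming pattern and pool size -- adjust to match the real pool.
  names = ["tst-emulator64-spot-%03d" % i for i in range(1, 101)]
  print("\n".join(names))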

Add new buildbot masters if required

Add new masters to handle the load from the new slaves if required; see https://wiki.mozilla.org/ReleaseEngineering/AWS_Master_Setup. A good example of the changes required is bug 1035863.

Run a reconfig to enable the new buildbot masters and the buildbot-configs changes.

Write patches so watch_pending.cfg will allocate slaves to this pool

At first, I tested this on ash, so the regexp only matched certain tests on ash.
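For illustration, the watch_pending change boils down to a regex that maps pending builder names to the new instance pool; both the structure and the builder-name pattern below are hedged sketches rather than the real file contents:

  # Sketch of the idea only -- the real watch_pending.cfg layout and the real
  # builder names may differ.
  buildermap = {
      # ash only to start with: emulator crashtests/reftests go to the new pool
      r"Android 2\.3 .*Emulator ash .*(crashtest|reftest)": "tst-emulator64-spot",
  }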

Ensure tests run green on this branch.

Enable builders on other relevant branches

Do this by landing patches in buildbot-configs, running a reconfig, and then adjusting the regexp in watch_pending.py.
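Continuing the earlier sketch, widening the pool to other branches is then just a matter of relaxing the branch portion of the regex (again illustrative, not the real pattern):

  # drop the ash-only restriction (pattern is illustrative)
  buildermap = {
      r"Android 2\.3 .*Emulator .*(crashtest|reftest)": "tst-emulator64-spot",
  }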

Adjust size of slave pool if pending counts are high