{{Release Engineering How To|Set up a new AWS slave class}}
=How to Set up a New AWS Slave Class=
We currently run most of our Linux-based tests on AWS, which offers many [http://aws.amazon.com/ec2/instance-types/ instance types]. We use spot instances because they are cheaper than on-demand instances. The default instance type for most of our Linux-based tests is m1.medium. It is low cost but not powerful enough for some of our tests, such as crashtests and reftests on Android emulators, or media or gaia-ui tests on B2G. So how do you add a new slave class so that different tests can run on different instance types for the same platform? For example, this is how the slave platforms are defined for Android 2.3:
<pre>
PLATFORMS['android']['ubuntu64_vm_mobile'] = {
    'name': "Android 2.3 Emulator",
}
PLATFORMS['android']['ubuntu64_vm_large'] = {
    'name': "Android 2.3 Emulator",
}
</pre>
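To make the idea concrete, here is a minimal sketch of how two slave platforms with the same display name could be used to route suites to different slave classes. Only the two platform keys and the display name come from the config above; <code>LARGE_SUITES</code> and <code>slave_platform_for</code> are hypothetical names invented for illustration and are not part of buildbot-configs.

```python
# Illustrative sketch only: LARGE_SUITES and slave_platform_for are
# hypothetical helpers, not real buildbot-configs code.
PLATFORMS = {'android': {}}
PLATFORMS['android']['ubuntu64_vm_mobile'] = {'name': "Android 2.3 Emulator"}
PLATFORMS['android']['ubuntu64_vm_large'] = {'name': "Android 2.3 Emulator"}

# Assumed list of suites that need the more powerful instance type.
LARGE_SUITES = {'crashtest', 'reftest'}


def slave_platform_for(suite):
    """Return the slave platform key a suite would be scheduled on."""
    if suite in LARGE_SUITES:
        return 'ubuntu64_vm_large'
    return 'ubuntu64_vm_mobile'
```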
This example focuses on adding the tst-emulator64-spot slave type, because this is the one I recently added in [https://bugzilla.mozilla.org/show_bug.cgi?id=1034055 bug 1034055: implement c3.xlarge slave class for Linux64 test spot instances]. The repos you'll need to work with are [http://hg.mozilla.org/build/puppet/ puppet], [http://hg.mozilla.org/build/cloud-tools/ cloud-tools] and [http://hg.mozilla.org/build/buildbot-configs/ buildbot-configs]. You'll also need the [http://hg.mozilla.org/build/tools/ tools] repo if you need to enable new buildbot masters as part of this change.
==Create new AMI==

* To do: describe how to create an AMI from scratch. I just reused an existing AMI.

The configs for each AMI are stored in [http://hg.mozilla.org/build/cloud-tools/ cloud-tools]. For the tst-emulator64-spot AMI, I reused the existing [http://hg.mozilla.org/build/cloud-tools/file/42995d055d41/configs/tst-linux64 tst-linux64] AMI id and copied it into the [http://hg.mozilla.org/build/cloud-tools/file/42995d055d41/configs/tst-emulator64 new configs I created]. However, I changed the instance type from m1.medium to c3.xlarge, and also had to tweak the subnets because this instance type is not available in some AWS regions. You'll note that the configs define the AMI in two regions: us-east-1 and us-west-2.
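The copy-and-tweak step can be sketched as follows. This is not the real cloud-tools config format: the field names (<code>ami</code>, <code>instance_type</code>, <code>subnet_ids</code>), the AMI ids, the subnet ids and the <code>derive_config</code> helper are all placeholders invented for illustration. Only the instance types and the two regions come from the text above.

```python
import copy

# Hypothetical stand-in for an existing per-region config. The AMI ids,
# subnet ids and field names are placeholders, not values from cloud-tools.
TST_LINUX64 = {
    "us-east-1": {
        "ami": "ami-00000001",
        "instance_type": "m1.medium",
        "subnet_ids": ["subnet-e1", "subnet-e2", "subnet-e3"],
    },
    "us-west-2": {
        "ami": "ami-00000002",
        "instance_type": "m1.medium",
        "subnet_ids": ["subnet-w1", "subnet-w2"],
    },
}


def derive_config(base, instance_type, usable_subnets):
    """Copy a per-region config, swap the instance type, keep usable subnets."""
    derived = copy.deepcopy(base)  # leave the original config untouched
    for region, settings in derived.items():
        settings["instance_type"] = instance_type
        allowed = usable_subnets.get(region, set())
        settings["subnet_ids"] = [
            subnet for subnet in settings["subnet_ids"] if subnet in allowed
        ]
    return derived


# Assumed: the subnets in which the new instance type can be launched.
USABLE = {"us-east-1": {"subnet-e1", "subnet-e3"}, "us-west-2": {"subnet-w2"}}
TST_EMULATOR64 = derive_config(TST_LINUX64, "c3.xlarge", USABLE)
```

The AMI id is carried over unchanged, which mirrors reusing the existing AMI rather than building one from scratch.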
==Update Cloud Tools with new platform==

I also updated cloud-tools so that this new AMI is listed as a platform. You can see the changes required [http://hg.mozilla.org/build/cloud-tools/rev/2bd91d08ce88 here] and [http://hg.mozilla.org/build/cloud-tools/rev/8ce35c032dd4 here].
==Create golden AMI image==

This config was then added to puppet so that the AMI would be created automatically the next time it ran. We also had to use invtool to add A and PTR records to DNS for the golden image. The cron job that creates the golden images is [http://hg.mozilla.org/build/puppet/file/cdd03eb55435/modules/aws_manager/manifests/cron.pp#l115 stored in Puppet]. You can log in to the [https://wiki.mozilla.org/ReleaseEngineering/How_To/Loan_a_Slave#Environment_Setup aws manager machine] to run the cron jobs manually and speed things up.
==Loan yourself a slave and test on dev-master==

Once you have the AMI image up and running, you can [https://wiki.mozilla.org/ReleaseEngineering/How_To/Loan_a_Slave#Test_machines loan yourself a slave] of that type and try to run tests on it on your dev-master. You'll have to change the config in the loan document to match your new slave type: instead of cloud-tools/configs/tst-linux$arch, it should be cloud-tools/configs/tst-emulator64, to follow my example.
==Write patches to enable the new platform in buildbot and Puppet==

Here is an example of a change that adds [http://hg.mozilla.org/build/puppet/rev/228d5d1b24e5 a new slave class to puppet]. In this example I added two new puppet slave classes even though they map to the same AWS instance type. This is because of duplicate builder issues in buildbot: the platform name (ubuntu64_vm_armv6_large or ubuntu64_vm_large) is included in the builder directory, so without distinct names, crashtests running on both platforms would cause a duplicate builder issue.

Here's an example of the buildbot changes required to [http://hg.mozilla.org/build/buildbot-configs/rev/daf28bc3bdf0 enable this new slave platform on ash].
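The duplicate builder issue mentioned above can be illustrated with a small sketch. The builder directory format used here (<code>branch_platform_test-suite</code>) is an assumption made for illustration, and <code>builder_dir</code> and <code>find_duplicates</code> are hypothetical helpers, not buildbot-configs functions. The point is only that two builders sharing a platform name collapse to the same directory.

```python
from collections import Counter


def builder_dir(branch, slave_platform, suite):
    """Build a directory name; this format is assumed, not taken from buildbot."""
    return "%s_%s_test-%s" % (branch, slave_platform, suite)


def find_duplicates(builders):
    """Return the sorted builder directories that occur more than once."""
    counts = Counter(builder_dir(*builder) for builder in builders)
    return sorted(name for name, count in counts.items() if count > 1)


# Both platforms share one slave platform name: the directories collide.
shared = [
    ("ash", "ubuntu64_vm_large", "crashtest"),
    ("ash", "ubuntu64_vm_large", "crashtest"),
]

# Each platform has its own slave platform name: no collision.
distinct = [
    ("ash", "ubuntu64_vm_armv6_large", "crashtest"),
    ("ash", "ubuntu64_vm_large", "crashtest"),
]
```

With these inputs, <code>find_duplicates(shared)</code> reports one colliding directory, while <code>find_duplicates(distinct)</code> reports none.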
==Update Slavealloc db==

This also requires adding the names of the new slaves (and masters, if applicable) to slavealloc. [https://bug1034055.bugzilla.mozilla.org/attachment.cgi?id=8453275 Here is an example]. To add slaves to slavealloc, you can use the [https://wiki.mozilla.org/ReleaseEngineering/Buildduty/Slave_Management#Adding_a_slave dbimport tool as described here].
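When adding many slaves at once, it can help to generate the list of names rather than type them. The sketch below assumes a naming scheme of a prefix followed by a zero-padded number; that scheme and the <code>slave_names</code> helper are assumptions for illustration, and the input format expected by dbimport is not shown here (see the linked page for that).

```python
def slave_names(prefix, start, count, width=3):
    """Return `count` names such as prefix-001, prefix-002, ... (assumed scheme)."""
    return [
        "{}-{:0{}d}".format(prefix, number, width)
        for number in range(start, start + count)
    ]


# Example: the first three slaves of the new class.
NEW_SLAVES = slave_names("tst-emulator64-spot", 1, 3)
```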
==Add new buildbot masters if required==

Add new masters if they are required to handle the load from the new slaves; see [https://wiki.mozilla.org/ReleaseEngineering/AWS_Master_Setup AWS Master Setup]. A good example of the changes required is [https://bugzilla.mozilla.org/show_bug.cgi?id=1035863 bug 1035863]. As the doc states, lock the master to some slaves and verify that the jobs run green before enabling the new masters in production. Then [https://bug1035863.bugzilla.mozilla.org/attachment.cgi?id=8454449 enable the master in tools] and run a reconfig to enable the new buildbot masters and the buildbot-configs changes.
==Write patches so watch_pending.cfg will allocate slaves to this pool==

At first, [http://hg.mozilla.org/build/cloud-tools/rev/aa9f8f58f2dd I tested this on ash], so the regexp matched only certain tests on ash.

Ensure that tests run green on this branch.
==Enable builders on other relevant branches==

Do this by landing patches in [http://hg.mozilla.org/build/buildbot-configs/rev/ee8c305143ba buildbot-configs], running a reconfig, and then [http://hg.mozilla.org/build/cloud-tools/rev/7043f6e2457f adjusting the regexp in watch_pending.cfg] to allocate slaves to all branches for these tests.
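The two-stage rollout of the regexp (first ash only, then all branches) can be sketched as below. The builder name format, the pool names and the <code>pool_for</code> helper are assumptions for illustration; the real patterns are in the linked cloud-tools revisions and may differ.

```python
import re

# Assumed fallback pool for builders that match no rule.
DEFAULT_POOL = "tst-linux64"

# Stage 1: only crashtests and reftests on the ash branch use the new pool.
ASH_ONLY = [
    (re.compile(r"^Android 2\.3 Emulator ash \S+ test (crashtest|reftest)"),
     "tst-emulator64"),
]

# Stage 2: the branch is no longer fixed, so every branch uses the new pool.
ALL_BRANCHES = [
    (re.compile(r"^Android 2\.3 Emulator \S+ \S+ test (crashtest|reftest)"),
     "tst-emulator64"),
]


def pool_for(builder_name, rules):
    """Return the pool of the first rule matching the builder name."""
    for pattern, pool in rules:
        if pattern.match(builder_name):
            return pool
    return DEFAULT_POOL
```

Keeping the first rule narrow means a mistake in the pattern affects only the test branch, which is why the widening step comes after the tests run green.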
==Adjust size of slave pool if pending counts are high==

Update [https://bug1034055.bugzilla.mozilla.org/attachment.cgi?id=8456351 buildbot-configs/mozilla-tests/production_config.py], add the new slaves to slavealloc, and reconfig.

Close bug!
==Video presentation of this topic==

https://wiki.mozilla.org/ReleaseEngineering/Blackbox_Sessions/08-15-2014