ReleaseEngineering/Win64EC2
Windows on AWS
High-Level Overview
Basic overview of updating the image:
- start with a bare AMI
- the process is managed by a script called aws_create_win_ami.py
- a PowerShell script runs via the EC2Config service (see http://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/UsingConfig_WinAMI.html)
- the path to the PowerShell script is specified in the instance's userdata
- the instance shuts down when the PowerShell script finishes
- the management script waits for that, then clears the userdata and creates a new AMI
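The "wait, then continue" step in the list above can be sketched as a polling loop with a timeout. This is only an illustration and not the code in aws_create_win_ami.py: the function name `wait_for_state`, its parameters, and the 40 minute default (borrowed from the Troubleshooting section) are all assumptions, and the instance is simulated by a plain callable rather than a real EC2 API call.

```python
import time


def wait_for_state(get_state, target, timeout_s=40 * 60, interval_s=30,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it returns target; raise TimeoutError otherwise.

    get_state is any zero-argument callable (hypothetical here; in practice
    it would ask EC2 for the instance state).
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state == target:
            return state
        if clock() >= deadline:
            raise TimeoutError("still %r after %d seconds" % (state, timeout_s))
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated instance: reports "running" twice, then "stopped".
    states = iter(["running", "running", "stopped"])
    result = wait_for_state(lambda: next(states), "stopped",
                            sleep=lambda seconds: None)
    print(result)
```

Passing `sleep` and `clock` in as parameters keeps the loop testable without actually waiting.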
What is an AMI image?
An AMI (Amazon Machine Image) is a machine image that serves as a template for creating new instances. When creating an AMI, you need to choose a base AMI to build on. The base will often be an Amazon-provided AMI such as "Windows_Server-2012-RTM-English-64Bit-Base-2014.02.12". Amazon-provided AMIs are sometimes withdrawn and replaced with a newer image, hence the date in the image name.
How are AMIs configured?
AMI configuration is driven by configuration and script files that are kept under version control in http://hg.mozilla.org/build/cloud-tools/file/default/ami_configs.
Breakdown of ami_configs/tst-win64.json:
{
    "hostname": "tst-win64-ec2-%03d",    <--- hostname pattern
    <snip>...
    "us-west-2": {    <--- AWS region
        "type": "tst-win64",
        "instance_profile_name": "tst-win64",
        "domain": "test.releng.usw2.mozilla.com",
        "ami_desc": "Windows_Server-2012-RTM-English-64Bit-Base-2014.02.12",    <--- AMI descriptive text
        "ami": "ami-bc92f08c",    <--- AMI ID
        "subnet_ids": ["subnet-a4cba8cd", "subnet-aecba8c7", "subnet-be89a2ca", "subnet-d6cba8bf"],
        "security_group_desc": ["default VPC security group", "windows slaves"],
        "security_group_ids": ["sg-d5617cb9", "sg-84beade6"],    <--- AWS firewall rules
        "instance_type": "m1.medium",    <--- virtual hardware type
        "distro": "win2012",
        "user_data_file": "ami_configs/tst-win64.user_data",    <--- PowerShell script filename
        "use_public_ip": true,    <--- yes, so we can route through the external firewall
        "device_map": {    <--- hard disk configuration
            "/dev/sda1": {
                "size": 30,
                "instance_dev": "C:"
            }
        }
    }
}
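To make the layout of the configuration file concrete, here is a minimal sketch of reading it from Python. The JSON text is a trimmed copy of the fields shown above (without the `<---` annotations, which are not valid JSON), and the helper names `region_settings` and `hostname_for` are made up for this example; they are not functions from cloud-tools.

```python
import json

# Trimmed-down copy of the fields shown above; the real file has more keys.
CONFIG_TEXT = """
{
  "hostname": "tst-win64-ec2-%03d",
  "us-west-2": {
    "ami": "ami-bc92f08c",
    "instance_type": "m1.medium",
    "subnet_ids": ["subnet-a4cba8cd", "subnet-aecba8c7"],
    "device_map": {"/dev/sda1": {"size": 30, "instance_dev": "C:"}}
  }
}
"""


def region_settings(config, region):
    """Return the per-region block, failing loudly for unknown regions."""
    if region not in config:
        raise KeyError("no settings for region %r in config" % region)
    return config[region]


def hostname_for(config, index):
    """Expand the printf-style hostname pattern for one instance number."""
    return config["hostname"] % index


config = json.loads(CONFIG_TEXT)
west = region_settings(config, "us-west-2")
print(hostname_for(config, 7))  # tst-win64-ec2-007
print(west["ami"], west["instance_type"])
```

The region name is a top-level key, which is why the same file has to be updated once per region whenever the AMI ID changes.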
Windows configuration happens via ami_configs/tst-win64.user_data, which is a Windows PowerShell script. The first step is getting software onto the machine so that it can be installed. Files are stored in the Amazon S3 (Simple Storage Service) bucket 'mozilla-releng-tools' and fetched via this function:
# Fetch something from our S3 bucket; less verbose version than writing
# out the Read-S3Object command. Always puts in current dir.
Function GetFromS3 ($obj) {
    Read-S3Object -BucketName mozilla-releng-tools -Key $obj -File $obj
}
Example usage:
### Install python
GetFromS3 python-2.7.5.msi
Log "Installing python"
Start-Process -Wait -FilePath "python-2.7.5.msi" -ArgumentList "/qn"
Log "Done"
PowerShell code such as the above is run by a Windows service called EC2Config, which Amazon provides as part of its base AMIs. It runs by default at first boot (but can be triggered to run as needed) on every Windows instance you create on Amazon. This service gives you the 'hook' you need to make your configuration changes. The code is stored in the 'userdata' tag of the machine image (or instance). Our automation takes care of copying the contents of the *.user_data file into this tag and making sure that the EC2Config service runs when it is needed.
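EC2Config only executes PowerShell user data that is enclosed in `<powershell>` tags, and EC2 limits raw user data to 16 KB. The sketch below shows one way to prepare a script accordingly. Whether the *.user_data files already contain the tags is not stated here, so the function leaves already-wrapped text alone; the names `wrap_powershell` and `check_size` are hypothetical and not part of cloud-tools.

```python
MAX_USER_DATA_BYTES = 16 * 1024  # EC2 limit on raw (pre-base64) user data


def wrap_powershell(script):
    """Enclose a PowerShell script in the tags EC2Config looks for."""
    if "<powershell>" in script:
        return script  # assume it is already wrapped; leave untouched
    return "<powershell>\n%s\n</powershell>" % script.strip()


def check_size(user_data):
    """Return the encoded size in bytes, or raise if it exceeds the limit."""
    size = len(user_data.encode("utf-8"))
    if size > MAX_USER_DATA_BYTES:
        raise ValueError("user data is %d bytes; limit is %d"
                         % (size, MAX_USER_DATA_BYTES))
    return size


user_data = wrap_powershell('Log "Installing python"')
print(user_data)
print(check_size(user_data))
```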
Creating a Python virtualenv for testing
$ ssh buildduty@aws-manager1.srv.releng.scl3.mozilla.com
$ mkdir <username> && cd <username>
$ hg clone http://hg.mozilla.org/build/cloud-tools
$ cd cloud-tools
$ ln -s /builds/aws_manager/secrets .
$ virtualenv venv
$ source venv/bin/activate
$ patch -p1 << EOF
diff --git a/requirements.txt b/requirements.txt
--- a/requirements.txt
+++ b/requirements.txt
@@ -6,7 +6,7 @@ argparse==1.2.1
 boto==2.27.0
 docopt==0.6.1
 ecdsa==0.10
-invtool==0.1.0
+invtool==4.4.0
 iso8601==0.1.10
 paramiko==1.12.0
 pycrypto==2.6.1
EOF
$ pip install --find-links http://puppetagain.pub.build.mozilla.org/data/python/packages/ -r requirements.txt
Create rbt-w64-ec2-XXX AMI
Base image: Windows_Server-2012-RTM-English-64Bit-Base
AMI configuration file: http://hg.mozilla.org/build/cloud-tools/file/default/ami_configs/rbt-win64.json
PowerShell: http://hg.mozilla.org/build/cloud-tools/file/default/ami_configs/rbt-win64.user_data
$ ssh buildduty@aws-manager1.srv.releng.scl3.mozilla.com
$ source /builds/aws_manager/bin/activate
$ cd /builds/aws_manager/cloud-tools/
# XXX - add a way to pass this in via a 'secrets' file
# replace the {password} occurrences with an actual password
$ vim ami_configs/rbt-win64.user_data
$ python scripts/aws_create_win_ami.py -c rbt-win64 -s aws-releng -k secrets/aws-secrets.json -r us-west-2 dev-rbt-w64-ec2-000
# record the new AMI ID displayed in the above command output
$ vim configs/rbt-win64
Note: you'll need to do this for each AWS region in the rbt-win64.json file.
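The manual "replace the {password} occurrences" step could in principle be done in memory, so the real password never lands in the version-controlled file. The following is a rough sketch of that idea and not an existing cloud-tools feature; `fill_password` and the sample template text are invented for illustration.

```python
def fill_password(template, password):
    """Replace every {password} placeholder in a user_data template.

    Returns the filled text and the number of replacements, and refuses to
    continue if the template has no placeholder at all.
    """
    placeholder = "{password}"
    count = template.count(placeholder)
    if count == 0:
        raise ValueError("template contains no {password} placeholder")
    return template.replace(placeholder, password), count


# Hypothetical template fragment; the real user_data file differs.
template = 'Log "Setting password"\nSetPassword "{password}"\n'
filled, replaced = fill_password(template, "example-only")
print(replaced)
print(filled)
```

Plain `str.replace` is used instead of `str.format` because PowerShell scripts are full of literal braces, which `str.format` would try to interpret.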
Create rbt-w64-ec2-XXX Instance
bug=<bug#>
user=<loan_username>
email="$user@mozilla.com"
slavetype=rbt-w64-ec2
host=$slavetype-$user
ip=`python scripts/free_ips.py -c configs/tst-win64 -r us-west-2 -n1`
invtool A create --ip $ip --fqdn $host.dev.releng.usw2.mozilla.com --private --description "bug $bug: loaner for $user"
invtool PTR create --ip $ip --target $host.dev.releng.usw2.mozilla.com --private --description "bug $bug: loaner for $user"
sleep 20m
python scripts/aws_create_instance.py -c configs/tst-win64 -r us-west-2 -s aws-releng -k secrets/aws-secrets.json \
    -i instance_data/us-east-1.instance_data_dev.json $host
TODO: the instance created uses a different IP address than what's in DNS - this needs to be fixed.
Create tst-w64-ec2-XXX AMI
Base image: Windows_Server-2012-RTM-English-64Bit-Base
AMI configuration file: http://hg.mozilla.org/build/cloud-tools/file/default/ami_configs/tst-win64.json
PowerShell: http://hg.mozilla.org/build/cloud-tools/file/default/ami_configs/tst-win64.user_data
$ ssh buildduty@aws-manager1.srv.releng.scl3.mozilla.com
$ source /builds/aws_manager/bin/activate
$ cd /builds/aws_manager/cloud-tools/
# XXX - add a way to pass this in via a 'secrets' file
# replace the {password} occurrences with an actual password
$ vim ami_configs/tst-win64.user_data
$ python scripts/aws_create_win_ami.py -c tst-win64 -s aws-releng -k secrets/aws-secrets.json -r us-west-2 dev-tst-w64-ec2-000
# record the new AMI ID displayed in the above command output
$ vim configs/tst-win64
Note: you'll need to do this for each AWS region in the tst-win64.json file.
Create tst-w64-ec2-XXX Instance
bug=<bug#>
user=<loan_username>
email="$user@mozilla.com"
slavetype=tst-w64-ec2
host=$slavetype-$user
ip=`python scripts/free_ips.py -c configs/tst-win64 -r us-west-2 -n1`
invtool A create --ip $ip --fqdn $host.dev.releng.usw2.mozilla.com --private --description "bug $bug: loaner for $user"
invtool PTR create --ip $ip --target $host.dev.releng.usw2.mozilla.com --private --description "bug $bug: loaner for $user"
sleep 20m
python scripts/aws_create_instance.py -c configs/tst-win64 -r us-west-2 -s aws-releng -k secrets/aws-secrets.json \
    -i instance_data/us-east-1.instance_data_dev.json $host
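The shell variables above combine into a host name, an FQDN and two DNS records. A small sketch of the same string building in Python may make the relationships easier to see. It only constructs the command lines (using the invtool flags that appear above) and does not run anything; `loaner_commands` and the sample bug number, user and IP address are placeholders invented for this example.

```python
import shlex


def loaner_commands(bug, user, ip, slavetype="tst-w64-ec2",
                    domain="dev.releng.usw2.mozilla.com"):
    """Build the host name and the two invtool command lines for a loaner."""
    host = "%s-%s" % (slavetype, user)
    fqdn = "%s.%s" % (host, domain)
    description = "bug %s: loaner for %s" % (bug, user)
    a_record = ["invtool", "A", "create", "--ip", ip, "--fqdn", fqdn,
                "--private", "--description", description]
    ptr_record = ["invtool", "PTR", "create", "--ip", ip, "--target", fqdn,
                  "--private", "--description", description]
    return host, [a_record, ptr_record]


# Placeholder values only; use the real bug number, user and free IP.
host, commands = loaner_commands("12345", "jdoe", "10.0.0.1")
print(host)
for argv in commands:
    # shlex.quote keeps the description a single shell argument.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Building the commands as argument lists avoids quoting mistakes in the description, which contains spaces and a colon.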
Troubleshooting
Test Builds Not Starting
Are any of these instances running?
'win64_vm': dict([('tst-w64-ec2-%03i' % x, {}) for x in range(100)]),
They were originally created in us-east-1 and I (jhopkins) added a us-west-2 configuration. The original instances appear to have been terminated. I will spin up some new ones.
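For reference, the dictionary expression quoted above expands to one hundred host names, each mapped to an empty dict. A quick way to see which names to look for in the EC2 console:

```python
# Same expression as quoted above, evaluated on its own.
win64_vm = dict([('tst-w64-ec2-%03i' % x, {}) for x in range(100)])

names = sorted(win64_vm)
print(len(names))   # 100
print(names[0])     # tst-w64-ec2-000
print(names[-1])    # tst-w64-ec2-099
```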
Not authorized for images: [ami-173d747e]
This error means the AMI no longer exists. You'll need to find a new base AMI (e.g. one with the same name as the last one but with an updated date at the end) to build your custom AMI on. When searching for an updated AMI, be sure to set the filter in the AWS web console to "Public images".
Example: Windows_Server-2012-RTM-English-64Bit-Base-2014.02.12 becomes Windows_Server-2012-RTM-English-64Bit-Base-2014.03.12
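If you have a list of candidate image names (for example, copied from the console), picking the replacement comes down to comparing the date suffix. The sketch below assumes names always end in `-YYYY.MM.DD`, as in the example above; `split_ami_name` and `newest_ami` are hypothetical helpers, not part of cloud-tools.

```python
import re
from datetime import date

# Assumed naming scheme: "<prefix>-YYYY.MM.DD"
_NAME_RE = re.compile(r"^(?P<prefix>.+)-(?P<y>\d{4})\.(?P<m>\d{2})\.(?P<d>\d{2})$")


def split_ami_name(name):
    """Split an AMI name into (prefix, release date)."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError("no date suffix in %r" % name)
    released = date(int(match.group("y")), int(match.group("m")),
                    int(match.group("d")))
    return match.group("prefix"), released


def newest_ami(names, prefix):
    """Return the most recently dated name with the given prefix, or None."""
    dated = []
    for name in names:
        try:
            found_prefix, released = split_ami_name(name)
        except ValueError:
            continue  # skip names that do not follow the scheme
        if found_prefix == prefix:
            dated.append((released, name))
    return max(dated)[1] if dated else None


candidates = [
    "Windows_Server-2012-RTM-English-64Bit-Base-2014.02.12",
    "Windows_Server-2012-RTM-English-64Bit-Base-2014.03.12",
]
print(newest_ami(candidates, "Windows_Server-2012-RTM-English-64Bit-Base"))
```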
InsufficientFreeAddressesInSubnet
Try us-west-2 instead of us-east-1
$ python aws_create_win_ami.py -c tst-win64 -s aws-releng -k secrets/aws-secrets.json -r us-west-2 dev-tst-w64-ec2-000
Takes more than 30-40 minutes to create the AMI
RDP in to the instance and investigate for problems.
Other Howtos
Log in to the AWS Console
https://mozilla-releng.signin.aws.amazon.com/console
Make sure you choose the correct region (US-East-1, US-West-2). You'll need to have authentication set up already. Talk to RelEng otherwise.
Access Instance Log Files
See C:\Program Files\EC2Config\logs on the instance.
Verify Graphics Capabilities
- Verify that graphics acceleration bits are enabled
- type about:support in firefox
- https://bugzilla.mozilla.org/show_bug.cgi?id=901051#c1
"To verify that things are successful, have the "real" cltbld user also start firefox.exe as part of its login/startup script. When you connect to it via RDP after startup (to check), Firefox should be running, and all the HW accel bits in about:support should show as enabled (D2D/DWrite should be true, GPU Accel Windows should be D3D10 1/1, WebGL should have a renderer; the Adapter should be RDPUDD Chained DD or something like that)."
- When connected via Microsoft RDP from my MacBook Pro:
Adapter Description: RDPUDD Chained DD
Adapter Drivers: RDPUDD
Adapter RAM: Unknown
Device ID: 0xfefe
Direct2D Enabled: true
DirectWrite Enabled: true (6.2.9200.16581)
Driver Date: 01-01-1970
Driver Version: 6.2.9200.16434
GPU #2 Active: false
GPU Accelerated Windows: 1/1 Direct3D 10
Vendor ID: 0x1414
WebGL Renderer: Google Inc. -- ANGLE (Microsoft Basic Render Driver Direct3D9Ex vs_3_0 ps_3_0)
windowLayerManagerRemote: false
AzureCanvasBackend: direct2d
AzureContentBackend: direct2d
AzureFallbackCanvasBackend: cairo
AzureSkiaAccelerated: 0
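The checks described in the quoted bug comment can be applied mechanically to text copied out of about:support. The following is a sketch under the assumption that the text has one "Key: value" pair per line; the expected values come from the quote above, and the function names are invented for this example.

```python
def parse_support_text(text):
    """Turn 'Key: value' lines into a dict, ignoring lines without a colon."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def graphics_problems(info):
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    if info.get("Direct2D Enabled") != "true":
        problems.append("Direct2D is not enabled")
    if not info.get("DirectWrite Enabled", "").startswith("true"):
        problems.append("DirectWrite is not enabled")
    windows = info.get("GPU Accelerated Windows", "")
    if not windows.startswith("1/1") or "Direct3D 10" not in windows:
        problems.append("GPU Accelerated Windows is not 1/1 Direct3D 10")
    if not info.get("WebGL Renderer"):
        problems.append("WebGL has no renderer")
    # The quote only says the adapter is "RDPUDD Chained DD or something
    # like that", so this check is deliberately loose.
    if not info.get("Adapter Description", "").startswith("RDPUDD"):
        problems.append("unexpected adapter description")
    return problems


SAMPLE = """\
Adapter Description: RDPUDD Chained DD
Direct2D Enabled: true
DirectWrite Enabled: true (6.2.9200.16581)
GPU Accelerated Windows: 1/1 Direct3D 10
WebGL Renderer: Google Inc. -- ANGLE (Microsoft Basic Render Driver Direct3D9Ex vs_3_0 ps_3_0)
"""

print(graphics_problems(parse_support_text(SAMPLE)))
```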
RDP to Instance
- Select the instance in the Amazon EC2 console
- Click "connect" at the top of the screen
- Get the password (paste the aws-releng SSH private key when prompted)