ReleaseEngineering/Win64EC2

From MozillaWiki

Windows on AWS

High-Level Overview

Basic overview of updating the image:


What is an AMI?

An AMI (Amazon Machine Image) is a machine image that serves as a template for creating new instances. When creating an AMI, you need to choose a base AMI on which to build your new one. The base will often be an Amazon-provided AMI such as "Windows_Server-2012-RTM-English-64Bit-Base-2014.02.12". Amazon-provided AMIs are sometimes withdrawn and replaced with a newer image, hence the date in the image name.

How are AMIs configured?

AMI configuration is driven by configuration and script files that are kept under version control in http://hg.mozilla.org/build/cloud-tools/file/default/ami_configs.

Breakdown of ami_configs/tst-win64.json:

{
    "hostname": "tst-win64-ec2-%03d",   <--- hostname pattern
    <snip>...
    "us-west-2": {   <--- AWS region
        "type": "tst-win64",
        "instance_profile_name": "tst-win64",
        "domain": "test.releng.usw2.mozilla.com",
        "ami_desc": "Windows_Server-2012-RTM-English-64Bit-Base-2014.02.12",   <--- AMI descriptive text
        "ami": "ami-bc92f08c",   <--- AMI ID
        "subnet_ids": ["subnet-a4cba8cd", "subnet-aecba8c7", "subnet-be89a2ca", "subnet-d6cba8bf"],
        "security_group_desc": ["default VPC security group", "windows slaves"],
        "security_group_ids": ["sg-d5617cb9", "sg-84beade6"],   <--- AWS firewall rules
        "instance_type": "m1.medium",   <--- virtual hardware type
        "distro": "win2012",
        "user_data_file": "ami_configs/tst-win64.user_data",   <--- PowerShell script filename
        "use_public_ip": true,   <--- yes, so we can route through the external firewall
        "device_map": {   <--- hard disk configuration
            "/dev/sda1": {
                "size": 30,
                "instance_dev": "C:"
            }
        }
    }
}
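
As a rough illustration of how a script might consume this file, the sketch below parses a trimmed copy of the configuration and expands the hostname pattern. The helper names region_settings and hostname_for are hypothetical and are not part of cloud-tools. The JSON literal repeats only values shown above, without the "<---" annotations, which are not valid JSON.

```python
import json

# Trimmed copy of the values shown above; the elided ("<snip>") keys are omitted.
CONFIG_TEXT = """
{
    "hostname": "tst-win64-ec2-%03d",
    "us-west-2": {
        "type": "tst-win64",
        "ami": "ami-bc92f08c",
        "instance_type": "m1.medium",
        "device_map": {"/dev/sda1": {"size": 30, "instance_dev": "C:"}}
    }
}
"""


def region_settings(config, region):
    """Return the per-region block, or raise if the region is not configured."""
    settings = config.get(region)
    if not isinstance(settings, dict):
        raise KeyError("region %r is not present in this config" % region)
    return settings


def hostname_for(config, index):
    """Expand the printf-style hostname pattern for one instance number."""
    return config["hostname"] % index


config = json.loads(CONFIG_TEXT)
usw2 = region_settings(config, "us-west-2")
print(hostname_for(config, 7), usw2["ami"], usw2["instance_type"])
```

Keeping the region lookup in one function means a mistyped region name fails immediately instead of producing a half-configured instance.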

Windows configuration happens via ami_configs/tst-win64.user_data, which is a Windows PowerShell script. The first problem to solve is getting software onto the machine so that it can be installed. Files are stored in the Amazon S3 (Simple Storage Service) bucket 'mozilla-releng-tools' and fetched via this function:

# Fetch something from our S3 bucket; a less verbose alternative to writing
# out the Read-S3Object command. Always puts the file in the current dir.
Function GetFromS3 ($obj) { Read-S3Object -BucketName mozilla-releng-tools -Key $obj -File $obj }

Example usage:

### Install python
GetFromS3 python-2.7.5.msi
Log "Installing python"
Start-Process -Wait -FilePath "python-2.7.5.msi" -ArgumentList "/qn"
Log "Done"

PowerShell code like the above is run by EC2Config, a Windows service provided by Amazon and included in its base AMIs. By default it runs at first boot on every Windows instance you create on Amazon, but it can be triggered to run again as needed. This service gives you the hook you need to make configuration changes. The code is stored in the 'userdata' tag of the machine image (or instance). Our automation takes care of copying the contents of the *.user_data file into this tag and making sure that the EC2Config service runs when needed.

Creating a Python virtualenv for testing

$ ssh buildduty@aws-manager1.srv.releng.scl3.mozilla.com
$ mkdir <username> && cd <username>
$ hg clone http://hg.mozilla.org/build/cloud-tools
$ cd cloud-tools
$ ln -s /builds/aws_manager/secrets .
$ virtualenv venv
$ source venv/bin/activate
$ patch -p1 << EOF
diff --git a/requirements.txt b/requirements.txt
--- a/requirements.txt
+++ b/requirements.txt
@@ -6,7 +6,7 @@ argparse==1.2.1
 boto==2.27.0
 docopt==0.6.1
 ecdsa==0.10
-invtool==0.1.0
+invtool==4.4.0
 iso8601==0.1.10
 paramiko==1.12.0
 pycrypto==2.6.1
EOF
$ pip install --find-links http://puppetagain.pub.build.mozilla.org/data/python/packages/ -r requirements.txt

Create rbt-w64-ec2-XXX AMI

Base image: Windows_Server-2012-RTM-English-64Bit-Base
AMI configuration file: http://hg.mozilla.org/build/cloud-tools/file/default/ami_configs/rbt-win64.json
PowerShell: http://hg.mozilla.org/build/cloud-tools/file/default/ami_configs/rbt-win64.user_data
$ ssh buildduty@aws-manager1.srv.releng.scl3.mozilla.com
$ source /builds/aws_manager/bin/activate
$ cd /builds/aws_manager/cloud-tools/
# XXX - add a way to pass this in via a 'secrets' file
# replace the {password} occurrences with an actual password
$ vim ami_configs/rbt-win64.user_data
$ python scripts/aws_create_win_ami.py -c rbt-win64 -s aws-releng -k secrets/aws-secrets.json -r us-west-2 dev-rbt-w64-ec2-000
# record the new AMI ID displayed in the above command output
$ vim configs/rbt-win64

Note: you'll need to do this for each AWS region in the rbt-win64.json file.
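
The manual password edit in the steps above could in principle be scripted. The following is only a sketch of the idea behind the XXX comment: it reads a password from a secrets mapping and substitutes every {password} placeholder. The function name fill_password and the 'windows_password' key are assumptions made for illustration; neither exists in cloud-tools. Plain str.replace is used rather than str.format because PowerShell scripts are full of literal braces.

```python
def fill_password(user_data_text, secrets):
    """Return user_data_text with every {password} placeholder replaced.

    `secrets` is a dict; the 'windows_password' key is a hypothetical name.
    """
    password = secrets["windows_password"]
    if "{password}" not in user_data_text:
        raise ValueError("no {password} placeholder found in user data")
    return user_data_text.replace("{password}", password)


# Made-up template line, only to show the substitution.
template = 'Function Demo { net user exampleuser "{password}" }'
print(fill_password(template, {"windows_password": "not-a-real-password"}))
```

Raising when no placeholder is present guards against running the script twice on an already edited file.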

Create rbt-w64-ec2-XXX Instance

bug=<bug#>
user=<loan_username>
email="$user@mozilla.com"
slavetype=rbt-w64-ec2
host=$slavetype-$user
ip=`python scripts/free_ips.py -c configs/tst-win64 -r us-west-2 -n1`
invtool A create --ip $ip --fqdn $host.dev.releng.usw2.mozilla.com --private  --description "bug $bug: loaner for $user"
invtool PTR create --ip $ip --target $host.dev.releng.usw2.mozilla.com --private --description "bug $bug: loaner for $user"
sleep 20m
python scripts/aws_create_instance.py -c configs/tst-win64 -r us-west-2 -s aws-releng -k secrets/aws-secrets.json \
 -i instance_data/us-east-1.instance_data_dev.json $host

TODO: the created instance gets a different IP address from the one registered in DNS; this needs to be fixed.

Create tst-w64-ec2-XXX AMI

Base image: Windows_Server-2012-RTM-English-64Bit-Base
AMI configuration file: http://hg.mozilla.org/build/cloud-tools/file/default/ami_configs/tst-win64.json
PowerShell: http://hg.mozilla.org/build/cloud-tools/file/default/ami_configs/tst-win64.user_data
$ ssh buildduty@aws-manager1.srv.releng.scl3.mozilla.com
$ source /builds/aws_manager/bin/activate
$ cd /builds/aws_manager/cloud-tools/
# XXX - add a way to pass this in via a 'secrets' file
# replace the {password} occurrences with an actual password
$ vim ami_configs/tst-win64.user_data
$ python scripts/aws_create_win_ami.py -c tst-win64 -s aws-releng -k secrets/aws-secrets.json -r us-west-2 dev-tst-w64-ec2-000
# record the new AMI ID displayed in the above command output
$ vim configs/tst-win64

Note: you'll need to do this for each AWS region in the tst-win64.json file.
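
One way to avoid missing a region is to derive the list from the config itself. This sketch assumes that region blocks are the top-level keys whose values are JSON objects containing an "ami" entry, as in the breakdown of tst-win64.json earlier on this page; the helper names are hypothetical. It only builds and prints the command lines, using the flags shown above, and does not run anything.

```python
def regions_in(config):
    """Top-level keys whose value looks like a per-region block (assumption:
    a region block is a dict that contains an "ami" entry)."""
    return sorted(
        key for key, value in config.items()
        if isinstance(value, dict) and "ami" in value
    )


def ami_commands(config, config_name, host):
    """Build one aws_create_win_ami.py command line per region."""
    return [
        "python scripts/aws_create_win_ami.py -c %s -s aws-releng "
        "-k secrets/aws-secrets.json -r %s %s" % (config_name, region, host)
        for region in regions_in(config)
    ]


# Stand-in config: the us-east-1 AMI ID here is a placeholder, not a real value.
example = {
    "hostname": "tst-win64-ec2-%03d",
    "us-west-2": {"ami": "ami-bc92f08c"},
    "us-east-1": {"ami": "ami-PLACEHOLDER"},
}
for line in ami_commands(example, "tst-win64", "dev-tst-w64-ec2-000"):
    print(line)
```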


Create tst-w64-ec2-XXX Instance

bug=<bug#>
user=<loan_username>
email="$user@mozilla.com"
slavetype=tst-w64-ec2
host=$slavetype-$user
ip=`python scripts/free_ips.py -c configs/tst-win64 -r us-west-2 -n1`
invtool A create --ip $ip --fqdn $host.dev.releng.usw2.mozilla.com --private  --description "bug $bug: loaner for $user"
invtool PTR create --ip $ip --target $host.dev.releng.usw2.mozilla.com --private --description "bug $bug: loaner for $user"
sleep 20m
python scripts/aws_create_instance.py -c configs/tst-win64 -r us-west-2 -s aws-releng -k secrets/aws-secrets.json \
 -i instance_data/us-east-1.instance_data_dev.json $host
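
The naming used by these steps can be summarised in a small sketch. loaner_records is a hypothetical helper, not part of cloud-tools or invtool; it only assembles the strings that the shell variables above produce. The bug number, user name and IP address in the usage lines are placeholders.

```python
def loaner_records(bug, user, ip, slavetype="tst-w64-ec2",
                   domain="dev.releng.usw2.mozilla.com"):
    """Return the host name, FQDN and DNS description for a loaner."""
    host = "%s-%s" % (slavetype, user)
    return {
        "host": host,
        "fqdn": "%s.%s" % (host, domain),
        "ip": ip,
        "description": "bug %s: loaner for %s" % (bug, user),
    }


# Placeholder inputs; 192.0.2.10 is a documentation-only address.
record = loaner_records(bug=123456, user="jdoe", ip="192.0.2.10")
print(record["fqdn"])
print(record["description"])
```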


Troubleshooting

Test Builds Not Starting

Are any of these instances running?

'win64_vm': dict([('tst-w64-ec2-%03i' % x, {}) for x in range(100)]),

They were originally created in us-east-1 and I (jhopkins) added a us-west-2 configuration. The original instances appear to have been terminated. I will spin up some new ones.
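
For reference, the win64_vm expression above expands to 100 host names, numbered 000 through 099. A quick way to see exactly which names to look for:

```python
# Same expression as above, evaluated on its own.
win64_vm = dict([('tst-w64-ec2-%03i' % x, {}) for x in range(100)])

names = sorted(win64_vm)
print(len(names), names[0], names[-1])
# prints: 100 tst-w64-ec2-000 tst-w64-ec2-099
```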

Not authorized for images: [ami-173d747e]

This error means the AMI no longer exists. You'll need to find a new base AMI (e.g. one with the same name as the last one but an updated date suffix) on which to base your custom AMI. When searching for an updated AMI, be sure to set the filter in the AWS web console to "Public images".

Example: Windows_Server-2012-RTM-English-64Bit-Base-2014.02.12 becomes Windows_Server-2012-RTM-English-64Bit-Base-2014.03.12
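
Because the date is the last component of the name, candidate names copied from the console can be ranked mechanically. A minimal sketch, assuming every name ends in -YYYY.MM.DD as in the example above; newest_ami_name is a hypothetical helper.

```python
import re
from datetime import date

_DATE_SUFFIX = re.compile(r"^(?P<base>.+)-(?P<y>\d{4})\.(?P<m>\d{2})\.(?P<d>\d{2})$")


def newest_ami_name(names, base):
    """Return the name with the latest date suffix among those matching base."""
    dated = []
    for name in names:
        match = _DATE_SUFFIX.match(name)
        if match and match.group("base") == base:
            stamp = date(int(match.group("y")), int(match.group("m")),
                         int(match.group("d")))
            dated.append((stamp, name))
    if not dated:
        raise ValueError("no dated AMI names found for %r" % base)
    return max(dated)[1]


candidates = [
    "Windows_Server-2012-RTM-English-64Bit-Base-2014.02.12",
    "Windows_Server-2012-RTM-English-64Bit-Base-2014.03.12",
]
print(newest_ami_name(candidates, "Windows_Server-2012-RTM-English-64Bit-Base"))
```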

InsufficientFreeAddressesInSubnet

Try us-west-2 instead of us-east-1

$ python scripts/aws_create_win_ami.py -c tst-win64 -s aws-releng -k secrets/aws-secrets.json -r us-west-2 dev-tst-w64-ec2-000

Takes more than 30-40 minutes to create the AMI

RDP into the instance and look for problems.

Other Howtos

Log in to the AWS Console

https://mozilla-releng.signin.aws.amazon.com/console

Make sure you choose the correct region (US-East-1, US-West-2). You'll need to have authentication set up already. Talk to RelEng otherwise.


Access Instance Log Files

See C:\Program Files\EC2Config\logs on the instance.

Verify Graphics Capabilities

"To verify that things are successful, have the "real" cltbld user also start firefox.exe as part of its login/startup script.  When you connect to it via RDP after startup (to check), Firefox should be running, and all the HW accel bits in about:support should show as enabled (D2D/DWrite should be true, GPU Accel Windows should be D3D10 1/1, WebGL should have a renderer; the Adapter should be RDPUDD Chained DD or something like that)."


  • When connected via Microsoft RDP from my MacBook Pro
Adapter Description: RDPUDD Chained DD
Adapter Drivers: RDPUDD
Adapter RAM: Unknown
Device ID: 0xfefe
Direct2D Enabled: true
DirectWrite Enabled: true (6.2.9200.16581)
Driver Date: 01-01-1970
Driver Version: 6.2.9200.16434
GPU #2 Active: false
GPU Accelerated Windows: 1/1 Direct3D 10
Vendor ID: 0x1414
WebGL Renderer: Google Inc. -- ANGLE (Microsoft Basic Render Driver Direct3D9Ex vs_3_0 ps_3_0)
windowLayerManagerRemote: false
AzureCanvasBackend: direct2d
AzureContentBackend: direct2d
AzureFallbackCanvasBackend: cairo
AzureSkiaAccelerated: 0
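
The expectations in the quoted advice can be checked against a pasted about:support listing. This is a sketch only: check_graphics is a hypothetical helper, and the pass criteria are one reading of the quote (Direct2D and DirectWrite true, GPU Accelerated Windows reporting Direct3D 10, a non-empty WebGL renderer).

```python
def parse_support(text):
    """Turn 'Key: Value' lines into a dict; lines without ': ' are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def check_graphics(fields):
    """Return the list of expectations that are not met (empty means OK)."""
    problems = []
    if fields.get("Direct2D Enabled") != "true":
        problems.append("Direct2D is not enabled")
    if not fields.get("DirectWrite Enabled", "").startswith("true"):
        problems.append("DirectWrite is not enabled")
    if "Direct3D 10" not in fields.get("GPU Accelerated Windows", ""):
        problems.append("windows are not accelerated with Direct3D 10")
    if not fields.get("WebGL Renderer"):
        problems.append("no WebGL renderer reported")
    return problems


sample = """Direct2D Enabled: true
DirectWrite Enabled: true (6.2.9200.16581)
GPU Accelerated Windows: 1/1 Direct3D 10
WebGL Renderer: Google Inc. -- ANGLE (Microsoft Basic Render Driver Direct3D9Ex vs_3_0 ps_3_0)"""
print(check_graphics(parse_support(sample)) or "graphics look OK")
```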


RDP to Instance

  • Select the instance in the Amazon EC2 console
  • Click "connect" at the top of the screen
  • Get the password (paste the aws-releng SSH private key when prompted)