Auto-tools/Projects/EC2Builder

EC2 Build Server

The EC2 Build Server aims to build arbitrary mozilla-central changesets quickly using Amazon's Elastic Compute Cloud (EC2) service. Tools that could use the build server include mozregression and mozremotebuilder.

Cost Analysis

Preliminary tests demonstrate that build times are as follows:

EC2 Micro Linux Instance / Linux Firefox Build: > 12 hrs
EC2 Micro Windows Instance / Windows Firefox Build: > 12 hrs

EC2 Small Linux Instance / Linux Firefox Build: ??? hrs
EC2 Small Windows Instance / Windows Firefox Build: ??? hrs

EC2 Medium High-CPU Instance / Linux Firefox Build (1 Core): 1.2 hrs
EC2 Medium High-CPU Instance / Windows Firefox Build (1 Core): 2.75 hrs

Instance costs are as follows:

Micro Linux: $0.02 / hr
Micro Windows: $0.03 / hr

Small Linux: $0.085 / hr
Small Windows: $0.12 / hr

Medium High-CPU Linux: $0.17 / hr
Medium High-CPU Windows: $0.29 / hr

Worst-case (24-hour, always-on) cost, assuming a limit of 5 instances running at once: $1162.35 / month

Storage and bandwidth costs haven't yet been calculated.
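
For a rough sense of the compute cost of a single build, the Medium High-CPU build times and hourly rates listed above can be multiplied together. The sketch below does only that; the function name and the optional rounding up to whole hours are assumptions added for illustration, and storage and bandwidth are not included.

```python
import math

# Figures copied from the Medium High-CPU rows above.
BUILD_HOURS = {"linux": 1.2, "windows": 2.75}
HOURLY_RATE_USD = {"linux": 0.17, "windows": 0.29}


def cost_per_build(platform, round_up_hours=False):
    """Return the compute-only cost in USD of one build on `platform`.

    round_up_hours is an assumption: it models billing by whole
    instance-hours instead of by the exact build time.
    """
    hours = BUILD_HOURS[platform]
    if round_up_hours:
        hours = math.ceil(hours)
    return hours * HOURLY_RATE_USD[platform]


if __name__ == "__main__":
    for platform in ("linux", "windows"):
        exact = cost_per_build(platform)
        rounded = cost_per_build(platform, round_up_hours=True)
        print(f"{platform}: ${exact:.4f} exact, ${rounded:.2f} billed by whole hours")
```

With these figures a Linux build comes to about $0.20 of compute time and a Windows build to about $0.80, before any idle time, storage, or bandwidth.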

The cost of EC2 was found to be untenable at this time, and this project has been discontinued. Please see mozremotebuilder for the alternative that is currently being worked on.

About EC2

EC2 is Amazon's cloud computing service. It is most commonly used for site hosting, and a separate service, Auto Scaling (commonly paired with Elastic Load Balancing), provides hooks for spawning new servers under load.

For example:

CPU > x% would trigger a new virtual server
CPU < x% would turn off certain virtual servers

Since we're using EC2 to do builds, we considered load-based scaling of this kind (builds usually push CPU usage close to 100%), but it seemed more effective to spin up one server per build and queue excess jobs, limiting active servers to some number, such as 5, for cost reasons. Amazon provides a queuing service (Simple Queue Service, SQS) as well.

EBS-backed Instances vs S3-backed Instances

When AWS first came out, EC2 only supported S3-backed instances. This means that EC2 instances have no persistent state: they simply take a frozen virtual image called an AMI, boot it up, and mount a "drive" whose contents are stored in Amazon's Simple Storage Service (S3). Mounting the S3 drive can take about 5 minutes (in reality I think it's even longer than that) because the image has to be moved from S3 storage to the EC2 instance's local storage. There is also a 10 GB size limit on the image. From what I've read, this is impractical for Windows under most circumstances.

Relatively recently, EBS (Elastic Block Store) has also become an option. Instead of mounting a drive stored in S3, you can store the AMI in EBS. This allows faster boot-up, as well as the option to "stop" an instance and preserve its state. There is also no size limit on the drive (or rather, the limit is high enough that we don't really care). However, it is more expensive: on top of the hourly running cost and the AMI storage cost, we pay an additional storage cost for the EBS volume. We do save a bit of money on AMI storage: we're only charged for the storage used by our own modifications to the AMI, whereas an S3-backed instance is charged for storing the whole AMI.

It wasn't entirely clear how much difference this makes in price, but in performance EBS is clearly superior, and AMI creation and everything related to it is significantly easier because Amazon provides GUI and snapshot tools for it. As a result, the AMIs prepared so far are EBS-backed rather than S3-backed.

Preliminary Design

One server remains always on (probably an in-house box of some sort). It holds the keypair necessary to access Amazon. A Python script listens to Mozilla Pulse messages and simultaneously listens for requests on a socket port. A client program sends a changeset ID to the socket server, and the server in turn either spins up an instance (using synchronization primitives to keep track of how many resources are in use) or queues the job if too many instances are already running. When a job completes, the server downloads the built binary from the EC2 instance, shuts down the instance (freeing the resource, unless another job is queued, in which case that job takes over the resource and runs), and serves the binary via HTTP. It publishes the download URL in a Pulse message, and the client program knows its build is complete when that message comes through. Binaries older than 3 days are deleted by a cron job.
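
The resource tracking and job queuing described above might look roughly like the following. This is a minimal sketch under stated assumptions, not the project's code: BuildDispatcher, start_instance, submit, and job_finished are hypothetical names, and start_instance stands in for whatever actually launches an EC2 instance for a changeset.

```python
import threading
from collections import deque


class BuildDispatcher:
    """Run at most max_instances builds at once; queue the rest in FIFO order."""

    def __init__(self, start_instance, max_instances=5):
        # start_instance is a hypothetical callback that launches one
        # build server for the given changeset.
        self._start_instance = start_instance
        self._max_instances = max_instances
        self._active = 0
        self._pending = deque()
        self._lock = threading.Lock()

    def submit(self, changeset):
        """Start a build if a slot is free, otherwise queue it.

        Returns True if the build was started immediately.
        """
        with self._lock:
            if self._active >= self._max_instances:
                self._pending.append(changeset)
                return False
            self._active += 1
        self._start_instance(changeset)
        return True

    def job_finished(self):
        """Release a slot, or hand it straight to the next queued job.

        Returns the changeset that was started, or None if the queue was empty.
        """
        with self._lock:
            if not self._pending:
                self._active -= 1
                return None
            next_changeset = self._pending.popleft()
        self._start_instance(next_changeset)
        return next_changeset
```

The lock guards only the counter and the queue; the (potentially slow) instance launch happens outside it so that new requests are not blocked while a server is starting.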

[Image: fg0zN.gif]
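
The three-day cleanup mentioned in the design could be a small script invoked from cron. The sketch below is one possible shape for it, assuming the served binaries sit as plain files in a single directory and that file modification time is an acceptable measure of age; the function name and parameters are hypothetical.

```python
import os
import time

MAX_AGE_SECONDS = 3 * 24 * 60 * 60  # three days, per the design above


def delete_old_binaries(directory, now=None, max_age=MAX_AGE_SECONDS):
    """Delete regular files in `directory` last modified more than max_age seconds ago.

    Returns the sorted list of file names that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # Skip subdirectories and anything that is not a regular file.
        if os.path.isfile(path) and now - os.path.getmtime(path) > max_age:
            os.remove(path)
            removed.append(name)
    return sorted(removed)
```

Passing `now` explicitly keeps the function easy to test; a cron entry would simply call it with the directory the HTTP server exposes.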

Side Project

Create a wrapper that pushes to mozillapulse

Other ideas

Add a parameter that makes build servers clone try instead of mozilla-central, so that they can build try requests.

Proposed Usage

Build a single changeset directly with the server (an API should also be exposed so that other scripts can easily integrate cloudbuilder's functionality):

mozcloudbuilder -s cloudbuilder.server.somesite.org -p 9999 --changeset=xxxxxxxxxx   << (returns url to built binary)
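
The command line above could be parsed along these lines. This is only a sketch of the proposed interface: the short flags and --changeset come from the example, while the long names --server and --port, the default port, and the build_parser helper are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Return an argument parser for the proposed mozcloudbuilder command line."""
    parser = argparse.ArgumentParser(
        prog="mozcloudbuilder",
        description="Request a build of one changeset from a cloudbuilder server.",
    )
    # -s and -p appear in the proposed usage; the long option names are assumed.
    parser.add_argument("-s", "--server", required=True,
                        help="host name of the cloudbuilder server")
    parser.add_argument("-p", "--port", type=int, default=9999,
                        help="socket port the server listens on (default assumed)")
    parser.add_argument("--changeset", required=True,
                        help="changeset to build")
    return parser
```

A script integrating the builder could call `build_parser().parse_args()` and pass the resulting server, port, and changeset to whatever client function talks to the socket server.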

Bisect, build, and run interactively via changelog:

mozremotebuilder -g 2011-06-15 -b 2011-06-18


Contact

ateam channel (harth, ctalbert, samliu), or sliu@mozilla.org