ReleaseEngineering/Applications/Proxxy

== Overview  ==


Proxxy is a basic HTTP cache used in each data center to reduce network transfers.
It's essentially an nginx instance pre-configured to work as a caching reverse proxy for whitelisted backend servers.


[https://github.com/mozilla/build-proxxy Source code is available on GitHub].


Clients request files explicitly from the proxxy rather than relying on transparent network proxies or HTTP_PROXY environment settings.
 
Since the proxxy instances can handle multiple endpoints, we prepend the hostname of the original URL to the proxxy URL.
For example, to fetch http://ftp.mozilla.org/foo/bar, the client would first try http://ftp.mozilla.org.proxxy.srv.releng.use1.mozilla.com/foo/bar.
 
Much of this logic is handled by mozharness' proxxy mixin.
If the file retrieval fails via proxxy, a fallback mechanism requests the file directly from the origin server.
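For illustration, here is a minimal client-side sketch of this scheme; the real logic lives in the mozharness proxxy mixin, and the hostname and path are just the example above:

# Build the proxxy URL by prepending the origin hostname to the proxxy domain,
# try the cache first, and fall back to the origin server if the cache fails.
ORIGIN_HOST="ftp.mozilla.org"
FILE_PATH="/foo/bar"
PROXXY_DOMAIN="proxxy.srv.releng.use1.mozilla.com"
curl -fsSL -O "http://${ORIGIN_HOST}.${PROXXY_DOMAIN}${FILE_PATH}" \
  || curl -fsSL -O "http://${ORIGIN_HOST}${FILE_PATH}"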


The reasons we chose to have such a setup, rather than a traditional proxy setup, include:
* (main reason) explicit is better than implicit: from the URL we can see which cache we are hitting
* transparent proxies are hard to debug, and it is hard to see what they are doing
* using HTTP_PROXY or other environment variables may not be obvious in logging
* with traditional proxies it can be difficult to switch to different backends, or to offer multiple proxy instances


== Development ==
 
In the development environment, proxxy runs inside a Vagrant VM.
The VM is provisioned using Ansible.
 
=== Requirements ===
 
* [http://docs.ansible.com/ Ansible 1.6+]
 
=== Workflow ===
 
Add the following lines to your `/etc/hosts`:
 
# proxxy
10.0.31.2      proxxy.dev
10.0.31.2      ftp.mozilla.org.proxxy.dev
 
Run `vagrant up`.
This will start a fresh Ubuntu 14.04 VM, provision it with nginx and generate proxxy config.
 
The main proxxy config template is here: `ansible/roles/proxxy/templates/nginx-vhosts.conf.j2`
 
The variables used in this template are here: `ansible/group_vars/all`
 
When you change the template or the variables, you should regenerate the nginx config to test the changes.
You can do this by running `vagrant provision`.
 
Visit http://ftp.mozilla.org.proxxy.dev to check that proxxy works.
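You can also check it from the command line, for example (assuming the `/etc/hosts` entries above are in place):

# fetch the response headers through the dev proxxy; the X-Proxxy header
# (see the production checks below) reports cache hit / miss status
curl -I http://ftp.mozilla.org.proxxy.dev/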
 
== Production (AWS) ==
 
The production environment in AWS uses EC2 instances launched from pre-configured AMIs inside a VPC.
Each region has a single c3.8xlarge instance to handle the load.
 
AMIs are generated using [http://www.packer.io/ Packer] and [http://docs.ansible.com/ Ansible].
 
While proxxy instances have both public and private IPs assigned, incoming requests are restricted to only build, test and try machines (using EC2 security group rules).
 
DNS is configured so that <code>*.proxxy.srv.releng.$REGION.mozilla.com</code> points to the proxxy instances.
See: https://inventory.mozilla.org/en-US/core/search/#q=proxxy
 
The proxxy instances can be accessed by SSH'ing to their internal IP from inside the build network.
Login as user 'ubuntu' using the proxxy SSH key in the private releng repo.
Logs on the machines are under <code>/mnt/proxxy/logs</code>.
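For example, to watch requests arriving on an instance (the log file names under that directory are assumed to follow nginx defaults):

ssh ubuntu@<instance-private-ip>
tail -f /mnt/proxxy/logs/*.log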
 
If any authentication is required, e.g. for pvtbuilds, proxxy has those credentials baked into the AMI.
Test clients on the local network can then request those files from proxxy without authentication.
 
The secrets are stored in the repository, encrypted using [http://docs.ansible.com/playbooks_vault.html Ansible Vault].
 
=== Requirements ===
 
* [http://www.packer.io/ Packer 0.6+]
* Ansible Vault password in `.vaultpass` file
* AWS credentials in `AWS_ACCESS_KEY` and `AWS_SECRET_KEY` env vars
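
A minimal sketch for setting up these requirements (placeholder values shown; obtain the real Vault password and AWS credentials through the usual channels):

# Ansible Vault password, expected in the .vaultpass file
echo 'VAULT-PASSWORD-PLACEHOLDER' > .vaultpass
# AWS credentials used by Packer when building the AMIs
export AWS_ACCESS_KEY='ACCESS-KEY-PLACEHOLDER'
export AWS_SECRET_KEY='SECRET-KEY-PLACEHOLDER'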
 
=== Workflow ===
 
View secrets:
 
cd ansible
./vault.sh view production
 
Edit secrets:
 
cd ansible
./vault.sh edit production
 
Test the production config in Vagrant:
 
'''Please keep in mind that this VM will be provisioned with production secrets, so you should keep it private!'''
 
# destroy existing VM, if present
vagrant destroy -f
 
# start up a fresh VM without provisioning
vagrant up --no-provision
 
# provision a fresh VM using Packer
cd packer
packer build -only vagrant proxxy.json
 
Once you've checked that the VM works fine, you can build a fresh production AMI:
 
packer build -except vagrant proxxy.json
 
That command will produce one AMI per AWS region.
You can generate an AMI for a specific region like this:
 
packer build -only ec2-usw2 proxxy.json
 
Once the AMIs are built, launch them:
 
'''Please keep in mind that these AMIs are provisioned with production secrets, so you should only launch them in a secure environment inaccessible from the public Internet.'''
 
* us-east-1:
** instance type: `c3.8xlarge`
** VPC ID: `vpc-b42100df`
** Subnet ID: `subnet-4fccc367`
** Auto-assign Public IP: Enable
** IAM role: `proxxy`
** Name: `proxxy-vpc-vXX`, where XX is an incremented version number
** Security group: `proxxy-vpc` (`sg-d67f33b3`)
** SSH key pair: `proxxy`
* us-west-2:
** instance type: `c3.8xlarge`
** VPC ID: `vpc-cd63f2a4`
** Subnet ID: `subnet-3208e045`
** Auto-assign Public IP: Enable
** IAM role: `proxxy`
** Name: `proxxy-vpc-vXX`, where XX is an incremented version number
** Security group: `proxxy-vpc` (`sg-ed803b88`)
** SSH key pair: `proxxy`
 
Once the instances are running, test them:
 
# Add the `proxxy` key to your SSH agent.
# Using Mozilla VPN, SSH into an instance: `ssh ubuntu@<instance-private-ip>`
# Check that proxxy works using curl: <code>curl -I -H 'Host: ftp.mozilla.org.example.com' http://127.0.0.1/</code>. Look for the `X-Proxxy` header to see cache hit / miss status.
# Repeat step 3 for each backend.
 
Once you're sure that every backend is proxied correctly, update DNS to put new proxxy instances into service.
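For example, after the DNS change you might confirm that a backend hostname resolves to the new instance (hostname shown for us-east-1; adjust per region):

# should return the new proxxy instance's address
dig +short ftp.mozilla.org.proxxy.srv.releng.use1.mozilla.com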
 
 
== Production (SCL3) ==
 
The production environment in SCL3 is provisioned using Ansible.
 
=== Requirements ===
 
* [http://docs.ansible.com/ Ansible 1.6+]
* Ansible Vault password in `.vaultpass` file
* SSH access to `proxxy1.srv.releng.scl3.mozilla.com` from the machine running Ansible
 
=== Workflow ===
 
View secrets:
 
cd ansible
./vault.sh view scl3
 
Edit secrets:
 
cd ansible
./vault.sh edit scl3
 
Provision:
 
cd ansible
./provision.sh scl3 common.yml
./provision.sh scl3 proxxy.yml
 
Check that proxxy works:
 
ssh proxxy1.srv.releng.scl3.mozilla.com
curl -I -H 'Host: ftp.mozilla.org.example.com' http://127.0.0.1/
 
Look for the `X-Proxxy` header to see cache hit / miss status.
