
ReleaseEngineering/Applications/Proxxy


Overview

Proxxy is a basic HTTP cache used in each data center to reduce network transfers. It's essentially an nginx instance pre-configured to work as a caching reverse proxy for whitelisted backend servers.

Clients request files explicitly from the proxxy rather than relying on transparent network proxies or HTTP_PROXY environment settings. Because a proxxy instance can handle multiple endpoints, we prepend the hostname of the original URL to the proxxy hostname. For example, to fetch http://ftp.mozilla.org/foo/bar, the client would first try http://ftp.mozilla.org.proxxy.srv.releng.use1.mozilla.com/foo/bar. Much of this logic is handled by mozharness' proxxy mixin. If retrieval via proxxy fails, a fallback mechanism requests the file directly from the origin server.
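
The hostname rewriting described above can be sketched as follows. This is a minimal illustration, not the mozharness mixin itself: the function names `proxxy_url` and `candidate_urls` are hypothetical, the default suffix is taken from the use1 example above, and keeping the original URL scheme is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit

# Suffix taken from the use1 example in the text; other regions would differ.
PROXXY_SUFFIX = "proxxy.srv.releng.use1.mozilla.com"


def proxxy_url(url, suffix=PROXXY_SUFFIX):
    """Prepend the origin hostname to the proxxy hostname (hypothetical helper)."""
    parts = urlsplit(url)
    host = f"{parts.netloc}.{suffix}"
    # Assumption: scheme, path, query and fragment are carried over unchanged.
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


def candidate_urls(url, suffix=PROXXY_SUFFIX):
    """Return the URLs to try in order: the proxxy first, then the origin."""
    return [proxxy_url(url, suffix), url]


if __name__ == "__main__":
    for candidate in candidate_urls("http://ftp.mozilla.org/foo/bar"):
        print(candidate)
```

Because the cache is named in the URL itself, a log line containing the fetched URL shows directly whether the proxxy or the origin served the file.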

The reasons we chose to have such a setup, rather than a traditional proxy setup, include:

  • (main reason): explicit is better than implicit - from the URL we can see which cache we are hitting
  • transparent proxies are hard to debug, and it is hard to see what is going on
  • using HTTP_PROXY or env vars may not be obvious in logging
  • with traditional proxies it can be difficult to switch to use different backends, or offer multiple proxy instances


Production

The production environment uses servers configured by the proxxy Puppet module. Each EC2 region has a single c3.8xlarge instance to handle the load, and there is one additional server in scl3.

While EC2 proxxy instances have both public and private IPs assigned, incoming requests are restricted to only build, test and try machines (using EC2 security group rules).

DNS is configured so that *.proxxy.srv.releng.$REGION.mozilla.com points to the corresponding proxxy instance. See: https://inventory.mozilla.org/en-US/core/search/#q=proxxy

EC2 proxxy instances can be accessed over SSH via their internal IP from inside the build network. Log in as root using the proxxy SSH key from the private releng repo. Access logs are written to syslog.

If a backend requires authentication, e.g. pvtbuilds, the credentials are provisioned into the proxxy nginx config located at `/etc/nginx/sites-enabled/proxxy`. Test clients can then request those files from proxxy without authenticating themselves.


Operations

Purging the cache

You can force a cache refresh for a specific URL by requesting it from one of the build machines with the X-Refresh: 1 header, like this:

curl -H 'X-Refresh: 1' http://ftp.mozilla.org.proxxy1.srv.releng.use1.mozilla.com/some/url/to/refresh
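
The same refresh request can be issued from a script. The sketch below uses only the Python standard library; the helper names `build_refresh_request` and `refresh` are hypothetical, and the URL is the placeholder from the curl example above.

```python
import urllib.request


def build_refresh_request(url):
    """Build a GET request carrying the X-Refresh: 1 header (hypothetical helper)."""
    return urllib.request.Request(url, headers={"X-Refresh": "1"})


def refresh(url, timeout=30):
    """Send the refresh request and return the HTTP status code."""
    with urllib.request.urlopen(build_refresh_request(url), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Placeholder URL from the curl example; only reachable from the build network.
    req = build_refresh_request(
        "http://ftp.mozilla.org.proxxy1.srv.releng.use1.mozilla.com/some/url/to/refresh"
    )
    print(req.get_method(), req.full_url, req.header_items())
```

Running the module only prints the request that would be sent; call `refresh()` from a build machine to actually contact the proxxy.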

In an emergency, you can also invalidate the entire cache for a specific domain (or for all domains) by manually SSHing into the proxxy instances and running rm -rf on the corresponding directories under /var/cache/proxxy. Be careful: with the cache empty, every client request is forwarded at once, which can create a thundering herd problem for the origin servers.

Restarting / taking offline

The proxxy machines can be restarted safely. Clients will retry and fall back to other servers or to the canonical URL.
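
The retry-and-fall-back behaviour that makes restarts safe can be sketched as below. This is an assumption-laden illustration rather than the actual client code: `fetch_with_fallback`, its parameters, and the example hostnames are hypothetical, and the download function is passed in so the logic can be exercised without network access.

```python
import time


def fetch_with_fallback(urls, fetch, attempts=3, delay=0.0):
    """Try each URL in order, retrying each a few times before moving on.

    `fetch` is any callable that returns the content for a URL or raises
    OSError on failure (urllib's URLError is a subclass of OSError).
    Returns a (url, content) tuple for the first source that succeeds.
    """
    last_error = None
    for url in urls:
        for _ in range(attempts):
            try:
                return url, fetch(url)
            except OSError as exc:
                last_error = exc
                if delay:
                    time.sleep(delay)  # brief pause before the next attempt
    raise RuntimeError(f"all sources failed: {last_error}")


if __name__ == "__main__":
    def fake_fetch(url):
        # Simulate a proxxy that is restarting while the origin stays up.
        if ".proxxy." in url:
            raise OSError("connection refused")
        return b"file contents"

    print(fetch_with_fallback(
        ["http://example.org.proxxy.invalid/foo", "http://example.org/foo"],
        fake_fetch,
    ))
```

Listing the proxxy URL first and the canonical URL last means a proxxy outage costs only the failed attempts before clients reach the origin.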

If the machines are offline for too long, it can result in high network bandwidth and load on origin servers. Extended downtime should be coordinated with RelEng.

Logs

Logs are sent to Papertrail.