== Overview ==

Proxxy is a basic HTTP cache used in each data center to reduce network transfers.
It's essentially an nginx instance pre-configured to work as a caching reverse proxy for whitelisted backend servers.

Clients request files explicitly from the proxxy rather than relying on transparent network proxies or HTTP_PROXY environment settings.
Since a proxxy instance can handle multiple endpoints, we prepend the hostname of the original URL to the proxxy URL.
For example, to fetch http://ftp.mozilla.org/foo/bar, the client would first check http://ftp.mozilla.org.proxxy.srv.releng.use1.mozilla.com/foo/bar.
Much of this logic is handled by mozharness' proxxy mixin.
If the file retrieval fails via proxxy, a fallback mechanism requests the file directly from the origin server.
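For illustration, the following Python sketch shows the same URL rewriting and fallback behaviour. It is not the mozharness implementation; the function names are made up, and the suffix shown is simply the use1 one from the example above (a real client would pick the suffix for its own region).

 import urllib.parse
 import urllib.request
 # Region-specific proxxy suffix; a real client would use the suffix for the
 # data center it is running in.
 PROXXY_SUFFIX = "proxxy.srv.releng.use1.mozilla.com"
 def proxxy_url(url, suffix=PROXXY_SUFFIX):
     """Prepend the origin hostname to the proxxy hostname, keeping the path."""
     parts = urllib.parse.urlsplit(url)
     proxxy_host = "%s.%s" % (parts.hostname, suffix)
     return urllib.parse.urlunsplit((parts.scheme, proxxy_host, parts.path, parts.query, ""))
 def fetch(url):
     """Try the proxxy copy first, then fall back to the origin server."""
     for candidate in (proxxy_url(url), url):
         try:
             return urllib.request.urlopen(candidate, timeout=30).read()
         except OSError:
             continue
     raise RuntimeError("could not fetch %s via proxxy or directly" % url)
 # e.g. fetch("http://ftp.mozilla.org/foo/bar") first tries
 # http://ftp.mozilla.org.proxxy.srv.releng.use1.mozilla.com/foo/bar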
The reasons we chose this setup, rather than a traditional proxy setup, include:

* (main reason) explicit is better than implicit - from the URL we can see which cache we are hitting
* transparent proxies are hard to debug, because it is not obvious what is going on
* using HTTP_PROXY or other environment variables may not be obvious in logging
* with traditional proxies it can be difficult to switch to different backends, or to offer multiple proxy instances
== Production ==

The production environment uses servers puppetized with the [http://hg.mozilla.org/build/puppet/file/production/modules/proxxy proxxy puppet module].
Each EC2 region has a single c3.2xlarge instance to handle the load, plus a single server in scl3.

While EC2 proxxy instances have both public and private IPs assigned, incoming requests are restricted to build, test and try machines (using EC2 security group rules).

DNS is configured so that <code>*.proxxy.srv.releng.$REGION.mozilla.com</code> points to the corresponding proxxy instance.
See: https://inventory.mozilla.org/en-US/core/search/#q=proxxy

EC2 proxxy instances can be accessed by SSHing to their internal IP from inside the build network.
Log in as yourself using your own SSH key pair; once you're in, you can switch to the root user. Access logs are written to syslog.

If any authentication is required, e.g. for pvtbuilds, those credentials are provisioned into the proxxy nginx config at <code>/etc/nginx/sites-enabled/proxxy</code>.
Test clients can then request those files from proxxy without authentication.
== Operations ==

=== Purging the cache ===

You can force a cache refresh for a specific URL by requesting it from one of the build machines with the <code>X-Refresh: 1</code> header, like this:

 curl -H 'X-Refresh: 1' http://ftp.mozilla.org.proxxy1.srv.releng.use1.mozilla.com/some/url/to/refresh
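The same refresh can also be triggered from a script; the snippet below is just a Python equivalent of the curl command above, using the same example URL:

 import urllib.request
 # Request the object with X-Refresh: 1 so proxxy re-fetches it from the origin.
 req = urllib.request.Request(
     "http://ftp.mozilla.org.proxxy1.srv.releng.use1.mozilla.com/some/url/to/refresh",
     headers={"X-Refresh": "1"},
 )
 urllib.request.urlopen(req).read()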
In case of emergency, you can also invalidate the entire cache for a specific domain (or for all domains) by SSHing into the proxxy instances and <code>rm -rf</code>ing the corresponding directories under <code>/var/cache/proxxy</code>.
Be careful: this may create a thundering herd problem for the origin servers.
=== Restarting / taking offline ===

The proxxy machines can be restarted safely. Clients will retry and fall back to other servers, or to the canonical URL.

If the machines are offline for too long, the result can be high network bandwidth usage and load on the origin servers. Extended downtime should be coordinated with RelEng.
=== Logs ===

Logs are sent to Papertrail.