Proxxy is a basic HTTP cache used in each data center to reduce network transfers. It's essentially an nginx instance pre-configured to work as a caching reverse proxy for whitelisted backend servers.
Clients request files explicitly from the proxxy rather than relying on transparent network proxies or HTTP_PROXY environment settings. Since a proxxy instance can serve multiple backend hosts, the hostname of the original URL is prepended to the proxxy hostname: to fetch http://ftp.mozilla.org/foo/bar, the client would first check http://ftp.mozilla.org.proxxy.srv.releng.use1.mozilla.com/foo/bar. Much of this logic is handled by mozharness' proxxy mixin. If retrieval via proxxy fails, a fallback mechanism requests the file directly from the origin server.
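As a minimal sketch in shell (reusing the example hostnames above; the real logic lives in the mozharness proxxy mixin), the fetch-with-fallback behavior looks like this:

# Try the region-local proxxy first; on failure, fall back to the origin.
origin="http://ftp.mozilla.org/foo/bar"
proxxy="http://ftp.mozilla.org.proxxy.srv.releng.use1.mozilla.com/foo/bar"
curl -fsS -o bar "$proxxy" || curl -fsS -o bar "$origin"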
The reasons we chose this setup over a traditional proxy include:
- (main reason) explicit is better than implicit: the URL itself shows which cache is being hit
- transparent proxies are hard to debug; it is not obvious what is going on
- HTTP_PROXY and similar environment variables may not show up in logs
- with traditional proxies it can be difficult to switch between backends or to offer multiple proxy instances
The production environment uses servers puppetized with the proxxy Puppet module. Each EC2 region has a single c3.2xlarge instance to handle the load, plus a single server in scl3.
While EC2 proxxy instances have both public and private IPs assigned, incoming requests are restricted to build, test, and try machines using EC2 security group rules.
DNS is configured so that *.proxxy.srv.releng.$REGION.mozilla.com points to the corresponding proxxy instance.
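To check that the wildcard record resolves, you can query it from a build machine (the hostname below just reuses the earlier example):

# Any label prefixed to the proxxy domain should resolve to that region's instance.
dig +short ftp.mozilla.org.proxxy.srv.releng.use1.mozilla.com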
EC2 proxxy instances can be accessed by SSH'ing to their internal IP from inside the build network. Log in as yourself using your own SSH key pair; once you're in, you can switch to the root user. Access logs are written to syslog.
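For example (the username and internal IP below are placeholders, and sudo su - is one common way to become root):

ssh yourusername@<internal-ip>   # authenticate with your own SSH key pair
sudo su -                        # then switch to the root user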
If authentication is required for a backend, e.g. pvtbuilds, the credentials are provisioned into the proxxy nginx config at `/etc/nginx/sites-enabled/proxxy`. Test clients can then request those files from proxxy without authentication.
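For example, a test client could fetch an otherwise-protected file like this (the hostname and path are illustrative):

# proxxy presents the provisioned credentials to the backend; the client sends none.
curl -O http://pvtbuilds.mozilla.org.proxxy.srv.releng.use1.mozilla.com/some/private/file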
Purging the cache
You can force a cache refresh for a specific URL by requesting it from one of the build machines with the X-Refresh: 1 header, like this:
curl -H 'X-Refresh: 1' http://ftp.mozilla.org.proxxy1.srv.releng.use1.mozilla.com/some/url/to/refresh
In case of emergency, you can also invalidate the entire cache for a specific domain (or for all domains) by manually SSHing into the proxxy instances and rm -rf'ing the corresponding cache directories.
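For example (the cache root is a placeholder; check the nginx config on the instance for the actual path):

# On the proxxy instance, as root: drop all cached objects for one domain.
rm -rf <cache-root>/ftp.mozilla.org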
Be careful: this may create a thundering herd problem for the origin servers.
Restarting / taking offline
The proxxy machines can be restarted safely; clients will retry and fall back to other proxxy servers or to the canonical URL.
If the machines are offline for too long, network bandwidth usage and load on the origin servers will increase, so extended downtime should be coordinated with RelEng.
Logs are sent to Papertrail.