== ELB Logs ==

The production instance of Balrog publishes logs to two different S3 buckets:

* The nginx access logs (that contain all of the update requests we receive) are published to '''balrog-us-west-2-elb-logs'''. These logs are very large, and you're unlikely to be able to download them for local querying. The best way to work with them is through Athena (see the example below the list).
* The rest of the logs are published to '''net-mozaws-prod-us-west-2-logging-balrog''', in the "firehose/s3" directory. Within that there are subdirectories for different parts of Balrog:
** balrog.admin.syslog.admin contains the admin wsgi app output.
** balrog.admin.nginx.{access,error} contain the admin access & error logs from nginx. The access logs are generally a subset of the wsgi app output (which logs requests with a bit of extra detail).
** balrog.admin.syslog.agent contains the agent app output.
** balrog.admin.syslog.cron contains cronjob output (eg: the history cleanup and the production database dump).
** balrog.web.syslog.web contains the public wsgi app output. Note that this app does _not_ log requests, so this is largely warning/exception output. If you care about requests to the public app, use the nginx access logs (see above).
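As a rough sketch of the Athena approach: assuming a table has already been defined over the access logs (the database, table, and column names below are placeholders, not the real definitions), a query can be started from the AWS CLI and the results fetched once it completes:

 # Kick off an Athena query over the access logs. "balrog", "balrog_elb_logs", and
 # "request_url" are placeholder names; use whatever the real table actually defines.
 aws athena start-query-execution --region us-west-2 \
   --query-string "SELECT request_url, count(*) AS hits FROM balrog_elb_logs GROUP BY request_url ORDER BY hits DESC LIMIT 20" \
   --query-execution-context Database=balrog \
   --result-configuration OutputLocation=s3://<your-athena-results-bucket>/
 # Fetch the results using the QueryExecutionId printed by the previous command.
 aws athena get-query-results --region us-west-2 --query-execution-id <QueryExecutionId>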
=== Public app ===

'''NOTE:''' These instructions were written before [https://aws.amazon.com/blogs/aws/amazon-athena-interactive-sql-queries-for-data-in-amazon-s3/ Amazon Athena] existed. The next time we need to do this kind of analysis, it's probably worth giving Athena a try. Other techniques may be better too - the instructions below are just something we've done in the past.

The ELB logs for the public-facing application are replicated to the '''balrog-us-west-2-elb-logs''' S3 bucket, located in us-west-2. Logs are rotated very quickly, and we end up with tens of thousands of separate files each day. Because of this, and because S3 has a lot of per-file overhead, it can be tricky to do analysis on them. You're unlikely to be able to download the logs locally in any reasonable amount of time (ie, less than a day), but mounting them on an EC2 instance in us-west-2 should provide you with reasonably quick access. Here's an example:
* Launch an EC2 instance (you probably want a compute-optimized one, with at least 100GB of storage).
* Generate an access token for your CloudOps AWS account. If you don't have a CloudOps AWS account, talk to Ben Hearsum or Bensong Wong. Put the token in a plaintext file somewhere on the instance.
** If you've chosen local storage, you'll probably need to format and mount the volume.
* Install s3fs by following the instructions on https://github.com/s3fs-fuse/s3fs-fuse.
* Mount the bucket on your instance, eg:
 s3fs balrog-us-west-2-elb-logs /media/bucket -o passwd_file=pw.txt
* Do some broad grepping directly on the S3 logs, and store the results in a local file. This should speed up subsequent queries. Eg:
 grep '/Firefox/.*WINNT.*/release/' /media/bucket/AWSLogs/361527076523/elasticloadbalancing/us-west-2/2016/09/17/* | gzip > /media/ephemeral0/sept-17-winnt-release.txt.gz
* Do additional queries on the new logfile (see the example below).
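For example, to get a feel for the narrowed-down data (the version string below is just an illustration - substitute whatever you're actually investigating):

 # Count the total number of pre-filtered requests, then the ones matching a specific pattern.
 zcat /media/ephemeral0/sept-17-winnt-release.txt.gz | wc -l
 zgrep -c 'Firefox/49.0' /media/ephemeral0/sept-17-winnt-release.txt.gz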
=== Admin app ===
Nginx logs for the admin app are available (on a ~1 day time delay) in the [https://console.aws.amazon.com/s3/buckets/net-mozaws-prod-us-west-2-logging-balrog/balrog.admin.nginx.access/?region=us-west-2&tab=overview "net-mozaws-prod-us-west-2-logging-balrog" S3 bucket]. These logs are small enough that downloading and querying them locally is generally the most efficient thing to do.
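A rough sketch of that workflow, assuming the AWS CLI has credentials that can read the bucket. The prefix matches the console link above; the grep pattern is only an example, and the exact log format (and whether the objects are compressed) may differ:

 # Sync the admin nginx access logs locally, then search through them.
 aws s3 sync s3://net-mozaws-prod-us-west-2-logging-balrog/balrog.admin.nginx.access/ balrog-admin-logs/
 grep -r 'PUT /api/' balrog-admin-logs/
 # If the objects turn out to be gzip-compressed, use zgrep on the individual files instead of grep -r.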
== Backups ==