Devops/monitoring-alerting
Mozilla Foundation Monitoring & Alerting
TLDR
- Performance monitoring and most infrastructure monitoring are in New Relic
- New Relic Dashboards are a good way to get info fast (login required)
- Load balancer health, database, and overall application healthchecks in Opsview (login required) (Public Viewport)
- Many dashboards for traffic, metrics, performance, and monitoring on this Dashboard of Dashboards
- For accounts, questions, or suggestions, email jp at mozillafoundation.org
MONITORING TOOLS, SYSTEMS, AND LINKS
- Opsview, a Nagios-based monitoring tool with a much friendlier interface.
- * Monitors and alerts when servers in load balancers are unhealthy
- * Monitors and alerts on uptime/downtime of overall endpoints, such as https://webmaker.org
- * Monitors and alerts on database utilization and downtime.
- Important Opsview Links
- Public Status Page
- Current Unhandled Alerts (Login required)
- Recent Alerts in Opsview
- !!!TODO : Add the guide for notifications & contact settings
- New Relic monitoring (Login Required)
- * Watching application response time in the browser and on the server side
- * Watching database and web server utilization, transactions, timings, and throughput
- * Watching load balancer (ELB) metrics
- * Performing server-side and client-side tracing of long-running transactions
- * Overall endpoint monitoring, such as https://webmaker.org
- * Watching cache server utilization and metrics
- * Watching Elasticsearch server utilization and metrics
- * Watching Mongo server utilization and metrics
- * Marking and comparing new and old deployed versions of software
- !!!TODO : Add the guide for notifications & contact settings
- Important New Relic Links
- New Relic Dashboards
- Recent New Relic Alerts
- New Relic Applications Overview
- Recent Deployments
- Browser / Front-end Performance Overview
- Log monitoring with Loggins (Kibana) (Login Required)
- AWS Infrastructure and Autoscaling Monitoring/Alerting
- * An email group is notified of all autoscaling activity (scaling up or down). Contact jp at mozillafoundation.org to be added to this list.
- * CloudWatch in the AWS console can monitor many utilization metrics, including CPU or network usage for a group, database, server, or ELB. Few alarms are triggered from CloudWatch other than those that trigger scaling.
- Most AWS infrastructure is monitored via New Relic. See the side menu options in New Relic for RDS, ELB, EC2, ElastiCache, etc.
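
To illustrate how a utilization alarm of the kind mentioned above can drive scaling, here is a minimal Python sketch. It is not the team's actual configuration and does not call the AWS API. The `Alarm` class, the `is_breaching` function, the 70% CPU threshold, and the three-period window are all hypothetical. It also simplifies real CloudWatch behavior, which supports other comparison operators and missing-data handling.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Alarm:
    """Hypothetical threshold alarm, loosely modeled on a CloudWatch alarm."""
    threshold: float         # e.g. average CPU percent across a group
    evaluation_periods: int  # consecutive breaching datapoints required


def is_breaching(alarm: Alarm, datapoints: Sequence[float]) -> bool:
    """Return True when the most recent `evaluation_periods` datapoints
    all exceed the alarm threshold."""
    if alarm.evaluation_periods <= 0:
        raise ValueError("evaluation_periods must be positive")
    if len(datapoints) < alarm.evaluation_periods:
        # Not enough data yet; this sketch treats that as "not breaching".
        return False
    recent = datapoints[-alarm.evaluation_periods:]
    return all(value > alarm.threshold for value in recent)


# Assumed example values: scale up when CPU stays above 70% for 3 periods.
scale_up = Alarm(threshold=70.0, evaluation_periods=3)
sustained_load = is_breaching(scale_up, [40.0, 75.0, 80.0, 90.0])
brief_spike = is_breaching(scale_up, [75.0, 80.0, 60.0])
```

Requiring several consecutive breaching datapoints keeps a brief spike from triggering a scaling action, so alarms of this shape suit autoscaling better than paging a person.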