Community Ops/Monitoring
Revision as of 18:01, 12 September 2015
Monitoring Setup
General
Monitoring is decentralized. Incident response is distributed among team members based on time zone, availability, and project knowledge.
Tools
We use several services for monitoring, including:
- Pingdom - Checks that our servers are up. Screams if they aren't.
- VictorOps - Incident response management. Dispatches alerts to sysadmins and compiles a timeline of each incident so we can manage it.
How to use it
TBD
How to request monitoring
TBD
Monitoring
| Tool | Usage | Primary Contact | Secondary Contacts |
|---|---|---|---|
| Pingdom | Uptime and latency monitoring | [mrz](https://mozillians.org/en-US/u/mrz) | |
| VictorOps | Incident escalation and notifications | tanner | [mrz](https://mozillians.org/en-US/u/mrz), tad, logan, yousef, pancakes, dumitru |
| CloudWatch | Top-level monitoring of AWS | Same as AWS | |
| StatusHub | Dashboard | [mrz](https://mozillians.org/en-US/u/mrz) | |
| New Relic | Application monitoring | jp | tad, tanner, logan, yousef, [mrz](https://mozillians.org/en-US/u/mrz) |