Community Ops/Monitoring: Difference between revisions
Revision as of 16:21, 6 September 2015
Monitoring Setup
General
Monitoring is decentralized. Incident response is distributed based on time zone, availability, and project knowledge.
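As a rough illustration of what distributing incidents by time zone, availability, and project knowledge could look like, here is a minimal Python sketch. It is not how our tooling actually routes alerts. The `Responder` fields, the 08:00 to 22:00 "awake" window, and the ranking order (awake first, then project knowledge) are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class Responder:
    # All fields are hypothetical; they only mirror the three criteria above.
    name: str
    utc_offset_hours: int            # responder's local offset from UTC
    available: bool = True           # False when on leave, busy, etc.
    projects: set = field(default_factory=set)


def local_hour(responder, now_utc):
    """Hour of the day (0-23) in the responder's local time."""
    return (now_utc + timedelta(hours=responder.utc_offset_hours)).hour


def pick_responder(responders, project, now_utc):
    """Return the best-suited available responder, or None if nobody is available."""
    candidates = [r for r in responders if r.available]
    if not candidates:
        return None

    def score(r):
        awake = 8 <= local_hour(r, now_utc) < 22   # assumed waking hours
        knows_project = project in r.projects
        # Tuples compare left to right: being awake outranks project knowledge.
        return (awake, knows_project)

    # max() returns the first candidate among equal scores.
    return max(candidates, key=score)
```

Ranking "awake" above "knows the project" is a design choice for the sketch; a team that prefers expertise over local time would swap the two tuple elements.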
Tools
We use a number of services to maintain effective monitoring:
- Pingdom - Checks that our servers are up. Screams if they aren't.
- VictorOps - Incident Response Management. Dispatches alerts to sysadmins and compiles a nice timeline for us to manage incidents.
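To make the "checks that our servers are up" idea concrete, here is a minimal Python sketch of an HTTP uptime probe. It is not Pingdom's implementation and does not use the Pingdom or VictorOps APIs. The function names, the timeout, and the example URLs are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def is_up(url, timeout=5.0):
    """Return True if the URL answers with a successful status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # HTTP 4xx/5xx, connection errors, timeouts, and malformed URLs
        # all count as "down" in this sketch.
        return False


def check_all(urls, probe=is_up):
    """Return the URLs whose probe failed; an empty list means everything is up."""
    return [u for u in urls if not probe(u)]
```

A caller could run something like `check_all(["https://example.org/"])` on a schedule and hand any non-empty result to whatever alerting service is in use. The `probe` parameter exists so the check can be swapped out or stubbed in tests.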
How to use it
TBD
How to request monitoring
TBD