m (Updated BuildDuty to CiDuty. Renaming project)
| Line 4: | Line 4: | ||
== Continuously == | == Continuously == | ||
'''1.''' Nagios and SNS Alerts - Monitor alerts from | '''1.''' Nagios and SNS Alerts - Monitor alerts from [https://nagios1.private.releng.mdc1.mozilla.com/releng-mdc1/cgi-bin/status.cgi?host=all&servicestatustypes=28/ MDC1] Nagios instances and SNS alerts from [https://papertrailapp.com/dashboard/ papertrail] in the #ci IRC channel. Triage unacknowledged alerts and file/fix bugs as necessary according to the [https://wiki.mozilla.org/ReleaseEngineering/How_To/ Release Engineering How-Tos]. Make sure that all CiDuty bugs have the correct priority set according to the [[ReleaseEngineering/Buildduty_actionable#Buildduty_Bugzilla_Priority_levels/pending_counts| priority list]] below. A few examples of such alerts include: | ||
* High CI wait times | * High CI wait times | ||
* CI pending job backlogs | * CI pending job backlogs | ||
* Relengbot failures | * Relengbot failures | ||
* Golden AMI generation failures | * Golden AMI generation failures | ||
* Unresponsive machines | * Unresponsive machines | ||
* | * Disk/RAM/CPU issues | ||
* Failed processes | * Failed processes | ||
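As an illustration only, the triage step above could be sketched in Python. The alert records, their field names (<code>host</code>, <code>message</code>, <code>acknowledged</code>), the keyword sets, and the helper names <code>categorize</code> and <code>triage</code> are all hypothetical; none of this reflects an actual Nagios or papertrail schema. The sketch drops acknowledged alerts and groups the rest under the example categories listed above.

```python
import re

# Hypothetical keyword sets for the example alert categories listed above.
# The keywords are assumptions chosen for illustration.
CATEGORY_KEYWORDS = [
    ("High CI wait times", {"wait"}),
    ("CI pending job backlogs", {"pending", "backlog"}),
    ("Relengbot failures", {"relengbot"}),
    ("Golden AMI generation failures", {"ami"}),
    ("Unresponsive machines", {"unresponsive", "unreachable"}),
    ("Disk/RAM/CPU issues", {"disk", "ram", "cpu"}),
    ("Failed processes", {"process", "procs"}),
]


def categorize(message):
    """Return the first category whose keywords appear as whole words."""
    words = set(re.findall(r"[a-z0-9]+", message.lower()))
    for category, keywords in CATEGORY_KEYWORDS:
        if words & keywords:
            return category
    return "Uncategorized"


def triage(alerts):
    """Group hosts of unacknowledged alerts by category.

    Each alert is assumed to be a dict with 'host', 'message' and
    'acknowledged' keys (a made-up structure, not a real alert format).
    """
    buckets = {}
    for alert in alerts:
        if alert.get("acknowledged"):
            continue  # someone is already handling this alert
        category = categorize(alert["message"])
        buckets.setdefault(category, []).append(alert["host"])
    return buckets
```

Whole-word matching is used so that short keywords such as "ram" do not match inside unrelated words. Alerts that land in "Uncategorized" would still need manual review.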
'''2.''' Monitor the #releng and #taskcluster IRC channels for requests/questions from developers and other ops teams. | '''2.''' Monitor the #releng and #taskcluster IRC channels for requests/questions from developers and other ops teams. | ||