CiDuty/actionable: Difference between revisions

m (Updated BuildDuty to CiDuty. Renaming project)
Line 4: Line 4:


== Continuously ==
== Continuously ==
'''1.''' Nagios and SNS Alerts - Monitor alerts from [https://nagios.mozilla.org/releng-scl3/cgi-bin/status.cgi?host=all&servicestatustypes=28&hoststatustypes=15&serviceprops=270346&hostprops=270346/ SCL3] and [https://nagios1.private.releng.mdc1.mozilla.com/releng-mdc1/cgi-bin/status.cgi?host=all&servicestatustypes=28/ MDC1] Nagios instances and SNS alerts from [https://papertrailapp.com/dashboard/ papertrail] in the #ci IRC channel. Triage unacknowledged alerts and file/fix bugs as necessary according to the [https://wiki.mozilla.org/ReleaseEngineering/How_To/ Release Engineering How-Tos]. Make sure that all CiDuty bugs have the correct priority set according to the [[ReleaseEngineering/Buildduty_actionable#Buildduty_Bugzilla_Priority_levels/pending_counts| priority list]] below. A few examples of such alerts include:
'''1.''' Nagios and SNS Alerts - Monitor alerts from [https://nagios1.private.releng.mdc1.mozilla.com/releng-mdc1/cgi-bin/status.cgi?host=all&servicestatustypes=28/ MDC1] Nagios instances and SNS alerts from [https://papertrailapp.com/dashboard/ papertrail] in the #ci IRC channel. Triage unacknowledged alerts and file/fix bugs as necessary according to the [https://wiki.mozilla.org/ReleaseEngineering/How_To/ Release Engineering How-Tos]. Make sure that all CiDuty bugs have the correct priority set according to the [[ReleaseEngineering/Buildduty_actionable#Buildduty_Bugzilla_Priority_levels/pending_counts| priority list]] below. A few examples of such alerts include:
* High CI wait times
* High CI wait times
* CI pending job backlogs
* CI pending job backlogs
* Buildbot misconfigurations
* Relengbot failures
* Relengbot failures
* Golden AMI generation failures
* Golden AMI generation failures
* Unresponsive machines
* Unresponsive machines
* DIsk/RAM/CPU issues
* Disk/RAM/CPU issues
* Failed processes
* Failed processes
* Buildbot master process age
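
The Nagios links in step 1 carry their filter in the query string (for example <code>servicestatustypes=28</code>). As a rough aid for reading those links, the sketch below decodes the status bitmasks into state names. The bit values are an assumption taken from Nagios Core's <code>cgiutils.h</code> and should be checked against the deployed Nagios version; the helper names <code>decode_mask</code> and <code>describe_status_filter</code> are hypothetical and not part of any CiDuty tooling.

```python
from urllib.parse import urlsplit, parse_qs

# ASSUMPTION: bit values as defined in Nagios Core's cgiutils.h.
SERVICE_STATUS_BITS = {1: "PENDING", 2: "OK", 4: "WARNING", 8: "UNKNOWN", 16: "CRITICAL"}
HOST_STATUS_BITS = {1: "PENDING", 2: "UP", 4: "DOWN", 8: "UNREACHABLE"}

# Query parameters this sketch knows how to decode.
MASK_TABLES = {
    "servicestatustypes": SERVICE_STATUS_BITS,
    "hoststatustypes": HOST_STATUS_BITS,
}


def decode_mask(value, names):
    """Return the state names whose bit is set in value, lowest bit first."""
    return [name for bit, name in sorted(names.items()) if value & bit]


def describe_status_filter(url):
    """Map each known bitmask parameter in a status.cgi URL to its state names."""
    query = parse_qs(urlsplit(url).query)
    decoded = {}
    for param, table in MASK_TABLES.items():
        if param in query:
            # The links on this page end in "/", which lands in the last value.
            raw = query[param][0].rstrip("/")
            decoded[param] = decode_mask(int(raw), table)
    return decoded


if __name__ == "__main__":
    mdc1 = ("https://nagios1.private.releng.mdc1.mozilla.com/releng-mdc1/"
            "cgi-bin/status.cgi?host=all&servicestatustypes=28/")
    print(describe_status_filter(mdc1))
```

Under these assumed bit values, <code>servicestatustypes=28</code> selects services in the WARNING, UNKNOWN, or CRITICAL state, which matches the intent of triaging only problem alerts.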


'''2.''' Monitor the #releng and #taskcluster IRC channels for requests/questions from developers and other ops teams.
'''2.''' Monitor the #releng and #taskcluster IRC channels for requests/questions from developers and other ops teams.