| Line 7: | Line 7: | ||
Nagios will alert us in channel (and send email) after it hits the retry limit for ping attempts. | Nagios will alert us in channel (and send email) after it hits the retry limit for ping attempts. | ||
See the section [[ | See the section [[#Reboot_a_tegra|power cycle a tegra]]. | ||
=== tegra agent check is CRITICAL === | === tegra agent check is CRITICAL === | ||
| Line 24: | Line 24: | ||
https://secure.pub.build.mozilla.org/buildapi/recent/tegra-338 | https://secure.pub.build.mozilla.org/buildapi/recent/tegra-338 | ||
* If it's burning builds, connect to the associated foopy listed in the dashboard and [[#Disable a tegra|stop the tegra(s). | * If it's burning builds, connect to the associated foopy listed in the dashboard and [[#Disable a tegra|stop the tegra(s)]]. | ||
<!-- | <!-- | ||
TODO: This maintenance script needs updating, but docs will be (almost) perfect when done, so don't remove from page | TODO: This maintenance script needs updating, but docs will be (almost) perfect when done, so don't remove from page | ||
| Line 50: | Line 49: | ||
= Basic tegra management = | = Basic tegra management = | ||
== Find what foopy a Tegra is on == | == Find what foopy a Tegra is on == | ||
Open the Tegra Dashboard - the foopy number is shown to the right | Open the [http://mobile-dashboard.pub.build.mozilla.org/ Tegra Dashboard] - the foopy number is shown to the right | ||
== Check status of Tegra(s) == | == Check status of Tegra(s) == | ||
Find the Tegra on the [[ | Find the Tegra on the [[#Find_what_foopy_a_Tegra_is_on|Dashboard]] and then ssh to that foopy | ||
ssh cltbld@foopy## | ssh cltbld@foopy## | ||
| Line 66: | Line 65: | ||
== Clear an error flag == | == Clear an error flag == | ||
This is done automatically, once an hour. But if you need to do it manually for some reason... | |||
Find the Tegra on the Dashboard, ssh to that foopy and then | Find the Tegra on the Dashboard, ssh to that foopy and then | ||
ssh cltbld@foopy05 | ssh cltbld@foopy05 | ||
rm -f /builds/tegra-NNN/error.flg | |||
== start Tegra(s) == | |||
Find out which foopy server you need to be on and then run: | Find out which foopy server you need to be on and then run: | ||
cd /builds | cd /builds | ||
rm -f /builds/tegra-###/{disabled,error}.flg | |||
The device should then attempt to start up within 5 minutes, running through verify and then starting buildbot if verify succeeds. | |||
Should it seem to have trouble starting, you can check its watcher log: | |||
tail /builds/tegra-###/watcher.log | |||
And if that is stale you might want to peek at [[#Recover_a_foopy|recover a foopy]] | |||
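When several devices need to be re-enabled at once, the flag removal above can be wrapped in a small helper. This is only a sketch: the function name `enable_tegras` and the `BUILDS_DIR` variable are hypothetical and not part of the foopy tooling; only the `/builds/tegra-###/{disabled,error}.flg` paths come from this page.

```shell
# Hypothetical helper, not part of the foopy tooling.
# BUILDS_DIR defaults to /builds, the directory used on this page.
BUILDS_DIR="${BUILDS_DIR:-/builds}"

# enable_tegras NNN [NNN ...]
# Removes disabled.flg and error.flg for each listed tegra number.
enable_tegras() {
    for n in "$@"; do
        dir="$BUILDS_DIR/tegra-$n"
        if [ ! -d "$dir" ]; then
            # Skip devices that do not live on this foopy
            echo "skipping tegra-$n: $dir not found" >&2
            continue
        fi
        rm -f "$dir/disabled.flg" "$dir/error.flg"
        echo "tegra-$n: flags cleared"
    done
}
```

For example, `enable_tegras 101 102` would clear both flags for tegra-101 and tegra-102, assuming both are hosted on the foopy you are logged in to.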
== Disable a tegra == | |||
First find the foopy server for the Tegra and then run: | First find the foopy server for the Tegra and then run: | ||
cd /builds | cd /builds | ||
touch tegra-NNN/disabled.flg | |||
This will then stop the device within 5 minutes, at the next watch_devices cycle. | |||
to | Should it seem to have trouble starting, you can check its watcher log: | ||
tail /builds/tegra-###/watcher.log | |||
And if that is stale you might want to peek at [[#Recover_a_foopy|recover a foopy]] | |||
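One way to judge whether a watcher log is stale is to look at its modification time. The sketch below assumes a threshold of 10 minutes (two 5-minute watch_devices cycles); both the threshold and the function name `watcher_is_stale` are assumptions made for illustration, not values taken from this page.

```shell
# Hypothetical helper; the default of 10 minutes is an assumed threshold.
# watcher_is_stale FILE [MAX_AGE_MIN]
# Returns 0 (true) if FILE is missing or was last modified more than
# roughly MAX_AGE_MIN minutes ago, and non-zero otherwise.
watcher_is_stale() {
    log="$1"
    max_age="${2:-10}"
    # A missing log is treated as stale
    [ -f "$log" ] || return 0
    # find prints the path only when the mtime is older than max_age minutes
    [ -n "$(find "$log" -mmin +"$max_age")" ]
}
```

A possible use on the foopy: `watcher_is_stale /builds/tegra-###/watcher.log && echo "watcher log is stale"`, with `###` replaced by the device number.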
== Reboot a tegra == | == Reboot a tegra == | ||
| Line 120: | Line 109: | ||
If rebooting via PDU does not clear the problem, here are things to try: | If rebooting via PDU does not clear the problem, here are things to try: | ||
* reboot again - fairly common to have 2nd one clear it | * reboot again - fairly common to have 2nd one clear it | ||
** especially if | ** especially if the device is responsive to ping & telnet (port 20701) after the first reboot | ||
== Recover a foopy == | == Recover a foopy == | ||
| Line 130: | Line 119: | ||
screen -x | screen -x | ||
cd /builds | cd /builds | ||
rm -f tegra-*/watcher.lock | |||
./ | ./watch_devices.sh | ||
= Advanced tegra management = | = Advanced tegra management = | ||