Connect and Troubleshoot workers in CI

If we cannot SSH into the OSX nodes, we can try to restart them from Taskcluster. If they are not visible in the Taskcluster worker explorer, we can create them with this version of the [https://github.com/davehouse/relops-infra/blob/quarantine_nonexisting/quarantine_tc.py quarantine script], which adds (defines) a worker if it is missing.
* Step 1: [[BuildDuty:TaskClusterCli|Connect to Taskcluster CLI]]  
* Step 2: Run the quarantine script, e.g. <code>python quarantine_tc.py --enable -p releng-hardware -w gecko-t-osx-1010 -g mdc2 t-yosemite-r7-449</code>


After these steps, the worker explorer will show the machine and you can reboot it from there using [[ReleaseEngineering/How To/RelOps Hardware Controller (Roller)|roller]].
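
When several machines are missing from the worker explorer, the command from Step 2 can be generated per host. The sketch below only builds and prints the command lines; it does not run the script. The flag values are copied from the example above, while the helper name <code>enable_command</code> and its parameter names (<code>provisioner</code>, <code>worker_type</code>, <code>group</code>) are hypothetical labels chosen for illustration, not names taken from the script.

```python
import shlex


def enable_command(host, provisioner="releng-hardware",
                   worker_type="gecko-t-osx-1010", group="mdc2"):
    """Build the Step 2 command line for one host.

    The defaults are the values from the wiki example; the parameter
    names are illustrative guesses at what each flag stands for.
    """
    return [
        "python", "quarantine_tc.py", "--enable",
        "-p", provisioner,
        "-w", worker_type,
        "-g", group,
        host,
    ]


# Hosts to add; extend this list as needed.
hosts = ["t-yosemite-r7-449"]

for host in hosts:
    # Print one copy-pasteable command per host instead of executing it.
    print(shlex.join(enable_command(host)))
```

Printing the commands rather than executing them leaves a chance to review each one before it is run from an authenticated Taskcluster CLI session.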
* Check if the host responds to ping.
* Connect to the worker using SSH:
** check if the worker process is running: <code>ps -ef | grep</code> followed by the name of the worker process
** check the CPU usage: run <code>top -u</code> to see whether anything other than python or firefox is using a lot of CPU
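
The first checks above can be scripted roughly as follows. This is a minimal sketch, assuming <code>ping</code> and <code>ssh</code> are on the PATH and that SSH access to the worker is already configured. All function names are hypothetical, and the process name to look for has to be supplied by the caller, since it depends on the worker.

```python
import subprocess


def ping_command(host, count=1):
    """Build a ping invocation that sends `count` echo requests."""
    return ["ping", "-c", str(count), host]


def host_responds(host):
    """Return True if the host answers a single ping."""
    result = subprocess.run(ping_command(host), capture_output=True)
    return result.returncode == 0


def matching_processes(ps_output, name):
    """Return the lines of `ps -ef` output that mention `name`."""
    return [line for line in ps_output.splitlines() if name in line]


def worker_processes(host, name):
    """Run `ps -ef` on the host over SSH and keep lines mentioning `name`."""
    result = subprocess.run(
        ["ssh", host, "ps", "-ef"],
        capture_output=True, text=True, check=True,
    )
    return matching_processes(result.stdout, name)
```

Filtering the <code>ps -ef</code> output locally keeps the remote command simple and avoids the <code>grep</code> process matching its own command line.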


= Rebooting workers =