ReleaseEngineering/How To/Unstick a Stuck Slave From A Master
Revision as of 21:39, 21 April 2011
Sometimes slaves can be in various wedged states, which prevents a master reconfig.
If this is the case, then you need to convince the master to drop the connection to that slave, with prejudice.
The Hard Way
First, use lsof to figure out which file descriptor the socket is on:
$ /usr/sbin/lsof -p $master_pid | grep linux-ix-slave05
buildbot 2788 cltbld 16u IPv4 471638980 TCP staging-master.build.mozilla.org:9012->linux-ix-slave05.build.mozilla.org:54714 (ESTABLISHED)
The '16u' in the fourth column gives the file descriptor within the master process: 16, dropping the trailing u (lsof's marker for a descriptor open for both reading and writing).
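If you would rather not read the column by eye, the descriptor number can be pulled out of the lsof line programmatically. The following is only a minimal sketch: the helper name fd_from_lsof_line is hypothetical, and it assumes the FD value is the fourth whitespace-separated field, as in the output above.

```python
import re

def fd_from_lsof_line(line):
    """Return the numeric file descriptor from one line of lsof output, or None.

    Hypothetical helper; assumes the FD column is the fourth field.
    """
    fields = line.split()
    if len(fields) < 4:
        return None
    # The FD field is digits plus an optional access-mode letter (r, w or u).
    # Non-numeric entries such as "cwd" or "txt" do not match and yield None.
    match = re.match(r"(\d+)", fields[3])
    return int(match.group(1)) if match else None

sample = (
    "buildbot 2788 cltbld 16u IPv4 471638980 TCP "
    "staging-master.build.mozilla.org:9012->"
    "linux-ix-slave05.build.mozilla.org:54714 (ESTABLISHED)"
)
print(fd_from_lsof_line(sample))  # prints 16
```

The integer it returns is the value to pass to os.close in the next step.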
Then open up the master's manhole and close that descriptor:
>>> import os
>>> os.close(16)
This will cause some weird tracebacks in the master log, but should come out fine in the end.
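To see what closing a descriptor out from under a live socket does, the effect can be reproduced on a throwaway socket pair outside the master. This is an illustrative sketch only, assuming a POSIX system. The function name close_fd_behind_socket is hypothetical, and nothing here touches buildbot itself.

```python
import errno
import os
import socket

def close_fd_behind_socket():
    """Close one end of a socket pair by raw descriptor; report what each side sees."""
    local, peer = socket.socketpair()
    # Same call as in the manhole, but on a throwaway socket.
    os.close(local.fileno())
    # The other end sees end-of-stream, so recv returns an empty bytes object.
    peer_data = peer.recv(1)
    try:
        local.send(b"x")
        local_error = None
    except OSError as exc:
        # The socket object still exists, but its descriptor is gone.
        local_error = exc.errno
    # The descriptor is already closed; detach so the object does not close it again.
    local.detach()
    peer.close()
    return peer_data, local_error

peer_data, local_error = close_fd_behind_socket()
print(peer_data, local_error == errno.EBADF)
```

The remote end sees the connection close, while the side whose descriptor was closed gets a "bad file descriptor" error the next time it uses the socket. That is presumably the kind of failure behind the tracebacks in the master log.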