ReleaseEngineering/How To/Restart Buildbot Masters: Difference between revisions
ChrisCooper (talk | contribs) (Created page with "__TOC__ We occasionally need to restart buildbot masters for various reasons: * upgrades to the underlying OS * gradual increase in memory usage over time, leading to reduced...") |
Revision as of 21:37, 15 September 2016
We occasionally need to restart buildbot masters for various reasons:
- upgrades to the underlying OS
- gradual increase in memory usage over time, leading to reduced master performance
Manually
If you need to restart a single master by hand, here's the sequence you should follow:
- disable the master in slavealloc. This prevents the master from accepting more slave connections while you're waiting for it to shut down.
- click the "Clean Shutdown" button on the web interface for the given master, e.g. http://buildbot-master82.bb.releng.scl3.mozilla.com:8001/
- wait for the jobs currently running on that master to complete. You can track progress by searching in-page for "Running" on the master's buildslaves page, e.g. http://buildbot-master82.bb.releng.scl3.mozilla.com:8001/buildslaves?no_builders=1
- once the master has shut down, perform whatever upgrades or other maintenance are required.
- restart the master. NOTE: buildbot masters are configured to restart buildbot automatically on boot, so if you reboot the master, buildbot will restart itself. To restart manually:
  xebec:buildduty ccooper$ ssh cltbld@buildbot-master82
  Unauthorized access prohibited
  [cltbld@buildbot-master82.bb.releng.scl3.mozilla.com ~]$ cd /builds/buildbot/build1/
  [cltbld@buildbot-master82.bb.releng.scl3.mozilla.com build1]$ source bin/activate
  (build1)[cltbld@buildbot-master82.bb.releng.scl3.mozilla.com build1]$ make start
- re-enable the master in slavealloc.
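The "wait for running jobs" step above can be approximated with a small polling helper. This is only a sketch, not part of the existing tooling: the function names, the default port of 8001 (taken from the example URLs above), and the polling interval are assumptions, and counting the word "Running" is just a stand-in for the in-page search described in the list.

```python
import re
import time
import urllib.request


def buildslaves_url(host, port=8001):
    """Build the buildslaves page URL in the form used in the steps above."""
    return f"http://{host}:{port}/buildslaves?no_builders=1"


def count_running(page_text):
    """Count case-sensitive occurrences of the word "Running" in the page."""
    return len(re.findall(r"\bRunning\b", page_text))


def wait_for_idle(host, port=8001, poll_seconds=60, fetch=None):
    """Poll the buildslaves page until no "Running" entries remain.

    `fetch` is a hypothetical hook (URL in, page text out) so the loop can
    be exercised without a live master; by default it does an HTTP GET.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url, timeout=30) as resp:
                return resp.read().decode("utf-8", errors="replace")
    url = buildslaves_url(host, port)
    while True:
        remaining = count_running(fetch(url))
        if remaining == 0:
            return
        print(f"{remaining} 'Running' entries still listed on {host}")
        time.sleep(poll_seconds)
```

Note that once the clean shutdown finishes, the master's web interface presumably stops answering, so the default fetch would raise an error at that point; the sketch leaves that case to the caller.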
By script
The above actions have been encapsulated into a script: https://hg.mozilla.org/build/tools/file/default/buildfarm/maintenance/restart_masters.py
The script requires a bash-format config file like the one used by the end_to_end_reconfig.sh script. At the very least the config file must define values for LDAP_USERNAME, LDAP_PASSWORD, and CLTBLD_PASSWORD.
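Before starting a long run, it can help to confirm that the config file defines the three required values. The following is a minimal sketch under stated assumptions: it treats the file as simple `KEY=value` or `export KEY=value` lines and does not evaluate bash, and the helper names are hypothetical rather than anything restart_masters.py provides.

```python
import re

# The three values the page says the config file must define.
REQUIRED_KEYS = ("LDAP_USERNAME", "LDAP_PASSWORD", "CLTBLD_PASSWORD")

# Matches `NAME=value` with an optional leading `export`.
_ASSIGNMENT = re.compile(r"^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def defined_keys(config_text):
    """Return the names assigned a non-empty value in bash-format text."""
    keys = set()
    for line in config_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comment lines
        match = _ASSIGNMENT.match(line)
        if not match:
            continue
        # Drop surrounding whitespace and quotes before checking for a value.
        value = match.group(2).strip().strip("'\"")
        if value:
            keys.add(match.group(1))
    return keys


def missing_keys(config_text, required=REQUIRED_KEYS):
    """List the required names the config text does not define."""
    present = defined_keys(config_text)
    return [key for key in required if key not in present]
```

An empty list from `missing_keys` means all three names are present; it says nothing about whether the credentials themselves are valid.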
The script is currently set up to run on dev-master2 in a venv under coop's account. We are in the process of moving this environment to a shared user on the buildduty-tools machine (bug 1299421).
Here is an example invocation:
  # dev-master2
  $ screen -R restart_masters
  $ cd ~coop/restart_masters
  $ source bin/activate
  $ cd tools/buildfarm/maintenance/
  $ ./restart_masters.py -v -m production-masters.json
Automated
The above script requires sensitive credentials that shouldn't be stored on disk. For now, we're still running this script by hand.
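One way to keep the credentials off disk, sketched here purely as an illustration, is to collect them at run time from the environment or an interactive prompt. This is not how restart_masters.py works today (it reads the config file described above); the function name and its `environ` and `prompt` parameters are hypothetical.

```python
import getpass
import os

# Same three names the config file is expected to define.
CREDENTIAL_KEYS = ("LDAP_USERNAME", "LDAP_PASSWORD", "CLTBLD_PASSWORD")


def collect_credentials(environ=None, prompt=getpass.getpass):
    """Take each credential from the environment, else prompt for it.

    Nothing is written to disk; the values live only in the returned dict.
    """
    if environ is None:
        environ = os.environ
    creds = {}
    for key in CREDENTIAL_KEYS:
        value = environ.get(key)
        if not value:
            # getpass.getpass does not echo what is typed.
            value = prompt(f"{key}: ")
        creds[key] = value
    return creds
```

Prompting still needs a person at the keyboard, so this does not by itself make the restarts unattended.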