Ianconnolly (talk | contribs) |
== Long-term To-Dos/Stretch goals ==
<i>Currently we reboot after almost every job on buildbot in order to do things like:
* Make sure we re-puppetize
* Clean up temporary files
* Make sure no old processes are running
* Clean up memory fragmentation
However, by rebooting, we cause some problems:
* We lose the filesystem cache between every job. In AWS this turns into lots of extra IO to read back the same files over and over after each reboot
* We waste 2-5 minutes per job doing a reboot
* Extra load on puppet masters
We can address nearly all of the issues we reboot for in pre-flight checks:
* Check if there are puppet (or AMI) changes that need to be applied
* We can still clean up temporary files
* We can kill stray processes
* I don't think memory fragmentation is an issue any more. We used to have problems on 32-bit linux machines that were up for a long time. Eventually they weren't able to link large libraries. All our build machines are 64-bit now I believe.
<b>This will require that 'runner' be in charge of starting and stopping buildbot</b>. I imagine we'd do something like this:
* Run through pre-flight checks
* Start buildbot
* Watch twistd.log, make sure buildbot actually starts and connects to a master
* Initiate graceful shutdown of buildbot after X minutes (30?). There are ways to do this locally (e.g. by touching a shutdown.stamp file) instead of poking the buildbot master.
* Run any post-flight tasks
* Go back to beginning</i>
From [https://bugzilla.mozilla.org/show_bug.cgi?id=1028191 Bug 1028191 - Stop rebooting after every job]
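
The loop quoted above could be sketched roughly as follows. This is only an illustration, not the actual 'runner' code: the function names, the <code>start_cmd</code> and <code>connected_marker</code> parameters, and the choice of temp directory are all hypothetical. It assumes that <code>start_cmd</code> keeps buildbot in the foreground (so waiting on the process is meaningful), that buildbot is configured to shut down gracefully when <code>shutdown.stamp</code> appears in its base directory, and that the caller knows which log line signals a successful connection to a master. Puppet/AMI checks and killing stray processes are left out.

```python
import os
import shutil
import subprocess
import time


def clean_temp_files(tmp_dir):
    """Pre-flight: delete everything inside tmp_dir but keep the directory."""
    for name in os.listdir(tmp_dir):
        path = os.path.join(tmp_dir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path, ignore_errors=True)
        else:
            os.remove(path)


def wait_for_log_marker(log_path, marker, start_offset=0, timeout=60.0, poll=1.0):
    """Return True once `marker` shows up in the log after start_offset.

    Returns False if the timeout expires first. The marker text is an
    assumption supplied by the caller, not a documented buildbot string.
    """
    needle = marker.encode()
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(log_path, "rb") as log:
                log.seek(start_offset)
                if needle in log.read():
                    return True
        except FileNotFoundError:
            pass  # the log has not been created yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


def request_graceful_shutdown(basedir):
    """Touch shutdown.stamp in the buildbot base directory."""
    stamp = os.path.join(basedir, "shutdown.stamp")
    with open(stamp, "a"):
        os.utime(stamp, None)
    return stamp


def run_cycle(basedir, tmp_dir, start_cmd, connected_marker, run_minutes=30):
    """One pass: pre-flight, start, watch the log, graceful shutdown."""
    clean_temp_files(tmp_dir)  # pre-flight checks
    log_path = os.path.join(basedir, "twistd.log")
    # Remember where the log ends so lines from earlier cycles are ignored.
    offset = os.path.getsize(log_path) if os.path.exists(log_path) else 0
    proc = subprocess.Popen(start_cmd)  # start buildbot (hypothetical command)
    if not wait_for_log_marker(log_path, connected_marker, start_offset=offset):
        proc.terminate()  # never connected to a master: give up on this cycle
        proc.wait()
        return False
    time.sleep(run_minutes * 60)  # let buildbot take jobs for X minutes
    request_graceful_shutdown(basedir)
    proc.wait()  # returns once buildbot has exited
    # Post-flight tasks would go here.
    return True
```

A caller would invoke <code>run_cycle</code> in a <code>while True:</code> loop to "go back to beginning". Recording the log offset before starting matters because <code>twistd.log</code> persists across cycles when the machine is no longer rebooted; the sketch does not handle log rotation.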
== Stuff I Need ==