A reconfig (short for "reconfigure the buildbot masters") is how changes to buildbot configurations make it into production. Buildduty is responsible for running reconfigs.
In a nutshell, a reconfig consists of:
- moving the production tag for the buildbot-configs and mozharness repositories, and the production-0.8 tag for the buildbotcustom repository, to the current tip (or a chosen revision)
- updating the pinned version of mozharness for mozilla-central
- updating the source checkout of both repositories on all the buildbot masters
- updating the tools checkout on each master to the current tip
- executing a buildbot reconfig command on each buildbot master
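As a rough illustration of the tagging step in the list above, the following dry-run sketch only prints the Mercurial commands involved and changes nothing. The helper name print_reconfig_plan and the exact hg invocations are assumptions made for illustration; the end_to_end_reconfig.sh script described below may do this differently.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the tag commands a reconfig would involve; runs nothing.
# Repository and tag names come from the list above; print_reconfig_plan is a
# hypothetical helper and the hg command lines are illustrative assumptions.

print_reconfig_plan() {
    local rev="${1:-tip}"   # revision to tag; defaults to the current tip
    local repo tag
    for repo in buildbot-configs mozharness buildbotcustom; do
        # buildbotcustom uses the production-0.8 tag; the others use production
        if [ "$repo" = "buildbotcustom" ]; then
            tag="production-0.8"
        else
            tag="production"
        fi
        echo "hg -R $repo tag -f -r $rev $tag"
    done
    echo "# then, on each master: update the checkouts and run the buildbot reconfig"
}

print_reconfig_plan
```

Passing a revision (for example `print_reconfig_plan abc123`) prints the same plan for that revision instead of tip.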
Please see the instructions for how to land buildbot changes for more information.
It is polite to ask in #releng if anyone has further changes to land before starting the reconfig process.
How to reconfig
The current state of the art is to use the end_to_end_reconfig.sh script.
The end_to_end_reconfig.sh script uses the original fabric scripts at its core, but also takes care of updating the wiki, updating bugs affected by the merge, and updating the tools checkouts on foopies. As such, the script has a few indirect Python module dependencies, which it will try to install automatically if they are missing.
In order to update the wiki/bugzilla, the script also requires your credentials for those services in a config file:
# Needed if updating wiki - note the wiki does *not* use your LDAP credentials...
export WIKI_USERNAME=XXX
export WIKI_PASSWORD=XXXXXXXX

# Needed if updating Bugzilla bugs to mark them as in production - *no* Persona integration - must be a native Bugzilla account...
# Details for the 'Release Engineering SlaveAPI Service' <firstname.lastname@example.org> Bugzilla user can be found in the RelEng
# private repo, in file passwords/slaveapi-bugzilla.txt.gpg (needs decrypting with an approved gpg key).
export BUGZILLA_USERNAME='XXX@mozilla.com'
export BUGZILLA_PASSWORD='XXXXXXXXX'

# Used for slaveapi actions
export LDAP_USERNAME='XXX@mozilla.com'
export LDAP_PASSWORD='XXXXXXXX'
Again, the script will create a template config file (default is ~/.reconfig/config) if it does not exist, but you'll need to fill in your own credentials before it will work.
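Before starting a reconfig, it can help to confirm that the template placeholders have actually been replaced. The sketch below is one way to do that; check_reconfig_config is a hypothetical helper (not part of end_to_end_reconfig.sh), and it assumes the config file uses the six variable names shown in the template and that unfilled values still contain XXX.

```shell
#!/usr/bin/env bash
# Sketch: report which credential variables in a reconfig config file are
# unset or still hold the XXX placeholders from the template.
# check_reconfig_config is a hypothetical helper, not part of the script.

check_reconfig_config() {
    local config="${1:-$HOME/.reconfig/config}"
    local var value status=0
    if [ ! -r "$config" ]; then
        echo "missing config file: $config"
        return 1
    fi
    for var in WIKI_USERNAME WIKI_PASSWORD BUGZILLA_USERNAME \
               BUGZILLA_PASSWORD LDAP_USERNAME LDAP_PASSWORD; do
        # Source in a subshell so the credentials never leak into this shell.
        value="$( . "$config" >/dev/null 2>&1; printf '%s' "${!var:-}" )"
        case "$value" in
            ""|*XXX*) echo "not filled in: $var"; status=1 ;;
        esac
    done
    return "$status"
}
```

Running `check_reconfig_config` with no argument checks the default ~/.reconfig/config; a non-zero exit status means at least one value still needs attention.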
The script will also attempt to update IRC with reconfig status. To do so, it uses a minimal filesystem-based IRC client called ii. You can download ii from its website, or install it via a package manager (e.g. port install ii). A failure to update IRC is non-fatal, but make sure ii is in your PATH if you want the updates to work.
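A quick way to confirm that ii is reachable before starting is to ask the shell to resolve it. This is a minimal sketch; irc_client_available is a hypothetical helper name, and the messages are illustrative rather than anything the reconfig script prints.

```shell
#!/usr/bin/env bash
# Sketch: check whether an IRC client binary (ii by default) is on PATH.
# irc_client_available is a hypothetical helper name.

irc_client_available() {
    # command -v succeeds only if the name resolves to something runnable.
    command -v "${1:-ii}" >/dev/null 2>&1
}

if irc_client_available ii; then
    echo "ii found: IRC status updates should work"
else
    echo "ii not found in PATH: the reconfig will proceed without IRC updates"
fi
```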
Updating the pinned version of mozharness on mozilla-central
The revision of mozharness used by a particular branch of mozilla code is now tracked in-tree. As a courtesy to developers and sheriffs, buildduty is expected to update the pinned revision in the mozilla-central integration branch when they move the production tag. The change will be merged from mozilla-central to other branches by sheriffs as part of their normal duties.
The pinned revision is tracked in this file: http://hg.mozilla.org/mozilla-central/file/920ded6a1f77/testing/mozharness/mozharness.json
Update the revision in the file to point to the revision of the new mozharness production tag and land normally.
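The edit itself can be scripted. The sketch below assumes that mozharness.json stores the pin as a "revision": "<changeset>" key/value pair; that layout is an assumption, so check the file before relying on it. set_pinned_revision is a hypothetical helper, and landing the change still goes through the normal review and commit process.

```shell
#!/usr/bin/env bash
# Sketch: rewrite the "revision" value in a mozharness.json-style file.
# Assumes the file contains a line like:  "revision": "<hex changeset>"
# set_pinned_revision is a hypothetical helper name.

set_pinned_revision() {
    local file="$1" rev="$2" tmp status
    # Only accept a hex changeset id, so nothing odd reaches the sed expression.
    case "$rev" in
        ""|*[!0-9a-f]*) echo "not a hex revision: $rev" >&2; return 1 ;;
    esac
    tmp="$(mktemp)" || return 1
    # Write to a temp file first, then copy back so file permissions are kept.
    sed -E 's/("revision"[[:space:]]*:[[:space:]]*")[^"]*(")/\1'"$rev"'\2/' \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

For example, `set_pinned_revision testing/mozharness/mozharness.json <new-revision>` run from the top of a mozilla-central checkout would update the pin in place, where <new-revision> is the changeset the production tag now points to.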
NOTE: once all of mozharness moves in-tree, this step will be unnecessary.
If you have added or removed a platform in a way that changes tools/buildfarm/maintenance/production-masters.json, and therefore the content of master/master_config.json, you'll need to manually update the masters that this change impacts, because the end_to_end_reconfig.sh script does not perform this step. Bug 1215294 tracks adding it to the script.
cd /builds/buildbot/tests1-linux64/
export PRODUCTION_MASTERS=tools/buildfarm/maintenance/production-masters.json
python buildbot-configs/update-master-json.py $PRODUCTION_MASTERS master/master_config.json
make checkconfig
make reconfig
Help, my reconfig failed!
Assuming you're using the end_to_end_reconfig.sh script, you can resume after fixing the error. Errors and exit state can be found in the manage_masters-##########.log file, which is created in /tmp/reconfig by default.
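To find the log for the most recent attempt, something like the following sketch can be used. It assumes the logs match the glob manage_masters-*.log in the default /tmp/reconfig directory; latest_reconfig_log is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: locate the newest manage_masters log and show its last lines.
# Assumes logs match manage_masters-*.log; latest_reconfig_log is hypothetical.

latest_reconfig_log() {
    local dir="${1:-/tmp/reconfig}"
    # ls -t sorts newest first; head keeps only the most recent log.
    ls -t "$dir"/manage_masters-*.log 2>/dev/null | head -n 1
}

log="$(latest_reconfig_log)"
if [ -n "$log" ]; then
    tail -n 20 "$log"
else
    echo "no manage_masters log found"
fi
```

Replacing `tail -n 20` with `tail -f` follows the log while a reconfig is still running.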
Running the script again will yield the following menu:
* Please select one of the following options:
    1) Continue with existing reconfig (e.g. if you have resolved a merge conflict)
    2) Delete saved state for existing reconfig, and start from fresh
    3) Abort and exit reconfig process
Option 1 is usually the best choice here, especially if the hg operations actually succeeded during the previous attempt. This way the wiki/bugzilla updates will still be properly applied.
Help, my reconfig is stuck!
Reconfigs can take as little as 30 minutes to run, but can take up to 2 hours depending on how busy the systems are.
In general, Linux test masters are the slowest to reconfig. You can tail the manage_masters-##########.log file to keep up with progress. By default, this log is created in /tmp/reconfig.
If the reconfig gets stuck, see How To/Unstick a Stuck Slave From A Master.