CIDuty/Reconfigs: Difference between revisions

Jlund moved page Buildduty/Reconfigs to CIDuty/Reconfigs: changing team name
 
(24 intermediate revisions by 7 users not shown)
__TOC__
{{Release Engineering How To|Reconfigs}}


A reconfig (short for "reconfigure the buildbot masters") is how changes to buildbot configurations make it into production. CIDuty is responsible for running reconfigs. The person doing the reconfig should also update the [https://wiki.mozilla.org/ReleaseEngineering:Maintenance#Reconfigs_.2F_Deployments reconfig deployments page].


= Scheduled reconfigs =
Scheduled reconfigs are ''supposed'' to happen <b>every Monday and Thursday</b>. For each one, CIDuty merges the default branches into production and reconfigs the affected masters. [[ReleaseEngineering/Landing_Buildbot_Master_Changes|The Landing Buildbot Master Changes wiki page]] has step-by-step instructions.

In a nutshell, a reconfig consists of:
* moving the ''production'' tag for the [https://hg.mozilla.org/build/buildbot-configs buildbot-configs] repository, and the ''production-0.8'' tag for the [https://hg.mozilla.org/build/buildbotcustom buildbotcustom] to the current tip (or chosen revision)
* updating the source checkout of both repositories on all the buildbot masters
* updating the [https://hg.mozilla.org/build/tools tools] checkout on each master to the current tip
* executing a buildbot reconfig command on each buildbot master


Additional reconfigs are fine at any time: other release engineers may have important changes to land that don't coincide with the scheduled reconfigs.
Please see the [[ReleaseEngineering/How_To/Land_Buildbot_Master_Changes|instructions for how to land buildbot changes]] for more information.


It is polite to ask in #releng if anyone has further changes to land before starting the reconfig process.


While merging buildbot-configs default -> production and buildbotcustom default -> production-0.8, merge mozharness default -> production as well.
= How to reconfig =
The current state of the art is to use the [https://hg.mozilla.org/build/tools/file/default/buildfarm/maintenance/end_to_end_reconfig.sh end_to_end_reconfig.sh] script. To see all the available options the script provides:
<pre>
bash ./end_to_end_reconfig.sh -h
</pre>  


If there are changes to tools that impact the foopies, you should update to the [[ReleaseEngineering/How_To/Android_Tegras#Update_software_on_the_foopies|latest version of code on the foopies]].
The end_to_end_reconfig.sh script uses the original [https://wiki.mozilla.org/ReleaseEngineering/Managing_Buildbot_with_Fabric fabric scripts] as its core, but also takes care of updating the wiki, updating bugs affected by the merge, and updating the tools checkouts on foopies. As such, the script has a few indirect Python module dependencies:
* fabric
* requests
However, the script will try to install the dependencies automatically if they are missing.


The script was written by pmoore ({{bug|1018248}}) as a less error-prone alternative to running the [https://wiki.mozilla.org/ReleaseEngineering/Managing_Buildbot_with_Fabric Fabric] steps by hand. In order to update the wiki/bugzilla, the script also requires your credentials for those services in a config file:
 
<pre>
# Needed if updating wiki - note the wiki does *not* use your LDAP credentials.
export WIKI_USERNAME=XXX
export WIKI_PASSWORD=XXXXXXXX
# Needed if updating Bugzilla bugs to mark them as in production - *no* Persona integration - must be a native Bugzilla account.
# Details for the 'Release Engineering SlaveAPI Service' <slaveapi@mozilla.releng.tld> Bugzilla user can be found in the RelEng
# private repo, in file passwords/slaveapi-bugzilla.txt.gpg (needs decrypting with an approved gpg key).
export BUGZILLA_USERNAME='XXX@mozilla.com'
export BUGZILLA_PASSWORD='XXXXXXXXX'
# Used for slaveapi actions
export LDAP_USERNAME='XXX@mozilla.com'
export LDAP_PASSWORD='XXXXXXXX'
</pre>
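Before starting a reconfig, it can help to confirm that the config file defines every credential listed above. The following is a minimal sketch, not part of end_to_end_reconfig.sh: the function name <code>check_reconfig_config</code> is hypothetical, it assumes the config file is plain shell that is safe to source, and it assumes that an all-X value (as in the template above) means the entry has not been filled in yet.

```shell
#!/bin/bash
# Hypothetical helper (not part of end_to_end_reconfig.sh): report which of the
# credential variables from the config file are missing or still placeholders.
check_reconfig_config() {
    local config_file="${1:-$HOME/.reconfig/config}"
    if [ ! -r "$config_file" ]; then
        echo "cannot read $config_file" >&2
        return 2
    fi
    (
        # Source in a subshell so the credentials do not leak into the caller.
        . "$config_file"
        # Assumption: template values look like XXX or XXX@mozilla.com.
        placeholder='^X+(@mozilla\.com)?$'
        status=0
        for name in WIKI_USERNAME WIKI_PASSWORD BUGZILLA_USERNAME \
                    BUGZILLA_PASSWORD LDAP_USERNAME LDAP_PASSWORD; do
            value="${!name:-}"
            if [ -z "$value" ] || [[ "$value" =~ $placeholder ]]; then
                echo "not set: $name"
                status=1
            fi
        done
        exit "$status"
    )
}
```

Calling <code>check_reconfig_config</code> with no argument checks ''~/.reconfig/config''; it returns 0 only when all six variables hold non-placeholder values.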
 
The script will also:
* create a template config file (default is ''~/.reconfig/config'') if it does not exist, but you'll need to fill in your own credentials before it will work.
* attempt to update IRC with reconfig status. To do so, it uses a minimal, filesystem-based IRC client called [http://tools.suckless.org/ii/ ii]. You can [http://tools.suckless.org/ii/ download ii from their website], or install it via a package manager (e.g. <code>port install ii</code>). A failure to update IRC is non-fatal, but make sure ''ii'' is in your PATH if you want it to work.
* create a temporary folder (''/tmp/reconfig'') to store the reconfig files. When starting a new reconfig, you should delete the reconfig folder corresponding to an older run of the script before attempting to run it again.
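The last two points can be checked by hand before a fresh run. The sketch below is only an illustration and is not part of end_to_end_reconfig.sh: <code>reconfig_preflight</code> is a hypothetical name, and the default folder is the ''/tmp/reconfig'' location mentioned above.

```shell
#!/bin/bash
# Hypothetical preflight helper; not part of end_to_end_reconfig.sh.
reconfig_preflight() {
    local state_dir="${1:-/tmp/reconfig}"
    # IRC updates are optional, so a missing ii only produces a warning.
    if ! command -v ii >/dev/null 2>&1; then
        echo "warning: ii not found in PATH, IRC updates will not work" >&2
    fi
    # Remove the folder left behind by an older run of the script.
    if [ -d "$state_dir" ]; then
        rm -rf -- "$state_dir"
        echo "removed stale $state_dir"
    fi
}
```

Do not run this if you intend to resume a failed reconfig, since the logs of that run are stored in the same folder.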
 
= Updating the pinned version of mozharness on mozilla-central =
The revision of [https://hg.mozilla.org/build/mozharness mozharness] used by a particular branch of mozilla code is now tracked in-tree. As a courtesy to developers and sheriffs, CIDuty is expected to update the pinned revision in the mozilla-central integration branch when they move the <code>production</code> tag. The change will be merged from mozilla-central to other branches by sheriffs as part of their normal duties.
 
The pinned revision is tracked in this file: http://hg.mozilla.org/mozilla-central/file/920ded6a1f77/testing/mozharness/mozharness.json
 
Update the revision in the file to point to the revision of the new mozharness <code>production</code> tag and land normally.
 
NOTE: once all of mozharness moves in-tree, this step will be unnecessary.
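Editing the pinned revision can be scripted. The following is a rough sketch under stated assumptions: <code>pin_mozharness_revision</code> is a hypothetical helper, and it assumes that mozharness.json stores the pin as a single <code>"revision": "..."</code> pair on one line. Check the file in your checkout before relying on it.

```shell
#!/bin/bash
# Hypothetical helper: rewrite the "revision" value in a mozharness.json file.
# Assumes the file holds a single "revision": "<value>" pair on one line.
pin_mozharness_revision() {
    local json_file="$1" new_rev="$2"
    # Accept only hexadecimal changeset IDs so nothing unexpected reaches sed.
    if ! [[ "$new_rev" =~ ^[0-9a-f]{12,40}$ ]]; then
        echo "not a changeset ID: $new_rev" >&2
        return 1
    fi
    # Keep a .bak copy of the original file next to the edited one.
    sed -i.bak -E \
        "s/(\"revision\"[[:space:]]*:[[:space:]]*\")[^\"]*\"/\1${new_rev}\"/" \
        "$json_file"
}
```

For example, <code>pin_mozharness_revision testing/mozharness/mozharness.json "$(hg -R /path/to/mozharness id -i -r production)"</code>, where ''/path/to/mozharness'' stands for your local mozharness clone. Review the result with <code>hg diff</code> and land it normally.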
 
= Updating master/master_config.json =
 
If you have added or removed a platform in a way that changes the master/master_config.json generated from tools/buildfarm/maintenance/production-masters.json, you'll need to manually update the affected masters, because the end_to_end_reconfig.sh script does not do this step. {{bug|1215294}} was opened to add this step to the script.
 
Example:
<pre>
cd /builds/buildbot/tests1-linux64/
export PRODUCTION_MASTERS=tools/buildfarm/maintenance/production-masters.json
python buildbot-configs/update-master-json.py $PRODUCTION_MASTERS master/master_config.json
make checkconfig
make reconfig
</pre>
 
= Help, my reconfig failed! =
Assuming you're using the end_to_end_reconfig.sh script, you can resume after fixing the error. Errors and exit state can be found in the manage_masters-##########.log which is created in /tmp/reconfig by default. The hourly auto reconfig logs are in the bb dir as <tt>reconfig.{log,lock}</tt>.
 
Running the script again will yield the following menu:
 
<pre>
* Please select one of the following options:
1) Continue with existing reconfig (e.g. if you have resolved a merge conflict)
2) Delete saved state for existing reconfig, and start from fresh
3) Abort and exit reconfig process
</pre>
 
Option 1 is usually the best choice here, especially if the hg operations actually succeeded during the previous attempt. This way the wiki/bugzilla updates will still be properly applied.


= Help, my reconfig is stuck! =
Reconfigs can take as little as 30 minutes to run, but can take up to 2 hours depending on how busy the systems are.
 
In general, Linux test masters are the slowest to reconfig. You can tail the manage_masters-##########.log to keep up with progress. By default, this log is created in /tmp/reconfig. The hourly auto reconfig logs are in the bb dir as <tt>reconfig.{log,lock}</tt>.
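Since the log name contains a changing number, a small helper can pick the newest one. This is only a sketch: <code>latest_reconfig_log</code> is a hypothetical name, and it assumes the logs sit directly in the reconfig folder with names matching <code>manage_masters-*.log</code>.

```shell
#!/bin/bash
# Hypothetical helper: print the newest manage_masters log in a reconfig folder.
latest_reconfig_log() {
    local log_dir="${1:-/tmp/reconfig}"
    # ls -t sorts by modification time, newest first; keep only the first entry.
    ls -t "$log_dir"/manage_masters-*.log 2>/dev/null | head -n 1
}
```

To follow a running reconfig, use <code>tail -f "$(latest_reconfig_log)"</code>.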
 
If the reconfig gets stuck, see [[ReleaseEngineering/How_To/Unstick_a_Stuck_Slave_From_A_Master|How To/Unstick a Stuck Slave From A Master]].