ReleaseEngineering/How To/Setup a buildbot master: Difference between revisions
ChrisCooper (talk | contribs) m (ChrisCooper moved page ReleaseEngineering/Master Setup to ReleaseEngineering/How To/Setup a buildbot master: Striving for some consistency in HowTo docs) |
ChrisCooper (talk | contribs) No edit summary |
Revision as of 22:22, 16 June 2015
This page describes how to set-up a new Buildbot Master.
AWS Masters
AWS master setup is covered by ReleaseEngineering/AWS_Master_Setup
Production masters
For buildbot masters that are intended to be doing production builds, tests, etc.
Hardware
- Current policy is one buildbot master instance per VM
- 64-bit guest
- 2 virtual CPUs
- 6 GB RAM
- 6 GB swap
- 30GB partition mounted at /
- 100MB partition mounted at /boot
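The sizing above can be spot-checked on a freshly built VM. What follows is a minimal sketch, not part of the documented procedure: the function names (kb_to_gb, meminfo_kb, report_hardware) are invented for illustration, and it assumes a Linux guest that exposes /proc/cpuinfo and /proc/meminfo. Sizes are rounded to the nearest GB because the kernel typically reports slightly less memory than the VM was allocated.

```shell
#!/bin/sh
# Hypothetical helpers for comparing a guest against the sizing listed above.

# Round a size in kB (the unit used by /proc/meminfo) to the nearest whole GB.
kb_to_gb() {
    echo $(( ($1 + 524288) / 1048576 ))
}

# Print the kB value of one meminfo field, e.g. MemTotal or SwapTotal.
# The optional second argument is the file to read (default: /proc/meminfo).
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Print what the guest reports next to the expected values.
report_hardware() {
    echo "cpus: $(grep -c '^processor' /proc/cpuinfo) (expected 2)"
    echo "ram:  $(kb_to_gb "$(meminfo_kb MemTotal)") GB (expected 6)"
    echo "swap: $(kb_to_gb "$(meminfo_kb SwapTotal)") GB (expected 6)"
}
```

Calling report_hardware on the new VM prints the three values for a manual comparison; partition sizes can be checked separately with df -h.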
OS
- Install CentOS 5.5
- Make sure the hostname is set to the fully qualified name. This can be changed in /etc/sysconfig/network and with the hostname command.
- Install puppet
rpm -Uvh http://dl.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm
yum install puppet
- (optional, but recommended if this isn't "just another box install" for any reason) Run puppetd manually until no further work is required. (If you don't have or need your own puppet environment on the puppet master, just remove the --environment option below.)
# get the hostname & key issues out of the way - repeat until no more
# "warning: peer certificate won't be verified in this SSL session" messages.
# (you'll need to do the master-puppet1 signing listed below)
puppetd --test --server master-puppet1.build.scl1.mozilla.com --environment YOUR_ENV --noop
# now work on actual config issues
# start in directory accessible to all users (not '/root')
cd /tmp
puppetd --test --server master-puppet1.build.scl1.mozilla.com --environment YOUR_ENV
# repeat above until you get no actions taken or error reported
- Configure the daemon to point to master-puppet1.build.mozilla.org (regardless of which datacenter the master resides in)
vim /etc/sysconfig/puppet
PUPPET_SERVER=master-puppet1.build.scl1.mozilla.com
- Run these:
chkconfig puppet on
/etc/init.d/puppet start
# On master-puppet1:
puppetca --sign your-new-master.build.scl1.mozilla.com
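Two of the OS steps above, checking the hostname and setting PUPPET_SERVER, can be scripted. The sketch below is one possible approach rather than the established procedure: is_fqdn and set_puppet_server are hypothetical helper names, and the "at least one dot, no empty labels" test is only a rough stand-in for "fully qualified".

```shell
#!/bin/sh
# Hypothetical helpers for the hostname and puppet daemon steps.

# Succeed if the name has at least one dot and no empty labels.
is_fqdn() {
    case "$1" in
        .*|*.|*..*) return 1 ;;  # leading dot, trailing dot, or empty label
        *.*) return 0 ;;
        *) return 1 ;;
    esac
}

# set_puppet_server FILE SERVER
# Replace an existing PUPPET_SERVER= line in FILE, or append one if none exists.
set_puppet_server() {
    file=$1
    server=$2
    if grep -q '^PUPPET_SERVER=' "$file" 2>/dev/null; then
        # Write through a temp file, then copy back so FILE keeps its permissions.
        sed "s|^PUPPET_SERVER=.*|PUPPET_SERVER=$server|" "$file" > "$file.tmp" &&
            cat "$file.tmp" > "$file" &&
            rm -f "$file.tmp"
    else
        printf 'PUPPET_SERVER=%s\n' "$server" >> "$file"
    fi
}

# Example use (as root on the new master):
# is_fqdn "$(hostname)" || echo "hostname is not fully qualified" >&2
# set_puppet_server /etc/sysconfig/puppet master-puppet1.build.scl1.mozilla.com
```

Running set_puppet_server twice leaves a single PUPPET_SERVER line, so it is safe to repeat.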
Support files, Wikis
Update production-masters.json in tools.
Puppet manifests
- Make sure your masters are listed in buildmaster-production.pp
When you're ready, update the manifests on the master with:
hg -R /etc/puppet/manifests pull
hg -R /etc/puppet/manifests update
Once the manifests are updated the masters' build dirs should be automatically created.
- For build masters, add the master's IP to secrets::network::masterIPs in master-puppet1:/etc/puppet/manifests/secrets.pp. The signing instances will need to be reloaded. See [1]
Add masters to slavealloc
See Adding your master to slavealloc
Follow the steps for AWS masters.
SSH Keys
- Copy production ssh keys (for ffxbld, trybld, xrbld and tbirdbld) and known_hosts to ~/.ssh.
- Verify as described below
- Add the master's ssh key to known_hosts (to make release-runner work):
- dump the key by running the following:
ssh-keyscan $master_name
- add it to /N/production/home/cltbld/.ssh/known_hosts on master-puppet1.build.scl1.mozilla.com
- add it to bm36:~cltbld/.ssh/known_hosts (puppet doesn't work here)
- TODO: add steps for puppetagain
- verify the change the same way as for AWS masters.
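The known_hosts step above is easy to repeat by accident, which leaves duplicate entries. The following sketch appends a scanned key only if that exact line is not already present. add_known_host is a hypothetical helper name, and the loop in the comments assumes $master_name is set as in the ssh-keyscan command above.

```shell
#!/bin/sh
# add_known_host FILE ENTRY
# Append ENTRY to FILE unless an identical line is already there.
add_known_host() {
    file=$1
    entry=$2
    [ -n "$entry" ] || return 1  # refuse to add an empty line
    if ! grep -qxF -- "$entry" "$file" 2>/dev/null; then
        printf '%s\n' "$entry" >> "$file"
    fi
}

# Example use: ssh-keyscan may print several key lines, so add them one by one.
# ssh-keyscan "$master_name" 2>/dev/null | while IFS= read -r line; do
#     add_known_host ~cltbld/.ssh/known_hosts "$line"
# done
```

Because the comparison is on the whole line, a host whose key has changed gets a second entry rather than a replacement; the stale line still has to be removed by hand.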
Lock a slave and let it take jobs
Lock a slave to the newly set-up master through slavealloc. Let it run for a couple of hours and check that its jobs complete successfully. Also watch the #buildduty channel for nagios checks going off (e.g. the queue directories checks).
If no issues are found then go ahead and enable the master on slavealloc.
Final Verification
We don't "burn in" buildbot masters - they will go directly to their assigned roles. The following steps should be performed to ensure the rest has worked okay (all steps should be run as user cltbld):
- SSH verification (and associated netflows).
$ for h in ffxbld trybld xrbld tbirdbld; do ssh -i ~/.ssh/${h}_dsa $h@stage.mozilla.org id ; done
$ for h in ffxbld; do ssh -i ~/.ssh/${h}_dsa $h@pvtbuilds2.dmz.scl3.mozilla.com id ; done
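- MySQL verification (and associated netflows). An "Access denied" error is the expected result here, since it shows the database server was reached: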
$ mysql -h buildbot-ro-vip.db.scl3.mozilla.com
ERROR 1045 (28000): Access denied for user 'cltbld'@'10.22.70.209' (using password: NO)
$ mysql -h buildbot-rw-vip.db.scl3.mozilla.com
ERROR 1045 (28000): Access denied for user 'cltbld'@'10.22.70.209' (using password: NO)
- puppet verification (check server, as well as running status)
[cltbld@buildbot-master35 ~]$ /sbin/chkconfig --list puppet
puppet 0:off 1:off 2:on 3:on 4:on 5:on 6:off
[cltbld@buildbot-master35 ~]$ ps $(pgrep puppet)
PID TTY STAT TIME COMMAND
20751 ? Ssl 0:02 /usr/bin/ruby /usr/sbin/puppetd --server=master-puppet1.build.scl1.mozilla
[cltbld@buildbot-master35 ~]$
- Ensure nagios checks are all green and notifications are enabled, e.g.
http://nagios1.private.releng.scl3.mozilla.com/releng-scl3/cgi-bin/status.cgi?navbarsearch=1&host=buildbot-master43
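The SSH and MySQL checks above lend themselves to a small wrapper that prints one OK or FAIL line per check. This is only a sketch under stated assumptions: run_check and classify_mysql_output are hypothetical helper names, and the classifier relies on the standard MySQL client error codes, where 1045 (access denied) means the server answered and 2003 (can't connect) suggests a missing netflow.

```shell
#!/bin/sh
# Hypothetical wrapper around the verification commands listed above.
failures=0

# run_check LABEL COMMAND [ARGS...]
# Run COMMAND quietly, print OK or FAIL with the label, and count failures.
run_check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "OK   $label"
    else
        echo "FAIL $label"
        failures=$((failures + 1))
    fi
}

# classify_mysql_output TEXT
# Interpret the message printed by the mysql client.
classify_mysql_output() {
    case "$1" in
        *"ERROR 1045"*) echo reachable ;;    # access denied: the server answered
        *"ERROR 2003"*) echo unreachable ;;  # can't connect: check the netflow
        *) echo unknown ;;
    esac
}

# Example use, run as cltbld (commented out because it needs the production network):
# for h in ffxbld trybld xrbld tbirdbld; do
#     run_check "ssh $h@stage" ssh -o BatchMode=yes -i ~/.ssh/${h}_dsa "$h@stage.mozilla.org" id
# done
# classify_mysql_output "$(mysql -h buildbot-ro-vip.db.scl3.mozilla.com 2>&1 </dev/null)"
# echo "$failures check(s) failed"
```

BatchMode=yes makes ssh fail instead of prompting for a password, so a missing or wrong key shows up as FAIL rather than hanging the script.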
Personal / development masters
See ReleaseEngineering/How To/Setup Personal Development Master