ReleaseEngineering/How To/Setup a buildbot master
Revision as of 00:04, 10 April 2013
This page describes how to set up a new Buildbot master.
AWS Masters
AWS master setup is covered by ReleaseEngineering/AWS_Master_Setup
Production masters
These are buildbot masters intended to run production builds, tests, etc.
Hardware
- Current policy is one buildbot master instance per VM
- 64-bit guest
- 2 virtual CPUs
- 6 GB RAM
- 6 GB swap
- 30GB partition mounted at /
- 100MB partition mounted at /boot
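The CPU, RAM and swap figures above can be compared against what the guest actually reports. The following is a minimal sketch, not part of any existing releng tooling: the function names are hypothetical, and the 5800 MB thresholds are an assumption chosen because the kernel typically reports slightly less than the nominal 6 GB.

```shell
#!/bin/bash
# Hypothetical helpers for a quick sanity check of a new master VM.

# Prints OK/LOW for one value and fails when it is below the requirement.
check_min() {
    local label=$1 actual=$2 required=$3
    if [ "$actual" -ge "$required" ]; then
        echo "OK: $label = $actual (need >= $required)"
    else
        echo "LOW: $label = $actual (need >= $required)"
        return 1
    fi
}

# Reads CPU count, RAM and swap from /proc and checks each one.
report_hardware() {
    local cpus mem_mb swap_mb status=0
    cpus=$(grep -c '^processor' /proc/cpuinfo)
    mem_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
    swap_mb=$(awk '/^SwapTotal:/ {print int($2 / 1024)}' /proc/meminfo)
    check_min "virtual CPUs" "$cpus" 2 || status=1
    # Assumed threshold: a little under 6144 MB to allow for reserved memory.
    check_min "RAM (MB)" "$mem_mb" 5800 || status=1
    check_min "swap (MB)" "$swap_mb" 5800 || status=1
    return $status
}
```

Calling `report_hardware` on the new guest prints one line per item; partition sizes are not covered and can be read from `df -h`.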
OS
- Install CentOS 5.5
- Make sure the hostname is set to the fully qualified name. This can be changed in /etc/sysconfig/network and with the hostname command.
- Install puppet
rpm -Uvh http://dl.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm
yum install puppet
- (optional - recommended if this isn't "just another box install" for any reason) Run puppetd manually until no further work is required. (If you don't have or need your own puppet environment on the puppet master, remove the --environment option below.)
# get the hostname & key issues out of the way - repeat until no more
# "warning: peer certificate won't be verified in this SSL session" messages.
# (you'll need to do the master-puppet1 signing listed below)
puppetd --test --server master-puppet1.build.scl1.mozilla.com --environment YOUR_ENV --noop
# now work on actual config issues
# start in directory accessible to all users (not '/root')
cd /tmp
puppetd --test --server master-puppet1.build.scl1.mozilla.com --environment YOUR_ENV
# repeat above until you get no actions taken or error reported
- Configure the daemon to point to master-puppet1.build.mozilla.org (regardless of which datacenter the master resides in). Edit the file and set the server:
vim /etc/sysconfig/puppet
PUPPET_SERVER=master-puppet1.build.scl1.mozilla.com
- Run these:
chkconfig puppet on
/etc/init.d/puppet start
# On master-puppet1:
puppetca --sign your-new-master.build.scl1.mozilla.com
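The hostname requirement at the start of this list matters because the puppet certificate is signed under the fully qualified name. A small check along the following lines could be run before starting puppet. It is only a sketch: `is_fqdn` and `check_hostname` are hypothetical names, and "fully qualified" is approximated as "dot-separated labels with at least one dot".

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the argument looks like a fully qualified
# name, i.e. two or more labels of letters, digits or hyphens joined by dots.
is_fqdn() {
    local pattern='^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$'
    [[ $1 =~ $pattern ]]
}

# Checks the name reported by the hostname command on this machine.
check_hostname() {
    local name
    name=$(hostname)
    if is_fqdn "$name"; then
        echo "OK: hostname is $name"
    else
        echo "WARNING: hostname '$name' does not look fully qualified" >&2
        return 1
    fi
}
```

Running `check_hostname` on the new master prints a warning when only the short name is set, in which case /etc/sysconfig/network needs updating as described above.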
Support files, Wikis
Update production-masters.json in tools.
Puppet manifests
- Make sure your masters are listed in buildmaster-production.pp
When you're ready, update the manifests on the master with:
hg -R /etc/puppet/manifests pull
hg -R /etc/puppet/manifests update
Once the manifests are updated the masters' build dirs should be automatically created.
- For build masters, add the master's IP to secrets::network::masterIPs in master-puppet1:/etc/puppet/manifests/secrets.pp. The signing instances will need to be reloaded. See [1]
Add masters to slavealloc
See Adding your master to slavealloc
Lock a slave and let it take jobs
Use slavealloc to lock a slave to the newly set-up master. Let it run for a couple of hours and check that the jobs completed successfully. Also watch the #buildduty channel for nagios checks going off (e.g. queue directory checks).
If no issues are found then go ahead and enable the master on slavealloc.
File separate bugs for Nagios (e.g. bug 717804) and MySQL access (e.g. bug 717806):
- Nagios
- PING
- Swap
- avg load
- buildbot
- disk - /
- disk - /builds
- MySQL access to the DB server
- Verify that the master can send mail to tinderbox via dm-mail01. See e.g. bug 717808
SSH Keys
- Copy production ssh keys (for ffxbld, trybld, xrbld and tbirdbld) and known_hosts to ~/.ssh.
- Verify as described below
- Add the master's ssh key to known_hosts (to make release-runner work):
- dump the key by running the following:
ssh-keyscan $master_name
- add it to /N/production/home/cltbld/.ssh/known_hosts on master-puppet1.build.scl1.mozilla.com
- add it to bm36:~cltbld/.ssh/known_hosts (puppet doesn't work here)
- TODO: add steps for puppetagain
- verify the change (ssh to masters as cltbld from bm36)
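Because the key has to be added to more than one known_hosts file by hand, it helps if the append step is safe to repeat. The following is a minimal sketch under that assumption; `add_host_key` and `scan_and_add` are hypothetical helper names, and only `ssh-keyscan` itself comes from the steps above.

```shell
#!/bin/bash
# Hypothetical helper: append one key line to a known_hosts file,
# skipping it when an identical line is already present.
add_host_key() {
    local known_hosts=$1 key_line=$2
    touch "$known_hosts"
    # -x matches whole lines, -F treats the key as a fixed string
    if grep -qxF -- "$key_line" "$known_hosts"; then
        return 0
    fi
    printf '%s\n' "$key_line" >> "$known_hosts"
}

# Scan a master and add every key it reports to the given file.
scan_and_add() {
    local master_name=$1 known_hosts=$2 line
    ssh-keyscan "$master_name" 2>/dev/null | while IFS= read -r line; do
        add_host_key "$known_hosts" "$line"
    done
}
```

For example, `scan_and_add "$master_name" ~cltbld/.ssh/known_hosts` could be run on bm36; running it a second time leaves the file unchanged.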
Final Verification
We don't "burn in" buildbot masters; they go directly to their assigned roles. Perform the following steps to confirm that the setup above worked (run all steps as user cltbld):
- SSH verification (and associated netflows).
$ for h in ffxbld trybld xrbld tbirdbld; do ssh -i ~/.ssh/${h}_dsa $h@stage.mozilla.org id ; done
$ for h in ffxbld; do ssh -i ~/.ssh/${h}_dsa $h@pvtbuilds2.dmz.scl3.mozilla.com id ; done
- MySQL verification (and associated netflows):
$ mysql -h buildbot-ro-vip.db.scl3.mozilla.com
ERROR 1045 (28000): Access denied for user 'cltbld'@'10.22.70.209' (using password: NO)
$ mysql -h buildbot-rw-vip.db.scl3.mozilla.com
ERROR 1045 (28000): Access denied for user 'cltbld'@'10.22.70.209' (using password: NO)
- puppet verification (check server, as well as running status)
[cltbld@buildbot-master35 ~]$ /sbin/chkconfig --list puppet
puppet 0:off 1:off 2:on 3:on 4:on 5:on 6:off
[cltbld@buildbot-master35 ~]$ ps $(pgrep puppet)
PID TTY STAT TIME COMMAND
20751 ? Ssl 0:02 /usr/bin/ruby /usr/sbin/puppetd --server=master-puppet1.build.scl1.mozilla
[cltbld@buildbot-master35 ~]$
- ensure nagios checks are all green and notifications are enabled (aka not disabled), eg
http://nagios1.private.releng.scl3.mozilla.com/releng-scl3/cgi-bin/status.cgi?navbarsearch=1&host=buildbot-master43
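Two of the checks above rely on reading command output rather than exit codes. In the MySQL check, an "Access denied" error (1045) presumably counts as success because it shows the server was reached, whereas client errors 2003 or 2005 would indicate a connection or name-resolution problem. The sketch below encodes that interpretation along with the chkconfig check; both function names are hypothetical.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not part of any releng tooling.

# Succeeds when a `chkconfig --list puppet` line shows runlevels 2-5 "on".
puppet_enabled() {
    local line level
    # chkconfig separates fields with tabs; turn them into spaces
    line=$(printf '%s' "$1" | tr '\t' ' ')
    for level in 2 3 4 5; do
        case " $line " in
            *" $level:on "*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}

# Classifies mysql client output as a netflow result.
# ERROR 1045 (access denied): the server answered, so the network path works.
# ERROR 2003 / 2005: the client could not connect to or resolve the host.
mysql_netflow_status() {
    case "$1" in
        *"ERROR 1045"*) echo "reachable" ;;
        *"ERROR 2003"*|*"ERROR 2005"*) echo "unreachable" ;;
        *) echo "unknown" ;;
    esac
}
```

Typical use would be `puppet_enabled "$(/sbin/chkconfig --list puppet)"` and `mysql_netflow_status "$(mysql -h buildbot-ro-vip.db.scl3.mozilla.com 2>&1)"`, run as cltbld on the new master.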
Personal / development masters
See ReleaseEngineering/How To/Setup Personal Development Master