ReleaseEngineering/Puppet/Usage
This document is intended to serve as a guide to interacting with our Puppet servers and manifests.
Definitions
- Type - Puppet documentation talks a lot about this. Each different "type" deals with a different aspect of the system. For example, the "user" type can do most things related to user management (passwords, UID/GID, home directories, shells, etc.), while the "package" type deals with package management (e.g., apt, rpm, fink). And so on.
Masters
We currently have four masters, with no real rhyme or reason to their hostnames:
- staging-puppet.build.mozilla.org (staging, in MPT)
- production-puppet.build.mozilla.org (MPT)
- mv-production-puppet.build.mozilla.org (MV)
- scl-production-puppet.build.scl1.mozilla.com (SCL)
Note that staging-puppet and production-puppet share an NFS-mounted /N, while the other machines use local storage.
The Slave-Master Link
You can find out which puppet master a slave connects to by checking the contents of one of these files:
# for linux testers (fedora)
~cltbld/.config/autostart/gnome-terminal.desktop
# for linux builders (centos)
/etc/sysconfig/puppet
# for osx
/Library/LaunchDaemons/com.reductivelabs.puppet.plist
If slaves have to be moved between masters, be sure to remove the certs after you modify this file and before their next reboot. You may also need to run 'puppetca --clean <FQDN>' on the new puppet master.
# for linux
rm -rf /var/lib/puppet/ssl
# for mac
rm -rf /etc/puppet/ssl
Our Puppet Manifests
Our puppet manifests are organized into a few different parts:
- Site files
- Basic includes
- Packages that make changes
Site Files & Basic Includes
Each Puppet master has its own site file which contains a few things:
- Variable definitions specific to that master
- Import statements which load other parts of the manifests
- Node (slave) definitions
The basic includes are located in the 'base' directory. These files set variables which are referenced in the packages, as well as base nodes for slaves.
The most important variables to take note of are:
- ${platform_fileroot} -- Used wherever the puppet:// protocol is supported, most notably with the File type.
- ${platform_httproot} -- Used with the Package type and other places that don't support puppet://
There are also ${local_[file,http]} variables which point to the 'local' directory inside of each platform's root. See the following section for more on that.
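As an illustration, a hypothetical file resource using ${platform_fileroot} might look like the following (a sketch only; the path and resource name are made up, not taken from our manifests):
file {
    "/etc/example.conf":
        # File supports the puppet:// protocol, so ${platform_fileroot} applies here
        source => "${platform_fileroot}/etc/example.conf";
}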
We have a few base nodes shared by multiple pools of slaves as well as a base node for each concrete slave type. The shared ones are:
- "slave" -- For things common to ALL slaves managed by Puppet
- "build" -- For things common to all build slaves
- "test" -- For things common to all test slaves
There are two different types of concrete nodes. Firstly, we have "$platform-$arch-$type" nodes, which are used on all Puppet masters for slaves which are local to them. Two examples are: "centos5-i686-build" (32-bit CentOS 5 build slaves) and "darwin10-i386-test" (32-bit Mac 10.6 test slaves). Secondly, there are "$location-$type-node" nodes, which only apply to the MPT master. All nodes which are not local to MPT production are listed in its configuration file as this type of node. These nodes ensure that new slaves get redirected to their local master when they first come up. Examples include "mv-build-node" and "staging-test-node".
See base/nodes.pp for the full listing of nodes.
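For illustration, a concrete node definition might look roughly like this (a hedged sketch; the real definitions live in base/nodes.pp and will differ):
node "centos5-i686-build" inherits "build" {
    # classes and packages specific to 32-bit CentOS 5 build slaves go here
}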
Packages
- The site-{staging,production}.pp files declare the list of slaves, and each slave definition specifies which classes to include.
- The classes buildslave.pp and staging-buildslave.pp include most of the packages (devtools, nagios, mercurial, buildbot, extras, etc) we want.
- The packages can contain resources of different "types", such as "exec", "user", "package", "file", and "service".
Puppet Files
The files that Puppet serves up (using File) are in /N on each puppet master. The MPT masters share this via an NFS mount, so it's easy to sync files from staging to MPT production. The other servers have a local copy of this data.
The first 3 levels of the drive are laid out as follows:
$level/$os-$hardwaremodel/$slaveType
- $level is the support level (production, staging, pre-production)
- $os is generally one of 'centos5', 'fedora12', 'darwin9', or 'darwin10'.
- $hardwaremodel is whatever 'facter' identifies the machine's CPU as (x86_64, i686, i386, etc).
- $slaveType is the "type" of the slave: 'build', 'test', 'stage', 'master', etc.
Below '$slaveType' are all of the files served by Puppet. They are organized according to where they'll end up on the slave. For example, if /etc/X11/fonts.conf is to be synced to the slave, it should live in:
etc/X11/fonts.conf
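Putting the full layout together, the production copy of that file for a 64-bit CentOS 5 build slave would live at:
/N/production/centos5-x86_64/build/etc/X11/fonts.conf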
There are two special directories for each level/os/hardwaremodel/type combination, too:
- local -- This directory contains files which should NOT be synced between staging <-> production or between different locations. Files such as the Puppet configs which have different contents depending on location and support level live here.
- DMGs (Mac) / RPMs (Fedora/CentOS) -- These directories contain platform specific packages which Puppet installs.
Common Use Cases
Updating a password
Passwords are stored in a hashed format alongside other user information. We do not put the hashes in a public location for reasons that are hopefully obvious; please make sure you don't do this by accident.
Let's say you want to update cltbld's password. First, you need to generate the new hash. You can do that by running the following:
makepasswd --clearfrom=- --crypt-md5
# now type the password and hit ^D a couple of times
Now, copy and paste that hash into /etc/puppet/manifests/build/cltbld.pp as the 'password' for the cltbld user. Do this on all active puppet masters.
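For reference, the relevant stanza looks roughly like this (a hedged sketch; the hash shown is a placeholder, and the real resource may set other attributes):
user {
    "cltbld":
        # paste the hash produced by makepasswd here (placeholder shown)
        password => '$1$examplesalt$xxxxxxxxxxxxxxxxxxxxxx';
}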
Installing a Package
After pushing file deployment over NFS to its limit, we replaced it with native package formats for software deployment. This switch was made around June 2010.
RPM (CentOS, Fedora)
We use a combination of 3rd party and in-house RPMs to deploy to our Linux machines. On the manifest side we use the 'rpm' package provider wrapped in a custom type to ensure installation.
To build or upgrade a homegrown RPM, see ReleaseEngineering/How To/Create a new RPM.
Manifests
The manifests are pretty simple once you have an RPM. We use a wrapper type called 'install_rpm' to perform installation. You can use it as follows:
install_rpm {
    "gcc433-4.3.3-0moz1":
        creates => "/tools/gcc-4.3.3/installed/bin/gcc",
        pkgname => "gcc433";
}
The name needs to match the package name + version. Note that RPM requires a 'vendor' version, which is where the 0moz1 comes from. 'creates' needs to be a file that the package creates, preferably the last one to get installed.
DMG+pkg (Mac)
Mac machines use pkg installers wrapped in a DMG file as a package format. On the manifest side, we use the 'pkgdmg' package provider wrapped in a custom type to deploy them. For things which are distributed in a DMG+pkg (such as Xcode) you can skip down to the manifests.
When an upstream DMG file is not available, it needs to be created by hand. To do this, we do a manual installation once and then use a script to create the DMG+pkg. Here's an example, which creates a Python 2.5.2 DMG:
# The installation
tar jxvf Python-2.5.2.tar.bz2
cd Python-2.5.2
./configure --prefix=/tools/python-2.5.2
make
make install
cd ..
# DMG creation
hg clone http://hg.mozilla.org/build/puppet-manifests
./puppet-manifests/create-dmg.sh /tools/python-2.5.2 python-2.5.2 python /tools
The first argument to create-dmg.sh is the directory to package, which will include the directory itself. The second argument is the name to use in the DMG/pkg filenames. The third is the string to use in the package identifier; it must be alphanumeric only. The fourth is the directory to install the package to.
On the manifests side of things a simple use of the install_dmg type will ensure a package gets installed:
install_dmg {
    "python-2.5.2.dmg":
        creates => "/tools/python-2.5.2/share";
}
The argument to "creates" should be one of the last files that will be created by the package. Internally, install_dmg checks for this file and marks the package as installed if it exists, skipping installation.
If you intend to use a package on multiple platforms, always test on each of them before rolling out any manifest changes. When in doubt, create a package on each target platform.
Testing
Before you test on the Puppet server it's good to run the 'test-manifests.sh' script locally. This script will test the syntax of the manifest files and catch very basic issues. It will not catch any issues with run-time code such as Execs.
Testing of updates is done with staging-puppet.build.mozilla.org and staging slaves. You should book staging-puppet, as well as any slaves you intend to test on, before making any changes to the manifests on the Puppet server. All Puppet server work is done as the root user.
Setting up the server
If you've never used the Puppet server before you'll want to start a clone of the manifests for yourself. You can clone the main manifests repo or your own user repo to a directory under /etc/puppet. Once you have your clone, two edits are necessary (a combined sketch follows this list):
- Copy the password hash into your clone's build/cltbld.pp. This can be done with the following command, run from the root of your clone:
hg -R /etc/puppet/manifests.real diff /etc/puppet/manifests.real/build/cltbld.pp | patch -p1
or more easily
patch -p1 < /etc/puppet/password
- Comment out all of the "node" entries in staging.pp, except for those which you have booked.
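Putting it together, a hedged sketch of the full setup (the 'manifests-test' directory name is illustrative, not a convention):
cd /etc/puppet
# clone the main manifests repo (or your own user repo)
hg clone http://hg.mozilla.org/build/puppet-manifests manifests-test
cd manifests-test
# copy in the password hash
patch -p1 < /etc/puppet/password
# then comment out any unbooked node entries in staging.pp by hand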
If you have a patch to apply to the repository now is the time to do it.
Finally, if your changes involve edits to any files served by Puppet, apply those changes in the appropriate places under /N/staging.
Staging environments do not have the site.pp manifest. When testing in a staging environment, symlink site.pp to staging.pp with the following command:
ln -s staging.pp site.pp
Once all of that is done you can swap your manifests in by adjusting the symlink on /etc/puppet/manifests. If you've added new files or changed staging-fileserver.conf you'll need to restart the Puppetmaster process with:
service puppetmaster restart
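The symlink swap itself might look like this (a sketch, assuming your clone is at /etc/puppet/manifests-test as in the setup sketch above):
cd /etc/puppet
rm manifests
ln -s manifests-test manifests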
Now, you're ready to test.
Testing a slave
Puppet needs to run as root on the slaves, so equip yourself thusly and run the following command:
puppetd --test --server staging-puppet.build.mozilla.org --logdest console --noop
This will pull updated manifests from the server, see what needs to be done, and output that. The --noop argument tells Puppet to not make any changes to the slave. Once you're satisfied with the output of that, you can run it without the --noop to have Puppet make the changes. The output should be coloured, and indicate success/fail/exception.
If you're encountering errors or weird behaviour and the normal output isn't sufficient for debugging you can enhance it with --evaltrace and --debug. Together, they will print out every command that Puppet runs, including things which are used to determine whether a file or package needs updating.
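For example, combining those flags with the test command above:
puppetd --test --server staging-puppet.build.mozilla.org --logdest console --noop --evaltrace --debug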
Forcing a package re-install
Especially when testing, you may have to iterate on a single package install to get it right. If you need to re-install an existing package, you'll need to remove the package contents and/or the marker file that flags that package as installed.
- Linux: packages installed as RPMs should be removed as one normally would for an RPM, i.e. 'rpm -e rpmname', which will delete all of the files and remove the package from the db, or 'rpm -e --justdb rpmname', which will leave all of the files in place but remove the package from the db.
- Mac: manually clean up the installed files, and remove the marker file for your package. The marker file lives under /var/db/ and will be named .puppet_pkgdmg_installed_pkgname.dmg.
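For example, to force a re-install of the Python package from earlier:
rm /var/db/.puppet_pkgdmg_installed_python-2.5.2.dmg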
You can now re-test your package install with the command above, i.e. puppetd --test ....
Cleaning up
Once you're finished testing the manifests symlink needs to be re-adjusted with:
cd /etc/puppet
rm manifests
ln -s manifests.real manifests
Moving file updates to production
Production Puppet Masters:
- production-puppet.build.mozilla.org (aka mpt-production-puppet.build.mozilla.org)
- mv-production-puppet.build.mozilla.org
- scl-production-puppet.build.scl1.mozilla.com
NOTE: there are a lot of files that differ between the various directories, so using rsync involves a lot of whack-a-mole to avoid syncing files that aren't part of your change. It may be easier to simply use 'cp' for this step.
When you're ready to land in production it's important to sync your files from staging to ensure you don't end up with a different result in production. Here's the process to do that. On production-puppet as root, run:
rsync -n --delete -av --include="**usr/local" --exclude=local /N/staging/ /N/production/
After verifying that only the things you want are being synced, run it without -n to push them for real:
rsync --delete -av --include="**usr/local" --exclude=local /N/staging/ /N/production/
If there are things that shouldn't be synced, carefully adjust the rsync command with --exclude or more specific paths.
Once you've landed into /N/production on production-puppet, the other production puppet masters need to be updated. In theory, this is done as 'filesync', but that user does not have permission to update the relevant directories, so in practice I suspect it's done as root. Anyway, here's the example:
sudo su - filesync
rsync -av --exclude=**/local/etc/sysconfig/puppet* --exclude=**/local/Library/LaunchDaemons/com.reductivelabs.puppet.plist* --exclude=**/local/home/cltbld/.config/autostart/gnome-terminal.desktop* --delete filesync@production-puppet.build.mozilla.org:/N/production/ /N/production/
Again, rsync is finicky, so scp may be your friend here:
scp {root@production-puppet.build.mozilla.org:/N/production,/N/production}/darwin9-i386/build/Library/Preferences/com.apple.Bluetooth.plist
When you're ready, update the manifests on the masters with:
hg -R /etc/puppet/manifests pull
hg -R /etc/puppet/manifests update
Note that some changes may require manifest updates first - think carefully about the intermediate state and what it will do to slaves!
Be sure to do this on all Puppet masters.
Moving slaves between staging/production
If you need to move slaves between staging and production, you'll need to delete the existing ssl certs on the slave so it can properly sync with the new puppet master. These certs can be found under /etc/puppet/ssl on mac or /var/lib/puppet/ssl on linux.
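That is, per the commands earlier in this document:
# for linux
rm -rf /var/lib/puppet/ssl
# for mac
rm -rf /etc/puppet/ssl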
Documentation/Links
Puppet has reasonably complete documentation, although navigating it can be a challenge.
- Type Reference
- Metaparameter (parameters which apply to all types) Reference
- Puppet Command-line and Configuration Reference
Troubleshooting
Puppet sometimes gets itself into a weird state that needs manual intervention.
Parsing YAML
Could not parse YAML data for node linux-ix-ref.build.mozilla.org: syntax error on line 72, col -1
Try deleting the corresponding file in /var/lib/puppet/yaml/node/
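For the example error above, that would be something like the following (assuming the file is named after the host's FQDN):
rm /var/lib/puppet/yaml/node/linux-ix-ref.build.mozilla.org.yaml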