ReleaseEngineering/Puppet/Usage

Warning: This page documents (mostly) the old release engineering puppet deployment. See ReleaseEngineering/PuppetAgain for documentation of the current deployment.

This document is intended to serve as a guide to interacting with our Puppet servers and manifests.

Definitions

  • Type - Puppet documentation talks a lot about this. Each "type" deals with a different aspect of the system. For example, the "user" type can do most things related to user management (passwords, UID/GID, homedirs, shells, etc.). The "package" type deals with package management (e.g. apt, rpm, fink). And so on.

Masters

We currently have four masters, with no real rhyme or reason to their hostnames:

  • staging-puppet.build.mozilla.org (staging, in MPT)
  • production-puppet.build.mozilla.org (MPT)
  • mv-production-puppet.build.mozilla.org (MV)
  • scl-production-puppet.build.scl1.mozilla.com (SCL)

Note that staging-puppet and production-puppet share an NFS-mounted /N, while the other machines use local storage.

The Slave-Master Link

You can find out which puppet master a slave connects to by checking the contents of the relevant file:

# for linux testers (fedora)
~cltbld/.config/autostart/gnome-terminal.desktop
# for linux builders (centos)
/etc/sysconfig/puppet
# for osx
/Library/LaunchDaemons/com.reductivelabs.puppet.plist

If slaves have to be moved between masters, be sure to remove the certs after you modify this file and before their next reboot. You may also need to run 'puppetca --clean <FQDN>' on the new puppet master.

# for linux
rm -rf /var/lib/puppet/ssl
# for mac
rm -rf /etc/puppet/ssl
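
The two locations above can be wrapped in a small helper so that the right directory is picked per platform. This is only a sketch: the function name puppet_ssl_dir is hypothetical, and it assumes 'uname -s' reports Darwin on Mac slaves and Linux on Linux slaves. It prints the path and leaves the actual 'rm -rf' to you.

```shell
# Print the Puppet SSL directory for a platform (paths taken from the text above).
# puppet_ssl_dir is a hypothetical helper name, not part of Puppet.
puppet_ssl_dir() {
    # $1: kernel name; defaults to this machine's `uname -s`
    case "${1:-$(uname -s)}" in
        Darwin) echo "/etc/puppet/ssl" ;;
        Linux)  echo "/var/lib/puppet/ssl" ;;
        *)      echo "unknown platform" >&2; return 1 ;;
    esac
}

puppet_ssl_dir Linux    # prints /var/lib/puppet/ssl
```

Checking the printed path before removing anything is a cheap guard against wiping the wrong directory.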

Our Puppet Manifests

Our puppet manifests are organized into a few different parts:

  • Site files
  • Basic includes
  • Packages that make changes

Site Files & Basic Includes

Each Puppet master has its own site file which contains a few things:

  • Variable definitions specific to that master
  • Import statements which load other parts of the manifests
  • Node (slave) definitions

The basic includes are located in the 'base' directory. These files set variables which are referenced in the packages, and define the base nodes for slaves.

The most important variables to take note of are:

  • ${platform_fileroot} -- Used wherever the puppet:// protocol is supported, most notably with the File type.
  • ${platform_httproot} -- Used with the Package type and other places that don't support puppet://

There are also ${local_[file,http]} variables which point to the 'local' directory inside of each platform's root. See the following section for more on that.

We have a few base nodes shared by multiple pools of slaves as well as a base node for each concrete slave type. The shared ones are:

  • "slave" -- For things common to ALL slaves managed by Puppet
  • "build" -- For things common to all build slaves
  • "test" -- For things common to all test slaves

There are two different types of concrete nodes. Firstly, we have "$platform-$arch-$type" nodes, which are used on all Puppet masters for slaves which are local to them. Two examples are "centos5-i686-build" (32-bit, CentOS 5, build slaves) and "darwin10-i386-test" (32-bit, Mac 10.6, test slaves). Secondly, there are "$location-$type-node" nodes, which only apply to the MPT master. All nodes which are not local to MPT production are listed in its configuration file as this type of node. These nodes ensure that new slaves get redirected to their local master when they first come up. Examples include "mv-build-node" and "staging-test-node".

See base/nodes.pp for the full listing of nodes.

Packages

  • The site-{staging,production}.pp files declare the list of slaves, and each slave definition states which classes to include.
  • The classes buildslave.pp and staging-buildslave.pp include most of the packages (devtools, nagios, mercurial, buildbot, extras, etc) we want.
  • The packages can have different sections or "Types" that can be "exec", "user", "package", "file", "service"

Puppet Files

The files that Puppet serves up (using File) are in /N on each puppet master. The MPT masters share this via an NFS mount, so it's easy to sync files from staging to MPT production. The other servers have a local copy of this data.

The first three levels of the drive are laid out as follows:

$level/$os-$hardwaremodel/$slaveType
  • $level is support level (production, staging, pre-production)
  • $os is generally one of 'centos5', 'fedora12', 'darwin9', or 'darwin10'.
  • $hardwaremodel is whatever 'facter' identifies the machine's CPU as (x86_64, i686, i386, etc).
  • $slaveType is the "type" of node the slave is: 'build', 'test', 'stage', 'master', etc.

Below '$slaveType' are all of the files served by Puppet. They are organized according to where they'll end up on the slave. For example, if /etc/X11/fonts.conf is to be synced to the slave, it should live in:

etc/X11/fonts.conf
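
Putting the layout together, the full location of a served file on a master can be computed as below. This is a sketch under stated assumptions: served_path is a hypothetical helper name, and it assumes the tree is rooted at /N as described above.

```shell
# Build the location of a served file under /N on a puppet master.
# served_path is a hypothetical helper; the layout follows
# $level/$os-$hardwaremodel/$slaveType as described above.
served_path() {
    # $1 level, $2 os, $3 hardwaremodel, $4 slaveType, $5 absolute path on the slave
    echo "/N/$1/$2-$3/$4/${5#/}"
}

served_path staging centos5 i686 build /etc/X11/fonts.conf
# prints /N/staging/centos5-i686/build/etc/X11/fonts.conf
```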

There are two special directories for each level/os/hardwaremodel/type combination, too:

  • local -- This directory contains files which should NOT be synced between staging <-> production or between different locations. Files such as the Puppet configs which have different contents depending on location and support level live here.
  • DMGs (Mac) / RPMs (Fedora/CentOS) -- These directories contain platform specific packages which Puppet installs.

Common Use Cases

Updating a password

Passwords are stored in a hashed format alongside other user information. We do not put the hashes in a public location for hopefully obvious reasons - please make sure you don't do this by accident.

Let's say you want to update cltbld's password. First, you need to generate the new hash. You can do that by running the following:

makepasswd --clearfrom=- --crypt-md5
# now type the password and hit ^D a couple times

Now, copy and paste that hash into /etc/puppet/manifests/build/cltbld.pp as the 'password' for the cltbld user. Do this on all active puppet masters.

Installing a Package

After pushing file deployment over NFS to its limit we replaced it with native package formats for software deployment. This switch was made around June, 2010.

RPM (CentOS, Fedora)

We use a combination of 3rd party and in-house RPMs to deploy to our Linux machines. On the manifest side we use the 'rpm' package provider wrapped in a custom type to ensure installation.

To build or upgrade a homegrown RPM, see ReleaseEngineering/How To/Create a new RPM.

Manifests

The manifests are pretty simple once you have an RPM. We use a wrapper type called 'install_rpm' to perform installation. You can use it as follows:

install_rpm {
    "gcc433-4.3.3-0moz1":
        creates => "/tools/gcc-4.3.3/installed/bin/gcc",
        pkgname => "gcc433";
}

The name needs to match the package name + version. Note that RPM requires a 'vendor' version, which is where the 0moz1 comes from. 'creates' needs to be a file that the package creates, preferably the last one to get installed.
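
To double-check that a resource title and its pkgname agree, the title can be split into its parts. The sketch below uses a hypothetical helper, split_rpm_title, and assumes the title has the form name-version-release, where neither the version nor the release contains a '-'.

```shell
# Split an install_rpm title such as "gcc433-4.3.3-0moz1" into its parts.
# split_rpm_title is a hypothetical helper, not part of the manifests.
split_rpm_title() {
    srt_release=${1##*-}          # text after the last '-'  (0moz1)
    srt_rest=${1%-*}              # title without "-release" (gcc433-4.3.3)
    srt_version=${srt_rest##*-}   # text after the last '-'  (4.3.3)
    srt_name=${srt_rest%-*}       # what is left             (gcc433)
    echo "$srt_name $srt_version $srt_release"
}

split_rpm_title gcc433-4.3.3-0moz1
# prints gcc433 4.3.3 0moz1
```

The first field should equal the pkgname given in the manifest.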

DMG+pkg (Mac)

Mac machines use pkg installers wrapped in a DMG file as a package format. On the manifest side, we use the 'pkgdmg' package provider wrapped in a custom type to deploy them. For things which are distributed in a DMG+pkg (such as Xcode) you can skip down to the manifests.

When an upstream DMG file is not available it needs to be created by hand. To do this, we do a manual installation once and then use a script to create the DMG+pkg. Here's an example, which creates a Python 2.5.2 DMG:

# The installation
tar jxvf Python-2.5.2.tar.bz2
cd Python-2.5.2
./configure --prefix=/tools/python-2.5.2
make
make install
cd ..
# DMG creation
hg clone http://hg.mozilla.org/build/puppet-manifests
./puppet-manifests/create-dmg.sh /tools/python-2.5.2 python-2.5.2 python /tools

The first argument to create-dmg.sh is the directory to package, which will include the directory itself. The second argument is the name to use for the DMG/pkg filenames. The third is the string to use in the package identifier; it must be alphanumeric only. The fourth is the directory to install the package to.
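
Since the third argument must be alphanumeric, it can be worth checking it before invoking the script. The following is a sketch only: is_alnum is a hypothetical helper and is not part of create-dmg.sh.

```shell
# Succeed only if the argument is non-empty and purely alphanumeric.
# is_alnum is a hypothetical helper for checking the package identifier.
is_alnum() {
    case "$1" in
        ""|*[!A-Za-z0-9]*) return 1 ;;   # empty, or contains some other character
        *)                 return 0 ;;
    esac
}

# "python" is accepted; something like "python-2.5.2" would be rejected.
if is_alnum python; then
    echo "identifier ok"
fi
```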

On the manifests side of things a simple use of the install_dmg type will ensure a package gets installed:

install_dmg {
    "python-2.5.2.dmg":
        creates => "/tools/python-2.5.2/share",
}

The argument to "creates" should be one of the last files that will be created by the package. Internally, install_dmg checks for this file and marks the package as installed if it exists, skipping installation.

If you intend to use a package on multiple platforms, always test it on each of them before rolling out any manifest changes. When in doubt, create a package on each target platform.

Testing

Before you test on the Puppet server it's good to run the 'test-manifests.sh' script locally. This script will test the syntax of the manifest files and catch very basic issues. It will not catch any issues with run-time code such as Execs.

Testing of updates is done with staging-puppet.build.mozilla.org and staging slaves. You should book staging-puppet as well as any slaves you intend to test on before making any changes to the manifests on the Puppet server. All Puppet server work is done as the root user.

Setting up the server

If you've never used the Puppet server before you'll want to start a clone of the manifests for yourself. You can clone the main manifests repo or your own user repo to a directory under /etc/puppet. Once you have your clone, two edits are necessary:

  • Copy the password hash into your clone's build/cltbld.pp. This can be done with the following command, run from the root of your clone:
hg -R /etc/puppet/manifests.real diff /etc/puppet/manifests.real/build/cltbld.pp | patch -p1

or more easily

patch -p1 < /etc/puppet/password
  • Comment out all of the "node" entries in staging.pp, except for those which you have booked.

If you have a patch to apply to the repository now is the time to do it.

Finally, if your changes involve edits to any files served by Puppet, apply those changes in the appropriate places under /N/staging.

Staging environments do not have the site.pp manifest. When testing in a staging environment, symlink site.pp to staging.pp with the following command:

ln -s staging.pp site.pp

Once all of that is done you can swap your manifests in by adjusting the symlink on /etc/puppet/manifests. If you've added new files or changed staging-fileserver.conf you'll need to restart the Puppetmaster process with:

service puppetmaster restart

Now, you're ready to test.

Testing a slave

Puppet needs to run as root on the slaves, so equip yourself thusly and run the following command:

puppetd --test --server staging-puppet.build.mozilla.org --logdest console --noop

This will pull updated manifests from the server, see what needs to be done, and output that. The --noop argument tells Puppet to not make any changes to the slave. Once you're satisfied with the output of that, you can run it without the --noop to have Puppet make the changes. The output should be coloured, and indicate success/fail/exception.

If you're encountering errors or weird behaviour and the normal output isn't sufficient for debugging you can enhance it with --evaltrace and --debug. Together, they will print out every command that Puppet runs, including things which are used to determine whether a file or package needs updating.
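
The flag combinations described above can be assembled by a small wrapper that prints the command rather than running it. This is a sketch: puppet_test_cmd is a hypothetical helper, and every flag it emits is taken from the text above.

```shell
# Print the puppetd test command for a staging slave.
# puppet_test_cmd is a hypothetical helper; it only echoes the command.
puppet_test_cmd() {
    # $1: "noop" (default) or "apply"; $2: "debug" to add --evaltrace --debug
    ptc_cmd="puppetd --test --server staging-puppet.build.mozilla.org --logdest console"
    if [ "${1:-noop}" = "noop" ]; then
        ptc_cmd="$ptc_cmd --noop"
    fi
    if [ "${2:-}" = "debug" ]; then
        ptc_cmd="$ptc_cmd --evaltrace --debug"
    fi
    echo "$ptc_cmd"
}

puppet_test_cmd noop debug
```

Defaulting to --noop means a forgotten argument cannot change the slave.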

Forcing a package re-install

Especially when testing, you may have to iterate on a single package install to get it right. If you need to re-install an existing package, you'll need to remove the package contents and/or the marker file that flags that package as installed.

  • Linux: packages installed as RPMs should be removed as one normally would for an RPM, i.e. 'rpm -e rpmname', which deletes all of the files and removes the package from the db, or 'rpm -e --justdb rpmname', which leaves all of the files in place and only removes the package from the db.
  • Mac: manually clean up the installed files, and remove the marker file for your package. The marker file lives under /var/db/ and will be named .puppet_pkgdmg_installed_pkgname.dmg.

You can now re-test your package install with the command above, i.e. puppetd --test ....
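
For the Mac case, the marker file name can be derived from the DMG name used in the manifest. This is a sketch: dmg_marker_path is a hypothetical helper, and it assumes the suffix of the marker is exactly the install_dmg title (for example "python-2.5.2.dmg").

```shell
# Print the marker file that flags a DMG package as installed.
# dmg_marker_path is a hypothetical helper; the naming follows the text above.
dmg_marker_path() {
    # $1: the DMG name used as the install_dmg title
    echo "/var/db/.puppet_pkgdmg_installed_$1"
}

dmg_marker_path python-2.5.2.dmg
# prints /var/db/.puppet_pkgdmg_installed_python-2.5.2.dmg
```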

Cleaning up

Once you're finished testing, the manifests symlink needs to be pointed back at the real manifests with:

cd /etc/puppet
rm manifests
ln -s manifests.real manifests
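
Swapping a clone in and swapping manifests.real back are the same operation, so both can go through one helper. The sketch below uses a hypothetical function, switch_manifests; it refuses to touch 'manifests' if it is a real directory rather than a symlink.

```shell
# Repoint the 'manifests' symlink at another directory.
# switch_manifests is a hypothetical helper, not an existing script.
switch_manifests() {
    # $1: directory to point at (e.g. manifests.real); $2: puppet dir, default /etc/puppet
    sm_dir=${2:-/etc/puppet}
    if [ ! -d "$sm_dir/$1" ]; then
        echo "no such directory: $sm_dir/$1" >&2
        return 1
    fi
    # Only replace 'manifests' when it is absent or already a symlink.
    if [ -e "$sm_dir/manifests" ] && [ ! -L "$sm_dir/manifests" ]; then
        echo "$sm_dir/manifests is not a symlink" >&2
        return 1
    fi
    rm -f "$sm_dir/manifests" && ln -s "$1" "$sm_dir/manifests"
}

# Example (as root on the master): switch_manifests manifests.real
```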

Moving file updates to production

Production Puppet Masters:

  • production-puppet.build.mozilla.org (aka mpt-production-puppet.build.mozilla.org)
  • mv-production-puppet.build.mozilla.org
  • scl-production-puppet.build.scl1.mozilla.com

NOTE: there are a lot of files that differ between the various directories, so using rsync involves a lot of whack-a-mole to avoid syncing files that aren't part of your change. It may be easier to simply use 'cp' for this step.

When you're ready to land in production it's important to sync your files from staging to ensure you don't end up with a different result in production. Here's the process to do that. On production-puppet as root, run:

rsync -n --delete -av --include="**usr/local" --exclude=local /N/staging/ /N/production/

After verifying that only the things you want are being synced, run it without -n to push them for real:

rsync --delete -av --include="**usr/local" --exclude=local /N/staging/ /N/production/

If there are things that shouldn't be synced carefully adjust the rsync command with --exclude or more specific paths.
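
When only a handful of files changed, copying them one by one avoids the exclude juggling. This is a sketch: promote_file is a hypothetical helper, and the default roots are the /N/staging and /N/production directories named above.

```shell
# Copy one served file from the staging tree to the production tree.
# promote_file is a hypothetical helper, not an existing script.
promote_file() {
    # $1: path relative to the level root, e.g. centos5-i686/build/etc/X11/fonts.conf
    # $2, $3: source and destination roots
    pf_src=${2:-/N/staging}
    pf_dst=${3:-/N/production}
    if [ ! -f "$pf_src/$1" ]; then
        echo "missing: $pf_src/$1" >&2
        return 1
    fi
    # Create the parent directory if needed, then copy preserving mode and times.
    mkdir -p "$(dirname "$pf_dst/$1")" && cp -p "$pf_src/$1" "$pf_dst/$1"
}

# Example: promote_file centos5-i686/build/etc/X11/fonts.conf
```

Note that this does nothing to stop you copying files from a 'local' directory, which should not be synced.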

Once you've landed into /N/production on production-puppet, the other production puppet masters need to be updated. In theory this is done as the 'filesync' user, but that user does not have permission to update the relevant directories, so in practice it is probably done as root. Here's the example:

sudo su - filesync
rsync -av --exclude='**/local/etc/sysconfig/puppet*' --exclude='**/local/Library/LaunchDaemons/com.reductivelabs.puppet.plist*' --exclude='**/local/home/cltbld/.config/autostart/gnome-terminal.desktop*' --delete filesync@production-puppet.build.mozilla.org:/N/production/ /N/production/

Again, rsync is finicky, so scp may be your friend here:

scp {root@production-puppet.build.mozilla.org:/N/production,/N/production}/darwin9-i386/build/Library/Preferences/com.apple.Bluetooth.plist

When you're ready, update the manifests on the masters with:

hg -R /etc/puppet/manifests pull
hg -R /etc/puppet/manifests update

Note that some changes may require manifest updates first - think carefully about the intermediate state and what it will do to slaves!

Be sure to do this on all Puppet masters.

Moving slaves between staging/production

If you need to move slaves between staging and production, you'll need to delete the existing SSL certs on the slave so it can properly sync with the new puppet master. These certs can be found under /etc/puppet/ssl on Mac or /var/lib/puppet/ssl on Linux.

Documentation/Links

Puppet has reasonably complete documentation, although navigating it can be a challenge.

Troubleshooting

Puppet sometimes gets itself into a weird state that needs manual intervention.

Parsing YAML

Could not parse YAML data for node linux-ix-ref.build.mozilla.org: syntax error on line 72, col -1

Try deleting the corresponding file for that node in /var/lib/puppet/yaml/node/ on the puppet master.
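
To locate the file before deleting it, the cache directory can be searched for entries whose name starts with the node's hostname. This is a sketch: find_node_yaml is a hypothetical helper, and the exact file naming inside the directory is an assumption, so the helper matches on the hostname prefix and only lists candidates.

```shell
# List cached node files whose name starts with the given hostname.
# find_node_yaml is a hypothetical helper; it never deletes anything.
find_node_yaml() {
    # $1: node FQDN; $2: cache directory (default taken from the text above)
    for fny_file in "${2:-/var/lib/puppet/yaml/node}"/*; do
        case "${fny_file##*/}" in
            "$1"*) echo "$fny_file" ;;
        esac
    done
}

# Example: find_node_yaml linux-ix-ref.build.mozilla.org
```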