From MozillaWiki

A sysadmin asked the Architect,

"What's the best way to install a new system?"

The Architect answered,

"Turn it on."

The sysadmin was enlightened.

Documentation - Manifests & Modules

The Puppet manifests themselves are documented here. Any new modules should be added to the proper list below.


All substantial configuration is done with Puppet modules, in the modules directory. Each module should have its own page, listed below, describing both how to use the module and how it works:


These modules are part of the Puppet system itself, and provide support to other modules as needed.


These modules actually get stuff done.


These modules are more generic, and probably useful outside of PuppetAgain.


These are modules taken from elsewhere. When adding, remember to verify license compatibility and ensure proper credit.


Bugs for work on PuppetAgain should be filed in the Infrastructure & Operations - Relops: Puppet Component.

How To

System Description

This section describes how PuppetAgain is built at Mozilla. External implementations may not have all of these bells and whistles. This link contains details oriented toward Mozilla IT and ops folks.

The Goals

  • PuppetAgain should be usable as a whole for folks outside of Mozilla, Inc. who want to build similar systems (see "Organizations" below)
  • Client images should proceed automatically from base image install to a fully-operational state. While refimages may be employed, this is done only as an optimization.
  • We do not keep distinct reference images. Reference images are used only as an optimization to avoid pounding the puppet servers when installing dozens of new hosts. When a new refimage snapshot needs to be made, a fresh machine is rebuilt from scratch, snapshotted, and then returned to service.
  • OS does not imply role. Roles are defined in node declarations, by including toplevel::* classes.
  • Include all necessary dependencies. Debugging dependency errors when building a new reference system is no fun.
  • Documentation (here) is a part of the patch.

See ReleaseEngineering/PuppetAgain/HowTo/Hack on PuppetAgain for more detail.

Organizations


Each distinct instance of PuppetAgain is referred to as an organization, and tagged with a short identifier (e.g., "moco" for the Mozilla releng instance, or "seamonkey" for SeaMonkey). Within an organization, configuration and secrets are shared, and everything runs from the same set of manifests. Configuration and secrets can differ between organizations.


PuppetAgain masters are themselves managed by PuppetAgain. Each organization can have one or more masters, arranged in a cluster (with one cluster per organization). There is one "distinguished master" in the cluster. This master is distinguished only to simplify synchronization: the cluster will continue to operate indefinitely without it, although master-master communication (secrets and CRLs) will not work.

See the following for more details, noting that most of this is not required for an external PuppetAgain implementation.

Puppet Versions

The releng puppet infrastructure will strive to keep up to date with the most recent stable versions released by Puppet Labs.

Base Images and Puppetizing

The base images for this infrastructure are barely-modified OS installs. They have just enough installed that they can connect to a puppet server, get certificates, and puppetize on boot.

Note that, while most of PuppetAgain is intended to be easily replicated, the deployment system is probably not easily replicated, and is best left out of any external implementations.

Custom Facts, Functions, Types, and Providers

Custom code is documented in the page for the module that contains it. Code that doesn't have a more appropriate home is in shared.


Stages need to be defined globally in Puppet manifests, and this is done in manifests/stages.pp. The following stages are available, aside from 'main', the default stage.

  • network - This stage should handle any network-related configuration needed in specific cases (such as AWS).
  • packagesetup - This stage should handle any preliminaries required for package installations, so that subsequent package installations do not need to require them explicitly.
  • users - This stage creates user accounts; while this is normally automatically required, the requirement doesn't work with the temporary 'darwinuser' type.
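
As a rough illustration of what a global stage definition can look like, here is a minimal sketch. The stage names come from the list above, but the ordering between them and the class name example::repos are assumptions made for this example, not taken from the actual manifests/stages.pp.

```puppet
# Sketch only: the ordering below is an assumption for illustration.
stage {
    'network':
        before => Stage['packagesetup'];
    'packagesetup':
        before => Stage['users'];
    'users':
        before => Stage['main'];
}

# A class is placed in a stage with the 'stage' metaparameter.
# 'example::repos' is a hypothetical class name.
class { 'example::repos':
    stage => 'packagesetup',
}
```

Putting a class in an earlier stage means that resources in 'main' do not each need an explicit require on it.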


manifests/nodes.pp defines all of the nodes the puppet masters recognize. Note that all nodes are defined for all masters. This file is a symlink to $org-nodes.pp, e.g., moco-nodes.pp. With this arrangement, each organization can make node changes without any risk to other organizations.

In anticipation of using an external node classifier (ENC), node definitions should only include classes - do not define any resources within nodes. In general, the included classes should be in the toplevel module.

Host-specific values are specified as node-scope variables, as these are easier to represent in an ENC. Such variables (including some Puppet gotchas) are described in node-scope variables.

Node definitions also specify a host's aspects, e.g., $aspects = [ 'staging' ].
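
A node definition following these rules might look like the sketch below. The hostname and the class name toplevel::example_role are hypothetical; only the $aspects variable and the use of toplevel classes come from the description above.

```puppet
# Hypothetical entry in an organization's $org-nodes.pp file.
node 'host1.example.com' {
    # Host-specific values are set as node-scope variables.
    $aspects = [ 'staging' ]

    # The role comes from a class in the toplevel module.
    # No resources are declared directly in the node.
    include toplevel::example_role
}
```

Keeping nodes down to variables and class includes is what makes a later move to an ENC straightforward, since an ENC can express only classes and parameters.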


Per-organization configuration is read from manifests/config.pp, which is a symlink to $org-config.pp, just as nodes.pp is a symlink to $org-nodes.pp. The config.pp file defines a "config" class that inherits from "config::base". It is free to express the configuration using any mechanism available to Puppet. For some organizations, simple Puppet literals will do, while more complex organizations will want to generate their configuration automatically. See config for more.
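
For an organization that needs only simple literals, the config class could be sketched as follows. The variable names and values are hypothetical and are not the actual settings defined by config::base.

```puppet
# Hypothetical $org-config.pp for a small organization.
class config inherits config::base {
    # Illustrative settings only; see the config page for the real ones.
    $org = 'exampleorg'
    $example_admin_email = 'admin@example.com'
}
```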

Secrets and External Data

See ReleaseEngineering/PuppetAgain/Secrets and ReleaseEngineering/PuppetAgain/Extsync.


Puppet deals with a lot of big files - packages, mostly. We don't want these in hg! They are instead managed as data. This means several big file trees available at http://repos/$treename and, from puppet, at puppet://$treename. See ReleaseEngineering/PuppetAgain/Data for details on what's available, how it is implemented, and some how-tos.

This data is available outside of Mozilla via HTTP and rsync at and rsync://


See ReleaseEngineering/PuppetAgain/Packages for information about proper handling of packages in PuppetAgain.


Taking a page from Aspect Oriented Programming, PuppetAgain implements Aspect Oriented Puppet. Aspects cross-cut the concerns represented by the toplevel hierarchy. Examples include whether a host is a staging host or whether it is loaned out. See ReleaseEngineering/PuppetAgain/Aspects for details.

Source Code

The manifests are at


Releng once used a puppet infrastructure based on Puppet-0.24.8, and manifests at This had a few weaknesses:

  • lots of assumptions and fragile dependencies based on bugs in 0.24.8
  • very few modules - mostly manifest files, organized per slave type, rather than per service/purpose
  • many references to external files which are not as available as the repo itself
  • puppet manifests assume some manual ref-image steps; external exact reproduction is extremely difficult

Dustin started work on a new puppet deployment - chronicled at User:Djmitche/New Releng Puppet Infrastructure. That's this puppet.

Training notes