A sysadmin asked the Architect,

"What's the best way to install a new system?"

The Architect answered,

"Turn it on."

The sysadmin was enlightened.

Documentation - Manifests & Modules

The Puppet manifests themselves are documented here. Any new modules should be added to the proper list below.

Modules

All substantial configuration is done with Puppet modules, in the modules directory. Each module should have its own page, listed below, describing both how to use the module and how it works.

Infrastructure

These modules are part of the Puppet system itself, and provide support to other modules as needed.

ReleaseEngineering/PuppetAgain/Modules/config - global configuration
ReleaseEngineering/PuppetAgain/Modules/dirs - Common directories
ReleaseEngineering/PuppetAgain/Modules/packages - install packages generically
ReleaseEngineering/PuppetAgain/Modules/puppet - install, upgrade, and run puppet (including custom facts, etc.)
ReleaseEngineering/PuppetAgain/Modules/toplevel - top-level classes for node types, included from node definitions
ReleaseEngineering/PuppetAgain/Modules/shared - shared calculated values, functions, facts, etc.
ReleaseEngineering/PuppetAgain/Modules/users - user account management
ReleaseEngineering/PuppetAgain/Modules/puppetmaster - install, upgrade and run puppet master
ReleaseEngineering/PuppetAgain/Modules/security - host security levels

Action

These modules actually get stuff done.

ReleaseEngineering/PuppetAgain/Modules/androidemulator - install and configure Android emulators
ReleaseEngineering/PuppetAgain/Modules/auditd - install and configure auditd
ReleaseEngineering/PuppetAgain/Modules/aws - manage instance storage
ReleaseEngineering/PuppetAgain/Modules/aws_manager - install and manage AWS related management scripts
ReleaseEngineering/PuppetAgain/Modules/b2g_bumper - install and configure the b2g_bumper service
ReleaseEngineering/PuppetAgain/Modules/bmm - configure all the components of a Mozpool imaging server
ReleaseEngineering/PuppetAgain/Modules/bors - bors installation
ReleaseEngineering/PuppetAgain/Modules/bouncer_check - create a python virtualenv and install and configure the check_bouncer nagios check
ReleaseEngineering/PuppetAgain/Modules/buildslave - buildslave (buildbot) installation and startup
ReleaseEngineering/PuppetAgain/Modules/buildmaster - buildmaster (buildbot) installation and startup
ReleaseEngineering/PuppetAgain/Modules/ccache - ccache directory management
ReleaseEngineering/PuppetAgain/Modules/clean - cleanup tasks
ReleaseEngineering/PuppetAgain/Modules/cleanslate - install cleanslate into a python virtualenv
ReleaseEngineering/PuppetAgain/Modules/collectd - configure collectd
ReleaseEngineering/PuppetAgain/Modules/cron - install and start the cron daemon
ReleaseEngineering/PuppetAgain/Modules/disableservices - disable unneeded services
ReleaseEngineering/PuppetAgain/Modules/dnsmasq - install and start dnsmasq
ReleaseEngineering/PuppetAgain/Modules/firewall - IPTables Firewall for Linux
ReleaseEngineering/PuppetAgain/Modules/foopy - build foopies
ReleaseEngineering/PuppetAgain/Modules/fw - wrapper module for host firewall configuration
ReleaseEngineering/PuppetAgain/Modules/gaia_bumper - bump gaia (nicely, of course)
ReleaseEngineering/PuppetAgain/Modules/ganglia - configure ganglia
ReleaseEngineering/PuppetAgain/Modules/generic_worker - install and configure generic_worker
ReleaseEngineering/PuppetAgain/Modules/git - exec to clone specified git repos
ReleaseEngineering/PuppetAgain/Modules/grub - configure grub for linux hosts
ReleaseEngineering/PuppetAgain/Modules/gui - configure a GUI environment
ReleaseEngineering/PuppetAgain/Modules/hardware - hardware-specific stuff
ReleaseEngineering/PuppetAgain/Modules/httpd - install and configure httpd server
ReleaseEngineering/PuppetAgain/Modules/instance_metadata - obtain instance metadata on AWS hosts and dump it into a file
ReleaseEngineering/PuppetAgain/Modules/jacuzzi_metadata - obtain jacuzzi metadata on AWS hosts and dump it into a file
ReleaseEngineering/PuppetAgain/Modules/log_aggregator - configure centralized logging
ReleaseEngineering/PuppetAgain/Modules/mercurial - manage hg repositories
ReleaseEngineering/PuppetAgain/Modules/mig - install and configure mig_agent
ReleaseEngineering/PuppetAgain/Modules/mockbuild - manage mock build environments
ReleaseEngineering/PuppetAgain/Modules/mozpool - configure all the components of a Mozpool server
ReleaseEngineering/PuppetAgain/Modules/needs_reboot - handle reasons that a system might need to be rebooted
ReleaseEngineering/PuppetAgain/Modules/network - configure host networking parameters
ReleaseEngineering/PuppetAgain/Modules/nginx - install nginx
ReleaseEngineering/PuppetAgain/Modules/nrpe - NRPE support
ReleaseEngineering/PuppetAgain/Modules/ntp - NTP support
ReleaseEngineering/PuppetAgain/Modules/pf - PacketFilter (Firewall) for OSX
ReleaseEngineering/PuppetAgain/Modules/pkgbuilder - set up a host to build OS packages
ReleaseEngineering/PuppetAgain/Modules/powermanagement - configure power management
ReleaseEngineering/PuppetAgain/Modules/powershell -
ReleaseEngineering/PuppetAgain/Modules/proxxy - install and configure nginx to act as a reverse proxy
ReleaseEngineering/PuppetAgain/Modules/rdp - enable windows RDP
ReleaseEngineering/PuppetAgain/Modules/releaserunner - install release runner
ReleaseEngineering/PuppetAgain/Modules/rsyslog - rsyslog configuration
ReleaseEngineering/PuppetAgain/Modules/runner - install runner and manage pre-flight tasks
ReleaseEngineering/PuppetAgain/Modules/screenresolution - set GUI screen resolution
ReleaseEngineering/PuppetAgain/Modules/selfserve_agent - install the BuildAPI self-serve agent
ReleaseEngineering/PuppetAgain/Modules/shipit_notifier - install and configure shipit_notifier in a python virtualenv
ReleaseEngineering/PuppetAgain/Modules/signingserver - configure a signing server instance
ReleaseEngineering/PuppetAgain/Modules/signingworker - configure a signing worker instance
ReleaseEngineering/PuppetAgain/Modules/slaveapi - configure a slaveapi server instance
ReleaseEngineering/PuppetAgain/Modules/slaverebooter - install and configure slaverebooter
ReleaseEngineering/PuppetAgain/Modules/slave_secrets - add secrets to slaves
ReleaseEngineering/PuppetAgain/Modules/smarthost - configure a mail relay
ReleaseEngineering/PuppetAgain/Modules/ssh - manage ssh configuration (server, global, and user)
ReleaseEngineering/PuppetAgain/Modules/talos - talos slave specific settings
ReleaseEngineering/PuppetAgain/Modules/tftpd - tftpd (and xinetd) configuration
ReleaseEngineering/PuppetAgain/Modules/timezone - set the system timezone
ReleaseEngineering/PuppetAgain/Modules/tweaks - small, one-off classes (aka "miscellaneous")
ReleaseEngineering/PuppetAgain/Modules/vnc - configure the VNC server
ReleaseEngineering/PuppetAgain/Modules/web_proxy - configure the system to use a proxy to access the web

Utility

These modules are more generic, and probably useful outside of PuppetAgain.

ReleaseEngineering/PuppetAgain/Modules/kernelmodule - install a Linux kernel module
ReleaseEngineering/PuppetAgain/Modules/motd - edit the contents of /etc/motd
ReleaseEngineering/PuppetAgain/Modules/osxutils - utilities for reading and writing configuration using defaults and system setup on Mac OS X
ReleaseEngineering/PuppetAgain/Modules/python - support for installing python virtualenvs and packages
ReleaseEngineering/PuppetAgain/Modules/shellprofile - add items to users' default shell profile
ReleaseEngineering/PuppetAgain/Modules/sudoers - manage sudo permissions
ReleaseEngineering/PuppetAgain/Modules/supervisord - supervise processes

Third-Party

These are modules taken from elsewhere. When adding, remember to verify license compatibility and ensure proper credit.

assert - from https://github.com/binford2k/puppet-assert
sysctl - from https://github.com/duritong/puppet-sysctl
concat - from https://github.com/ripienaar/puppet-concat (modified to not use a fact, although this should probably be reverted)
firewall - from https://github.com/puppetlabs/puppetlabs-firewall/
stdlib - from https://github.com/puppetlabs/puppetlabs-stdlib/
vmwaretools - from https://github.com/craigwatson/puppet-vmwaretools
Windows Firewall - from https://forge.puppetlabs.com/liamjbennett/windows_firewall
Windows Registry - from https://forge.puppetlabs.com/puppetlabs/registry

Bugs

Bugs for work on PuppetAgain should be filed in the "Infrastructure & Operations - Relops: Puppet" component.

How To

System Description

This section describes how PuppetAgain is built at Mozilla. External implementations may not have all of these bells and whistles. This link contains details oriented toward Mozilla IT and ops folks.

The Goals

PuppetAgain should be usable as a whole for folks outside of Mozilla, Inc. who want to build similar systems (see "Organizations" below).
Client images should proceed automatically from base image install to a fully operational state. While refimages may be employed, this is done only as an optimization.
We do not keep distinct reference images. Reference images are used only as an optimization to avoid pounding the puppet servers when installing dozens of new hosts. When a new refimage snapshot needs to be made, a fresh machine is rebuilt from scratch, snapshotted, and then returned to service.
OS does not imply role. Roles are defined in node declarations, by including toplevel::* classes.
Include all necessary dependencies. Debugging dependency errors when building a new reference system is no fun.
Documentation (here) is a part of the patch.

See ReleaseEngineering/PuppetAgain/HowTo/Hack on PuppetAgain for more detail.

Organizations

Each distinct instance of PuppetAgain is referred to as an organization, and tagged with a short identifier (e.g., "moco" for the Mozilla releng instance, or "seamonkey" for SeaMonkey). Within an organization, configuration and secrets are shared, and everything runs from the same set of manifests. Configuration and secrets can differ between organizations.

Puppetmasters

PuppetAgain masters are managed by PuppetAgain. Each organization can have one or more masters, arranged in a cluster (with one cluster per organization). There is one "distinguished master" in the cluster. This master is distinguished only for purposes of simplifying synchronization: the cluster will continue to operate indefinitely without it, although master-master communication (secrets and CRLs) will not work.

See the following for more details, noting that most of this is not required for an external PuppetAgain implementation.

Puppet Versions

The releng puppet infrastructure will strive to keep up to date with the most recent stable versions released by Puppet Labs.

Base Images and Puppetizing

The base images for this infrastructure are barely-modified OS installs. They have just enough installed that they can connect to a puppet server, get certificates, and puppetize on boot.

Note that, while most of PuppetAgain is intended to be easily replicated, the deployment system is probably not easily replicated, and is best left out of any external implementations.

Custom Facts, Functions, Types, and Providers

Custom code is documented in the page for the module that contains it. Code that doesn't have a more appropriate home is in shared.

Stages

Stages need to be defined globally in Puppet manifests, and this is done in manifests/stages.pp. The following stages are available, aside from 'main', the default stage.

network - This stage should handle any network-related configuration needed in specific cases (such as AWS).
packagesetup - This stage should handle any preliminaries required for package installations, so that subsequent package installations do not need to require them explicitly.
users - This stage creates user accounts; while this is normally automatically required, the requirement doesn't work with the temporary 'darwinuser' type.
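
As a rough illustration, the stage declarations might look like the sketch below. The relative ordering of the stages and the class name example::repo_setup are assumptions made for illustration; the actual contents of manifests/stages.pp may differ.

```puppet
# Sketch of manifests/stages.pp. The ordering is an assumption:
# each stage is declared to run before the next, ending at 'main'.
stage { 'network':
    before => Stage['packagesetup'],
}
stage { 'packagesetup':
    before => Stage['users'],
}
stage { 'users':
    before => Stage['main'],
}

# Hypothetical class assigned to a stage via the 'stage' metaparameter.
class { 'example::repo_setup':
    stage => 'packagesetup',
}
```

Because everything in 'packagesetup' finishes before 'main' begins, package resources in the default stage would not need an explicit require on the setup class.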

Nodes

manifests/nodes.pp defines all of the nodes the puppet masters recognize. Note that all nodes are defined for all masters. This file is a symlink to $org-nodes.pp, e.g., moco-nodes.pp. With this arrangement, each organization can make node changes without any risk to other organizations.

In anticipation of using an external node classifier (ENC), node definitions should only include classes - do not define any resources within nodes. In general, the included classes should be in the toplevel module.

Host-specific values are specified as node-scope variables, as these are easier to represent in an ENC. Such variables (including some Puppet gotchas) are described in node-scope variables.

Node definitions also specify a host's aspects, e.g., $aspects = [ 'staging' ].
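
Putting these conventions together, a node definition might look roughly like the following sketch. The hostname, the $example_owner variable, and the toplevel::example_role class are hypothetical names chosen for illustration, not entries from the real nodes file.

```puppet
# Sketch of an entry in manifests/moco-nodes.pp.
# Hostname, $example_owner, and the class name are hypothetical.
node 'build-host-01.example.com' {
    # Host-specific values are plain node-scope variables.
    $aspects = [ 'staging' ]
    $example_owner = 'releng'

    # Only include classes here; declare no resources in the node.
    include toplevel::example_role
}
```

Keeping the node body to variables and class includes means the same information could later be expressed as ENC output (a list of classes plus parameters) without restructuring.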

Configuration

Per-organization configuration is read from manifests/config.pp, which is a symlink to $org-config.pp, similar to that for nodes. The config.pp file defines a "config" class that inherits from "config::base". It is free to express the configuration using any mechanism available to puppet. For some organizations, simple puppet literals will do, while more complex organizations will want to perform some more sophisticated automatic generation of configuration. See config for more.
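
For a small organization, the simple-literal approach might look like the sketch below. The file name exampleorg-config.pp and every variable name and value are hypothetical; consult the config module page for the variables that config::base actually defines.

```puppet
# Sketch of manifests/exampleorg-config.pp (symlinked as manifests/config.pp).
# All variable names and values below are hypothetical.
class config inherits config::base {
    $example_ntp_servers = [ 'ntp1.example.com', 'ntp2.example.com' ]
    $example_admin_email = 'admin@example.com'
}
```

Under this sketch, a module that includes the config class could read a value with a qualified variable reference such as $config::example_admin_email.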

Secrets and External Data

See ReleaseEngineering/PuppetAgain/Secrets and ReleaseEngineering/PuppetAgain/Extsync.

Data

Puppet deals with a lot of big files - packages, mostly. We don't want these in hg! They are instead managed as data. This means several big file trees available at http://repos/$treename and, from puppet, at puppet://$treename. See ReleaseEngineering/PuppetAgain/Data for details on what's available, how it is implemented, and some how-tos.

This data is available outside of Mozilla via HTTP and rsync at http://puppetagain.pub.build.mozilla.org/data and rsync://puppetagain.pub.build.mozilla.org/data.

Packages

See ReleaseEngineering/PuppetAgain/Packages for information about proper handling of packages in PuppetAgain.

Aspects

Taking a page from Aspect Oriented Programming, PuppetAgain implements Aspect Oriented Puppet. Aspects cross-cut the concerns represented by the toplevel hierarchy. Examples include whether a host is a staging host or whether it is loaned out. See ReleaseEngineering/PuppetAgain/Aspects for details.

Source Code

The manifests are at https://github.com/mozilla/build-puppet.

History

Releng once used a Puppet infrastructure based on Puppet 0.24.8, with manifests at http://hg.mozilla.org/build/puppet-manifests/. This had a few weaknesses:

lots of assumptions and fragile dependencies based on bugs in 0.24.8
very few modules - mostly manifest files, organized per slave type, rather than per service/purpose
many references to external files which are not as available as the repo itself
puppet manifests assume some manual ref-image steps; external exact reproduction is extremely difficult

Dustin started work on a new puppet deployment - chronicled at User:Djmitche/New Releng Puppet Infrastructure. That's this puppet.

Training notes

Puppet Fundamentals: https://public.etherpad-mozilla.org/p/puppet-training-nov-2016