ReleaseEngineering/PuppetAgain/HowTo/Hack on PuppetAgain

From MozillaWiki
Jump to: navigation, search

Hacking

PuppetAgain is (or, at least, will be) a code module, and as such people working on PuppetAgain follow the usual Mozilla process for making changes:

  • hack
  • post a patch, request review from a module owner or peer
  • repeat until r+ (and sr+ if necessary)
  • commit
  • monitor and handle any problems as a result of the commit

The last point is particularly important, since puppet masters automatically update to the latest commit, and start deploying that to hosts.

Exceptions to Review Requirement

  • a=nodechange - self-approval if you're only changing node definitions, and only for nodes you're responsible for
  • a=org-only - self-approval if you're changing $org-config.pp or $org-nodes.pp and your org does not require review
  • a=versionbump - self-approval if you're only changing the version of a package and have considered the implications for all orgs of the change
  • a=userdir - self-approval if you're changing your *own* user directory files with the explicit rule of excluding large files
  • r=bustage - self-approval if you're backing out a broken update, in order to restore service.

Coding guidelines

Classes

Class name should not contain "-".

Spacing, indentation, and whitespace

Module manifests should follow best practices for spacing, indentation, and whitespace.

Manifests:

  • Must use two-space soft tabs,
  • Must not use literal tab characters,
  • Must not contain trailing whitespace,
  • Must include trailing commas after all resource attributes and parameter definitions,
  • Must end the last line with a new line,
  • Must use one space between the resource type and opening brace, one space between the opening brace and the title, and no spaces between the title and colon.

Arrays and hashes

To increase readability of arrays and hashes, it is almost always beneficial to break up the elements on separate lines. Use a single line only if that results in overall better readability of the construct where it appears, such as when it is very short. When breaking arrays and hashes, they should have:

  • Each element on its on line,
  • Each new element line indented one level,
  • First and last lines used only for the syntax of that data type.

Documentation comments

Documentation comments for Puppet Strings should be included for each of your classes, defined types, functions, and resource types and providers. See the documentation section of this guide for complete documentation recommendations

Arrow alignment

Hash rockets (=>) in a resource’s attribute/value list may be aligned. The hash rockets should be placed one space ahead of the longest attribute name. Nested blocks must be indented by two spaces, and hash rockets within a nested block may be aligned (one space ahead of the longest attribute name). Each attribute should be placed on a separate line, with the exception that very short or single purpose resource declarations may be declared on a single line.

Attribute ordering

If a resource declaration includes an ensure attribute, it should be the first attribute specified so that a user can quickly see if the resource is being created or deleted.

File modes

  • POSIX numeric notation must be represented as 4 digits.
  • POSIX symbolic notation must be a string.
  • You should not use file mode with Windows; instead use the acl module.
  • You should use numeric notation whenever possible.
  • The file mode attribute should always be a quoted string, never an integer.

Resource attribute indentation and alignment

Resource attributes must be uniformly indented in two spaces from the title.

Display order of parameters

In parameterized class and defined type declarations, required parameters must be listed before optional parameters (that is, parameters with defaults). Required parameters are parameters which are not set to anything, including undef. For example, parameters such as passwords or IP addresses might not have reasonable default values.

Note that treating a parameter like a namevar and defaulting it to $title or $name does not make it a required parameter. It should still be listed following the order recommended here.

Parameter indentation and alignment

Parameters to classes or defined types must be uniformly indented in two spaces from the title. The equals sign should be aligned.

Variable format

When defining variables you must only use numbers, lowercase letters, and underscores. Do not use uppercased letters within a word, such as “CamelCase”, as it introduces inconsistency in style. You must not use dashes, as they are not syntactically valid.

Defaults for case statements and selectors

Case statements must have default cases. If you want the default case to be “do nothing,” you must include it as an explicit default: {} for clarity’s sake. Case and selector values must be quoted. Selectors should omit default selections only if you explicitly want catalog compilation to fail when no value matches.

Documenting Puppet code

Use Puppet Strings code comments to document your Puppet classes, defined types, functions, and resource types and providers. Strings processes the README and comments from your code into HTML or JSON format documentation. This allows you and your users to generate detailed documentation for your module.

Include comments for each element (classes, functions, defined types, parameters, and so on) in your module. If used, comments must precede the code for that element. Comments should contain the following information, arranged in this order:

  • A description giving an overview of what the element does.
  • Any additional information about valid values that is not clear from the data type. (For example, if the data type is [String], but the value must specifically be a path.)
  • The default value, if any for that element.

Multiline descriptions must be uniformly indented by at least one space.

For more coding guidelines, please visit puppet style guide page

Test your code on the local machine

Before committing your changes you should check your code with puppet-lint. A lightweight Docker image has been created to make this easier, as installing puppet-lint natively requires a number of dependencies.

 docker pull mozillarelops/puppet-linter
  • Run it, inserting the path to your checkout of the puppet repo:
 docker run --rm --name puppet-linter -v <path_to_puppet>:/puppet mozillarelops/puppet-linter

Patch Guidelines / Review Checklist

Be Generic

PuppetAgain should be usable externally. Do not hard-code things that external users may want to adjust, and extract them from CSV so they can be overridden easily. That includes:

  • SSH authorized_keys (note that known_hosts don't hurt)
  • usernames (cltbld isn't universal)
  • hostnames

Changes Only

Do not write resources that will be instantiated on every puppet run. This makes it difficult to tell if a particular puppet run has actually done anything, and will result in misleading data in the puppet dashboard. Note that the "unless" or "onlyif" parameters to Exec, while they still run an external command (and thus aren't especially efficient), can effectively prevent a command from running.

Be Modular

Keep code where it belongs:

  • Node declarations should only set variables and include toplevel classes (this is looking forward to using an ENC).
  • Toplevel classes should only include other classes, although parameterized classes are OK. Don't do anything substantial directly in a toplevel class -- make a new module and include it.
  • Package classes should only install packages. These classes should be extremely formulaic, and readers should not need to look at them to figure out what they do. Conversely, nothing else should install packages.
  • Include requirements near where they are needed. It never hurts to include packages::foo from several places, if foo is used in several places.
  • Many modules have a few variables that are needed throughout the module. In this case, make a modulename::settings class that contains only these variables, then include it in all of the classes where those variables are needed.
  • In general, more, smaller, more easily-digested classes are better than single, large classes.

DRY

(Don't Repeat Yourself) This applies in a few cases:

  • If your action module has a long list of repetitive resources (file, exec, etc.), it can probably be refactored nicely into a utility class or define. This applies even for long lists of file resources. If that utility class is not purpose-specific, put it in its own module so it can be shared.
  • When customizing a configuration file for different purposes, it's tempting to just use several files and source => "puppet:///modules/${module_name}/myfile.${purpose}". In many cases those files share common code, which will be difficult to keep in sync as it evolves. In this case, prefer to use a template or concatenation to construct the configuration file with the common parts only specified once.

Use the Toplevel Naming Structure

The toplevel naming hierarchy and inheritance hierarchy should be identical. That is, toplevel::foo::bar::baz should be a subclass of toplevel::foo::bar, which is a subclass of toplevel::foo, which is a subclass of toplevel.

Common Directories

We tend to use the same directories for multiple projects. That's what the dirs module is for! Creating extra dirs classes is easy and never hurts. Remember that the class names in this module *exactly* mirror the directory names.

Packages

Remember that we're running puppet on multiple operating systems. If a new package is only required on one operating system, include a case $operatingsystem and call fail() in the default branch, so that it will fail loudly if the class is applied on another operating system. If a utility is installed by default on some operating systems, include a clause in the case with a comment indicating that.

DMGs

Every DMG's provenance should be documented somehow in the puppet hg repository. For publicly available DMGs, this can be described in the $package.pp file. Otherwise, add a $package-dmg.sh script that builds the DMG from its prerequisites. See ReleaseEngineering/PuppetAgain/HowTo/Build_DMGs.

DMGs that are hand-built but not customized in any way should be installed by packages::xxx, rather than packages::mozilla::xxx. Include the xxx-dmg.sh script in the modules/packages/manifests directory, alongside the manifest file.

Custom Packages

If you create a custom package, it should be installed with a class named python::mozilla::$package, so it's clear that it's customized. The package should be documented on the packages module page.

For RPMs, the spec file should be checked into https://github.com/mozilla/build-puppet in modules/packages/manifests/mozilla, with the same base name as the .pp file. When you land the changes, the resulting RPM and SRPM should both be added to the appropriate repository. The contents (but maybe not filename) of the spec file embedded in the SRPM should match the checked-in spec file exactly. This system allows (a) review of spec changes and (b) easy rebuilding of custom RPMS. See ReleaseEngineering/PuppetAgain/HowTo/Modify a Custom RPM for hints on how to build RPMs.

Strong Dependencies

Wherever a dependency might exist, specify it. Dependency errors are difficult to spot when you're incrementally building a system as you develop your patch. Unspecified dependencies can even work in production -- for a while, until a new and unrelated resource shakes up the nondeterministic ordering and suddenly things don't work.

Graphs

You can get puppet to give you a dependency graph in .dot format! See http://bitfieldconsulting.com/puppet-dependency-graphs - although the results are enormous and due to some puppet bugs not that helpful.

 yum install graphviz
 dot -Tpng /var/lib/puppet/state/graphs/expanded_relationships.dot > relationships.png

Note that in current versions of puppet, these are fairly unusable, as they show all relationships as bidirectional.

Class Dependencies

In general, try to use require => Class['some::class'] rather than requiring a basic resource that you know the class defines (especially avoid require => Package['some-package']). This way, you are not depending on an implementation detail of that other class. This also creates many more dependencies (on all of the concrete resources in that class), which keeps things predictable.

Unfortunately, these dependencies do *not* extend to included classes. See ReleaseEngineering/PuppetAgain/Anchoring Classes for the workaround. Every class that can be required by another class (which are explicitly labeled as such in the wiki) should be anchored.

Tips

Keep in mind that puppet automatically requires parent directories, so File['/builds/slave'] automatically requires File['/builds'] if it exists. Also, note that in many cases packages will write default configuration files when they are installed, so you should make sure that the File resource installing the custom configuration requires the package class.

Document

Adjust/amend the documentation on the wiki while landing the change after review.

No Bad Habits

  • Even if all of the FooAPI servers are running OpenSolaris, $::operatingsystem=="Solaris" does not mean this is a FooAPI server. OS does not imply role. Roles are defined in node declarations, by including toplevel::* classes.
  • It can be tempting to put "just one resource" into an already-existing class. User::builder seems to be a popular target. Adding new classes is easy, and avoids surprises. If the resource you're adding isn't obviously tied to the name of the class you're adding it to, put it in another (probably new) class.

Config Files

  • For file-resources where you can easily implement a run-parts format design, do so, rather than modify the global config.
    • Examples crontab vs cron.d/*, http.d.
  • You should specify the parent dir to purge and recursive, so that when you remove a resource from puppet it removes itself from the machine.
  • The crontab resource has complex gotchyas - *always* use /etc/cron.d/* files for cron.
  • When installing files that might be executed immediately, e.g., crontabs or application scripts, use require to ensure that all prerequisites are installed first, to avoid unnecessary failures.

Revertible

Where it's not too difficult, try to write classes so that *removing* a resource will cause a corresponding removal on clients. For example, when sudoers::custom { foo: command => "foo" } is removed from a client's catalog, the 'foo' command can no longer be run from sudo. To do otherwise could cause nasty security problems!

Syntax

Puppet gets a little unpredictable around what it does and does not allow in names. To be safe, stick with [a-zA-Z_][a-zA-Z0-9_] for class and variable names, just like C identifiers. In particular, do not use '-' in class or variable names (the latter won't work; the former will almost work but is illegal).

Nit-Worthy Code Style

While everyone will have different opinions on the best Coding Style, one thing we can all agree on, life is better when we all agree. So the following are Nit's we should all strive to adhere to, but exceptions can be made with good reasoning. (mistakes can sneak in, if so, just edit them out next time you touch that area of code)

Whitespace and Tabs

  • TABS: NEVER in a puppet manifest file. ONLY when necessary in other files.
  • Trailing Whitespace: Shouldn't exist.
  • EOL: Always Unix style line-endings.
  • EOF: Always include a newline at EOF
  • Indentation: Four Space indents.

Comments

  • We prefer the # style comments throughout, C-Style is frowned upon.
  • Manifests that are not very obvious by their name should have a very brief comment at the top of the file describing it, these comments are informative, normative reference is the wiki docs.
  • When it helps understanding of why something is in the manifest (or not in the manifest) please include brief comment saying so.

Tips

In a template, if you have a variable that may be a single value or an array, and you need it to be an array, use

 Array(@var)

to convert it to an array. If the array may be created from other arrays in the puppet DSL (which does not support concatenation of arrays), use

 Array(@var).flatten

Landing Changes

Before you land your changes, think carefully about the ramifications. There are some common gotchas with Puppet:

Changing Puppet Config

If your change requires changes to puppet.conf or related files, be sure it applies fresh! As an example, landing a change that requires a plugin, and setting pluginsync = true at the same time will likely not work -- the puppet.conf change will not be made if the catalog compilation fails due to the missing plugin. This particular case has been addressed (pluginsync is already enabled), but be alert to similar situations.

Adding Packages

When adding packages, be aware that the order in which the packages and your updated manifests are applied is difficult to ensure.

Most packages are installed with ensure => latest, which will upgrade the packages on a client during the next puppet run *iff* the packages are synchronized to the relevant puppetmaster *and* the repository cache on that client is cleared (see ReleaseEngineering/PuppetAgain/Modules/packages). Without this, it's quite likely that puppet agent will run successfully without upgrading the packages. On the other hand, the ensure => latest can also cause upgrades to occur before the corresponding manifest changes are committed to hg.

There are a few ways to work around this. One is to -- at least temporarily -- specify an exact version of the package in the manifests. Then puppet runs will fail in any corner cases, preventing further damage.