Big Bubbles (no troubles)

What sucks, who sucks and you suck

Node Inheritance in Puppet

Inheritance - specifically node inheritance - in Puppet seems like a great time saver and an ideal way to simplify your configurations. Until you encounter The Node That Must Be Slightly Different.

Once upon a time, before the language evolved and Puppet Labs were able to start advising you not to do it, it was a common Puppet pattern to define a ‘dummy’ basenode containing your common, global settings and then inherit that throughout one or more further levels of node definitions. This makes sense, because you don’t want to define the same group of settings and classes for each one of ten, fifty, a hundred or eleventy-lots nodes, and then have to update that group for them all again when something changes. The problem is, once you’ve inherited something from a ‘higher’ node, it’s set in stone. And then, inevitably, particularly if you deal with enterprise networks rather than nice, clean, homogeneous web platforms, you will need to add the one node that must inherit all the common settings but tweak one of them just slightly.

For example, we include our ‘ssh’ class in the basenode, because it’s a core protocol, it’s pervasive and we never conceived of a time, barring the heat-death of the universe, when we would not want to run SSH identically on every single node ever, regardless of OS or role. (The ssh class thus rolls out the same secured SSH configuration across the board and ensures that the SSH daemon is kept running.) This worked fine and we didn’t need to give it another thought - until one day we gained a new customer who had their own custom SFTP application that had to bind to port 22, meaning that we needed to run our SSH daemon on another port. Problem: we have to inherit the basenode to gain all the common settings for name servers, SNMP managers, etc. But doing that also means pulling in the standard SSH configuration that runs on port 22. And because we’ve inherited it, we can’t override it later for the node in question. This doesn’t work:

node basenode {
  include ssh
}
node snowflake inherits basenode {
  $ssh_port = '20220'  # not visible to the ssh class, which was declared in basenode's scope
}

There are two possible workarounds if you stick with inheritance in cases like this:

  1. Use an external facts plugin, and define a local fact in a file on the node to specify the overriding value. Modify your class to use the fact value if it exists or a default. You can even write a Puppet class to create the fact file, although your policy then requires two runs to set everything up correctly (the first to create the fact, the second to read it and use it).

  2. Query an external lookup for the value, or use a default. For example, with extlookup:
    $ssh_port = extlookup('ssh_port', 22)
    (And then use $ssh_port in your sshd_config template or wherever.)
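
As a rough sketch of the first workaround, the class can test for a fact and fall back to a default. The fact name (ssh_port_override), the template path and the service name below are assumptions made up for illustration, not taken from our real manifests, and the truthiness test assumes strict_variables is not enabled:

```puppet
# Hypothetical ssh class: prefers a node-local fact, else the default port.
class ssh {
  # 'ssh_port_override' is an assumed fact name. Facts arrive as top-scope
  # variables; an unset one evaluates as false here.
  if $::ssh_port_override {
    $ssh_port = $::ssh_port_override
  } else {
    $ssh_port = '22'
  }

  # The (assumed) template reads $ssh_port when rendering the Port line.
  file { '/etc/ssh/sshd_config':
    ensure  => file,
    content => template('ssh/sshd_config.erb'),
    notify  => Service['sshd'],
  }

  service { 'sshd':
    ensure => running,
    enable => true,
  }
}
```

Because the value is resolved inside the class at the point of use, it no longer matters which node definition, inherited or not, pulled the class in.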

We used to favour the first method, but have now moved to the second because it feels cleaner and does everything in one run. (Also, the facts-dot-d plugin has been deprecated.) In the future, we’ll probably move to Hiera once the Puppet release we’re using has it built in. Either approach ensures that you can set and obtain a value at the time of use of the class, rather than relying on a lower scope to influence something that it can’t.
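
For comparison, the Hiera version of the same lookup might look something like this. It is a sketch only: it assumes a Puppet release with the hiera() function available, a working hierarchy with a per-node level, and a key named ssh_port that only the exceptional node's data defines.

```puppet
# Sketch: same idea as the extlookup call, using Hiera instead.
class ssh {
  # Returns the node-specific value if the hierarchy has one,
  # otherwise the default given as the second argument.
  $ssh_port = hiera('ssh_port', 22)

  # ... sshd_config template and service resources as before ...
}
```

On later Puppet releases the equivalent call would be lookup('ssh_port', { 'default_value' => 22 }), since hiera() was eventually deprecated in its favour.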

There’s a third way, which is to gather all your common includes and settings into a single, general class and then include that on each node. We do that too, for some things. We even pass in parameters that control whether certain individual classes get included or not (e.g. nodes that have their own custom MTA config because the customer needs to handle mail locally or, for some unearthly reason, wants DKIM, don’t get the generic mail client class). Here’s another example: we set up all our admin accounts via Puppet, and assign fixed UIDs to them. But we have some legacy nodes where the customer already assigned some of the same UIDs for their application. In these cases, we ‘turn off’ the admin user class via a boolean parameter to the general class. (In the future, we might define a per-node base UID value to allow our accounts to be remapped to an available range, again via an external lookup.)
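
A minimal sketch of that general class, with boolean switches for the optional parts, could look like the following. All of the class names (resolver, snmp, admin_users, mail_client), the parameter names and the node name are hypothetical stand-ins rather than our actual modules:

```puppet
# Hypothetical 'general' class: common settings plus opt-out switches.
class general (
  $manage_admin_users = true,
  $manage_mail_client = true
) {
  # Always applied, on every node.
  include resolver
  include snmp

  # Skipped on nodes where our fixed UIDs clash with the customer's.
  if $manage_admin_users {
    include admin_users
  }

  # Skipped on nodes that carry their own custom MTA config.
  if $manage_mail_client {
    include mail_client
  }
}

# A legacy node that must not receive our admin accounts.
node 'legacy01.example.com' {
  class { 'general':
    manage_admin_users => false,
  }
}
```

Ordinary nodes just declare the class with no parameters and get everything; only the awkward ones need to say what they are opting out of.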

Because inevitably, ‘general’ turns out not to be quite general enough.

My larger point is: don’t be fooled by many of the Puppet examples into thinking that simple, homogenised platforms are the typical use case. Puppet (and its documentation) likes your nodes to be as similar as possible, but if your real-world implementation is messier than that, you can still adapt your Puppet configuration - yes, perhaps with some mess - to work around the differences.