Lead Image © Giuseppe Ramos, 123RF.com and Paulus Rusyanto, Fotolia.com

Choosing between the leading open source configuration managers

Puppet or Chef?

Puppet and Chef are competing open source tools for configuration management. Which tool is right for your network? Read on for some pros and cons. By Martin Loschwitz

The migration to the cloud went hand in hand with a paradigm change in the design of modern IT solutions. Where IT configurations were once small, manageable, and defined at the department level, the big picture has now become the focus of attention: Setups from dozens of endpoints need to work as one well-integrated unit.

Automation plays a key role. With more systems covering larger spaces, many admins prefer to invest their time up front in labor-saving custom tools that will save time and energy on repetitive tasks.

Puppet [1] and Chef [2] are two open source configuration management tools that skillfully support admins with comprehensive, automated server management. Chef and Puppet are direct competitors. Admins who use at least one of these solutions consistently and correctly can significantly simplify their everyday lives.

However, Puppet and Chef both have pitfalls that you'll need to steer around. This article is a kind of inventory, based on my experience, of some of the issues concerning practical operations with Chef or Puppet. You will find countless comparisons of Puppet and Chef [3] on the web; depending on the source, sometimes Puppet wins, and sometimes Chef prevails – the choice ultimately comes down to the details. Read on for some issues to consider before you choose: Puppet or Chef?

Declarative or Imperative?

Chef is essentially imperative; Puppet is declarative. In concrete terms, with Chef the admin defines which commands need to be processed, and in what order, by creating a cookbook that lists the commands in the appropriate sequence. In Puppet, admins instead describe a desired state in a Puppet manifest, and Puppet works out the steps that lead to this state.
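The contrast can be sketched in a few lines of Python. This is a toy model, not Chef or Puppet code: the "system" is just a dictionary of package versions, and the function names and version numbers are made up for illustration.

```python
# Toy model: a "system" is a dict mapping package name -> installed version.

def imperative_run(system, commands):
    """Execute commands in the order written; each one acts unconditionally."""
    log = []
    for action, name, version in commands:
        if action == "install":
            system[name] = version
        elif action == "remove":
            system.pop(name, None)
        log.append(f"{action} {name}")
    return log


def declarative_run(system, desired):
    """Compare current and desired state and change only what differs."""
    log = []
    for name, version in desired.items():
        if system.get(name) != version:
            system[name] = version
            log.append(f"converge {name} -> {version}")
    return log


host = {"ntp": "4.2.6"}
desired = {"ntp": "4.2.8", "openssh": "6.6"}
print(declarative_run(host, desired))  # two changes on the first run
print(declarative_run(host, desired))  # nothing left to do on the second
```

The imperative variant repeats its commands on every run, whereas the declarative variant derives the work from the difference between the current and the desired state.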

Resources can be connected via dependencies. One faulty run of the Puppet agent on a system does not necessarily mean that the node is unreachable. To fulfill dependencies – even beyond the borders of hosts – it might be necessary to run Puppet multiple times on a host.
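How dependencies shape a run can be illustrated with a small Python sketch. It assumes a simplified model in which a resource is applied only if everything it requires was applied successfully; the resource names are hypothetical, and the real Puppet agent is considerably more elaborate.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def apply_in_order(dependencies, failing=()):
    """Apply resources so that each one runs after everything it requires.

    dependencies maps a resource name to the set of resources it requires.
    Resources listed in failing simulate an error; anything that depends on
    a failed or skipped resource is skipped instead of being applied.
    """
    status = {}
    for resource in TopologicalSorter(dependencies).static_order():
        required = dependencies.get(resource, ())
        if any(status[r] != "applied" for r in required):
            status[resource] = "skipped"
        elif resource in failing:
            status[resource] = "failed"
        else:
            status[resource] = "applied"
    return status


deps = {
    "service:ntpd": {"file:/etc/ntp.conf"},
    "file:/etc/ntp.conf": {"package:ntp"},
    "package:ntp": set(),
}
print(apply_in_order(deps, failing={"file:/etc/ntp.conf"}))
```

In this example, the package is applied, the configuration file fails, and the service is skipped. A later run that no longer fails picks up the remaining resources, which is one reason why several runs can be necessary.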

Ultimately, the Puppet agent always leaves a system in a consistent state, although consistent does not necessarily mean that all the steps were actually performed on the host. A design difference of this magnitude is rarely the decisive reason for or against Puppet or Chef, but it does significantly contribute to acceptance of the solution within the admin team. Once one of the solutions has been introduced, switching to the other is not easy; a careful evaluation of the options in advance is therefore necessary in any case.

The Typical Setup

The issue of resources is of overriding importance for both Puppet and Chef. A typical Puppet setup is based on a central server that provides its surrounding hosts with data. This host is known as the Puppet master server; the agents on all your computers call it to obtain information about their own configurations.

The Puppet master server can sometimes become the bottleneck. If you have watched a Puppet agent at work, you might have noticed the log line that reads Fetching Service Catalog. The service catalog is basically a list of all the resources that will be managed by Puppet on each host.

A resource, in turn, is an individual item to be managed, whether a configuration file you need to install or a service you need to launch. The devil is in the details again, because the agent does not generate its service catalog on the target server itself. Instead, the master server generates the catalog, then the client downloads it and works through the list of resources.

In smaller installations, the system load for generating service catalogs is not a problem because the requests from the Puppet agents are distributed over time. Individual catalogs cause very little load on the master server if the system is powerful enough. The master servers in larger setups quickly start to suffer under the burden of many simultaneous requests (Figure 1).

Figure 1: In Puppet, a large part of the load is typically borne by the Puppet server. Configuration steps are compiled as a catalog on the server and then sent to the client.

The situation can deteriorate to the extent that individual clients can often no longer download their catalogs when other clients are also running Puppet agent calls.
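Why the distribution of requests over time matters so much can be shown with a toy simulation in Python. It is not Puppet code; it merely assumes that every agent is due at the same moment and optionally waits a random number of minutes first. The agent counts and the 30-minute window are arbitrary example values.

```python
import random
from collections import Counter


def peak_requests_per_minute(agents, splay_minutes, seed=0):
    """Return the largest number of catalog requests landing in one minute.

    Every agent is due at minute 0. With splay_minutes > 0, each agent waits
    a random whole number of minutes in [0, splay_minutes) before it calls
    the master server.
    """
    rng = random.Random(seed)  # fixed seed keeps the simulation repeatable
    starts = Counter(
        rng.randrange(splay_minutes) if splay_minutes > 0 else 0
        for _ in range(agents)
    )
    return max(starts.values(), default=0)


print(peak_requests_per_minute(200, 0))   # all agents hit the master at once
print(peak_requests_per_minute(200, 30))  # requests spread over half an hour
```

Without the random delay, the master has to compile all 200 catalogs in the same minute; with it, the peak drops to a fraction of that number.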

One solution to this problem is equipping the Puppet master server with more powerful hardware. Many CPUs, a good helping of RAM, and an SSD can help alleviate the bottleneck effect on the master. If you use Chef, you are lucky, because Chef is designed differently: The clients generate their configurations themselves, even though a central Chef server is used.

Another solution to the potential bottleneck caused by the Puppet master is a "masterless" configuration, that is, a setup that does not rely on a master server. The advantage of eliminating the master is that there is no longer a central instance causing performance problems. However, this approach is fraught with security risks: Each server stores all parts of the Puppet configuration that relate to it – possibly including the definitions of other servers; security-wise, a masterless setup is a real nightmare.

Passwords

It has become customary to avoid storing passwords in plain text in files, but in the standard version of Puppet, you have no other choice. Even passwords that are stored within Puppet on the Ruby-based Facter framework [4] are unencrypted. If you run a Puppet setup without a master, almost all of the configuration files, including the passwords they contain, must reside on the hosts themselves. This type of design is hacker heaven: If you crack a box, you can capture several passwords for different services and thus explore the network almost at will.

Chef used to suffer from a similar problem, but now Encrypted Data Bags [5] let you store information server side in encrypted form. If you use Puppet, you must make do with third-party software. To properly protect the data from prying eyes, use Hiera and the Hiera extension hiera-eyaml [6] by Tom Poulton.

However, the Hiera solution describes a specific Puppet setup, which is unlikely to exist in most corporations. Anyone who operates a classic Puppet environment without Hiera currently has no options for encryption. This problem is aggravated in masterless Puppet setups, given the need to protect the systems against unauthorized access.

Rollback

Puppet and Chef also take quite different approaches when it comes to rolling back actions. The procedure is clear with Chef: Because a cookbook is essentially a collection of commands, you can logically define a counter-function for each function. If a cookbook contains a rollback parameter, you can call it to revoke the changes already implemented and restore a previous state. Many Chef cookbooks use rollback parameters already.
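The idea of pairing every function with a counter-function can be sketched generically in Python. This is not Chef's API, only an illustration of the pattern; the function name and the example steps are hypothetical.

```python
def run_with_rollback(steps):
    """Run (do, undo) pairs in order; on failure, undo completed steps in reverse.

    Returns True if every step succeeded and False if a rollback was needed.
    """
    undo_stack = []
    try:
        for do, undo in steps:
            do()
            undo_stack.append(undo)  # only completed steps need undoing
    except Exception:
        while undo_stack:
            undo_stack.pop()()  # most recent change is reverted first
        return False
    return True


installed = []


def broken_step():
    raise RuntimeError("simulated failure")


steps = [
    (lambda: installed.append("ntp"), lambda: installed.remove("ntp")),
    (lambda: installed.append("apache"), lambda: installed.remove("apache")),
    (broken_step, lambda: None),
]
print(run_with_rollback(steps), installed)
```

Because the third step fails, the two earlier changes are reverted in reverse order, and the list is empty again at the end.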

In Puppet, the issue is more complex. Because of the previously mentioned declarative nature of the host configuration, a Puppet setup only defines a desired state; Puppet doesn't have a clue what the previous state on a system looked like following a call, although it does know what the current state should look like.

A useful rollback is basically impossible with Puppet. To undo configuration changes, the admin needs to store the desired previous state in the form of a manifest in Puppet and then apply it to the affected hosts. This task can be tricky in extreme cases: When Puppet overwrites files during a call, it does not create a backup.

Admins thus need to create a backup manually before they deploy changes with Puppet. Moreover, best practices dictate sending any changes through a staging installation before rolling them out into production.
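A manual backup of this kind does not need to be complicated. The following Python sketch copies a file to a timestamped backup before overwriting it; the function name, the backup directory layout, and the file naming scheme are assumptions made for this example.

```python
import shutil
import time
from pathlib import Path


def overwrite_with_backup(path, new_content, backup_dir):
    """Copy the existing file into backup_dir, then write new_content to path.

    Returns the path of the backup, or None if there was no previous file.
    """
    path = Path(path)
    backup_dir = Path(backup_dir)
    backup = None
    if path.exists():
        backup_dir.mkdir(parents=True, exist_ok=True)
        stamp = time.strftime("%Y%m%d-%H%M%S")
        backup = backup_dir / f"{path.name}.{stamp}.bak"
        shutil.copy2(path, backup)  # copy2 also preserves timestamps and mode
    path.write_text(new_content)
    return backup
```

Restoring the previous state then comes down to copying the backup file back into place.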

Scary Updates

The last and biggest stumbling block I will describe relates to updating components that belong to an automated installation. Chef has left a lasting impression with admins in this regard, and not a good one.

Time and time again, changes to the Chef server in the scope of Chef updates have affected compatibility with previous versions. At worst, the admin then sits sweating in front of a dead server, wondering what to do next. Puppet has largely been spared problems with server and client updates, at least so far, but that does not mean you can always expect things to go well when system components are updated.

It is typically not the Puppet components themselves that cause trouble, but Puppet modules off the Internet. The modules are typically only quality-controlled by their own authors. Whatever modules you install yourself from Puppet Forge should therefore be regarded as untested (Figure 2).

Figure 2: Quality checks for Puppet modules, like this one for OpenStack Neutron, are mostly left entirely to the developer, which can lead to problems in the real world.

Module updates can also be a problem, especially when they are extensive: The Puppet modules for OpenStack bear witness to the disaster that awaits admins in a worst-case scenario.

The update of OpenStack Havana to Icehouse, for example, introduced so many new features and incompatibilities that manifests for the predecessors were practically useless. When combined with the lack of rollback mechanisms in Puppet, this kind of setup necessitates an appropriate staging system that gives admins the ability to test new modules or configurations customized for the new situation.

Quality of External Modules

For both Puppet and Chef, the quality of many external modules leaves much to be desired. Puppet users experience this problem over and over when they need to integrate a module off the web with one of the External Node Classifiers; I'm referring to front ends such as Foreman [7], the Puppet Dashboard, or, more frequently of late, Hiera.

The UIs work similarly: Using classes, admins can assign individual functions to their hosts. For this principle to work, however, all essential functions of a Puppet module must be available as a class.
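The principle behind such a classifier can be sketched in Python: Given a host name, it answers with a YAML document that lists the classes assigned to that host. The host names, class names, and the in-memory mapping below are hypothetical; a real front end keeps this information in its own database.

```python
# Hypothetical host-to-class mapping for illustration only.
NODE_CLASSES = {
    "web01.example.com": ["ntp", "apache"],
    "db01.example.com": ["ntp", "mysql::server"],
}


def classify(node_name, mapping=NODE_CLASSES, environment="production"):
    """Return a YAML document listing the classes assigned to node_name."""
    classes = mapping.get(node_name, [])
    if classes:
        lines = ["---", "classes:"]
        lines += [f'  - "{name}"' for name in classes]
    else:
        lines = ["---", "classes: []"]  # unknown hosts get no classes
    lines.append(f'environment: "{environment}"')
    return "\n".join(lines) + "\n"


print(classify("web01.example.com"))
```

The sketch also shows the limitation: The classifier can only hand out class names, so any module function that is not wrapped in a class remains out of reach.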

Many Puppet modules assume that the host configuration is defined in a site manifest (site.pp) and that functions for hosts can be called without any detours – but this is a function the graphical front ends usually lack (Figure 3).

Figure 3: In some cases, whole classes or groups disappear from Puppet modules or Chef cookbooks – much to the dismay of the admins during upgrades.

The excitement of discovering precisely the Puppet module you need can quickly turn to frustration if the module comes without wrapper classes for its functions. With Chef, the variation in quality is just as wide as with Puppet (Figure 4), although many US corporations prefer to use Chef with cookbooks rather than Puppet with modules.

Figure 4: Chef comes with its own pitfalls, although they are different from the problems you'll face with Puppet.

Conclusions

An automated configuration manager is an important feature of many modern networks, but it can pose some very specific problems. The leading open source candidates, Puppet and Chef, each do many things well, but neither is perfect.

Which tool to use could depend on the details of your network. This article highlighted some considerations that might help you when you choose between Puppet and Chef.