Ansible vs Puppet – An Overview of the Solutions

This is part 1/2 in a series. For part #2 see: Ansible vs Puppet – Hands-On with Ansible.

Having recently joined Delphix in a DevOps role, I was tasked with increasing the use of a configuration management tool within our R&D infrastructure. Because I had used Puppet for the past three years at VMware and Virtual Instruments, I was inclined to deploy another Puppet-based infrastructure at Delphix, but before doing so I thought I’d take the time to research the new competitors in this landscape.

Opinions From Colleagues

I started by reaching out to three former colleagues and asking for their experiences with alternatives to Puppet, keeping in mind that two of the three are seasoned Chef experts. Rather than giving me a blanket recommendation, they responded with questions of their own:

  • How does the application stack look? Is it a typical web server, app, and db stack, or something more complex?
  • Are you going to be running the nodes in-house, in the cloud, or hybrid? What type of hypervisors are involved?
  • What is the future growth? (i.e., number of nodes/environments)
  • How deeply will the eng team be involved in writing recipes, manifests, or playbooks?

After describing Delphix’s infrastructure and use-cases (in-house VMware; not a SaaS company/product, though there are some services living in AWS; fewer than 100 nodes; I expect the Engineering team to be involved in writing manifests/cookbooks), I received the following recommendations:

Colleague #1: “Each CM tool will have its learning curve, but I believe the decision will come down to what the engineers will adapt to most easily. I would tend to lean towards Ansible for the mere fact that it’s the easiest to implement. There are no agents to be installed, and it works well with vSphere/Packer. Your engineers will most likely be better off on it since they already know Python.”

Colleague #2: “I would say that if you don’t have a Puppet champion (someone that knows it well and is pushing to have it running), it is probably not worth it for this. Chef would be a good solution, but it has a serious learning curve, and from your details it is not worth it either. I would go with something simple (at least in this phase); I think Ansible would be a good fit for this stage. Maybe even with some Docker here or there ;). There will be no state or complicated cluster configs, but you don’t seem to need that. You don’t have to install any agent and just need SSH.”

The third colleague also agreed that I should take a good look at Ansible before starting a Puppet deployment.

Researching Ansible on My Own

Having gotten similar recommendations from all three of my former colleagues, I spent some time looking into Ansible. Here are some major differences between Ansible & Puppet that I gathered from my research:

  • Server Nodes
    • Puppet infrastructure generally contains one (or more) “puppetmaster” servers, along with a special agent package installed on each client node. (Surprisingly, the Puppetization of the puppetmaster server itself is an issue that does not have a well-defined solution.)
    • Ansible has neither a special master server, nor special agent executables to install. The executor can be any machine with a list (inventory) of the nodes to contact, the Ansible playbooks, and the proper SSH keys/credentials in order to connect to the nodes.
  • Push vs Pull
    • Puppet nodes have special client software and periodically check into a puppet master server to “pull” resource definitions.
    • Ansible follows a “push” workflow. The machine that Ansible runs from SSHes into the client machines to copy files, install packages remotely, and so on (see the playbook sketch after this list). The client machines require no special setup beyond a working installation of Python 2.5+.
  • Resources & Ordering
    • Puppet: Resources defined in a Puppet manifest are not applied in the order of their appearance (e.g. top to bottom), which is confusing for people coming from conventional programming languages (C, Java, etc.). Instead, resources are applied in an effectively arbitrary order unless explicit resource ordering is used, e.g. “before”, “require”, or chaining arrows.
    • Ansible: Tasks in a playbook are applied top-to-bottom, in the order they appear in the file, which is more intuitive for developers coming from other languages. An example playbook that can be read top-down: https://gist.github.com/phips/aa1b6df697b8124f1338
  • Resource Dependency Graphs
    • Puppet internally creates a directed graph of all of the resources defined for a system, along with the order in which they should be applied. This is a robust way of representing the resources to be applied, and Puppet can even generate a graph file so that one can visualize everything that Puppet manages. On the downside, building this graph is susceptible to “duplicate resource definition” errors (e.g. multiple definitions of a given package, user, file, etc.). Also, conflicting rules from a large collection of third-party modules can lead to circular dependencies.
    • Since Ansible is essentially a thin wrapper for executing commands over SSH, no resource dependency graph is built internally. One could view this as a weakness compared with Puppet’s design, but it also means that “duplicate resource” errors are avoided entirely. The simpler design also makes it easier for new users to understand the flow of a playbook.
  • Batteries Included vs DIY
  • Language Extensibility
  • Syntax
  • Template Language
    • Puppet templates are based upon Ruby’s ERB.
    • Ansible templates are based upon Jinja2, whose syntax is closely modeled on Django’s templating language. Most R&D orgs are likely to have more experience with Python/Django than with Ruby/ERB. (See the template sketch after this list.)
  • DevOps Tool Support
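
To make the agentless “push” model and the top-down ordering concrete, here is a minimal sketch of an Ansible playbook. It is purely illustrative: the group name, package, and service are hypothetical, and it assumes an inventory file listing the target hosts plus SSH access to them.

  # site.yml -- a hypothetical playbook; run it from any machine with SSH
  # access to the nodes, e.g.: ansible-playbook -i inventory site.yml
  - hosts: webservers          # group defined in the inventory file
    become: true               # escalate to root on the remote hosts
    tasks:
      # Tasks are applied in the order they appear, top to bottom
      - name: Install nginx
        yum:
          name: nginx
          state: present

      - name: Start and enable nginx
        service:
          name: nginx
          state: started
          enabled: true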
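
Similarly, here is a small illustration of Jinja2 templating as it surfaces in Ansible playbooks: a hypothetical template task plus a Jinja2 expression with a default filter. The file names and variables are invented for this example.

  # Hypothetical tasks showing Jinja2 in use; nginx.conf.j2 would itself
  # contain Jinja2 markup such as "listen {{ http_port }};"
  - hosts: webservers
    vars:
      http_port: 8080
    tasks:
      - name: Render the nginx config from a Jinja2 template
        template:
          src: nginx.conf.j2
          dest: /etc/nginx/nginx.conf

      - name: Show the chosen port, using a Jinja2 default filter
        debug:
          msg: "nginx will listen on {{ http_port | default(80) }}"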

Complexity & Learning Curve

I wanted to call this out even though it’s already been mentioned several times above. Throughout my conversations with my colleagues and in my own research, it seemed that Ansible was the winner in terms of ease of adoption. In terms of client/server setup, resource ordering, “batteries included”, Python vs Ruby, and Jinja vs ERB, I got the impression that teammates in my R&D org would come to a working understanding of Ansible more quickly.

Supporting this sentiment was a post I found about the ride-sharing startup Lyft abandoning Puppet because their dev team had difficulty learning it. From “Moving Away from Puppet”:

“[The Puppet code base] was large, unwieldy and complex, especially for our core application. Our DevOps team was getting accustomed to the Puppet infrastructure; however, Lyft is strongly rooted in the concept of ‘If you build it you run it’. The DevOps team felt that the Puppet infrastructure was too difficult to pick up quickly and would be impossible to introduce to our developers as the tool they’d use to manage their own services.”

Getting Hands-On

Having done my research on Puppet vs Ansible, I was now ready to dive in and implement some sample projects. How did my experience with Ansible turn out? Read on in Part #2 of this series: Ansible vs Puppet – Hands-On with Ansible.

Building Vagrant Boxes with Nested VMs using Packer

In “Improving Developer Productivity with Vagrant” I discussed the productivity benefits gained from using Vagrant in our software development tool chain. Here are some more details about the mechanics of how we created those Vagrant boxes as part of every build of our product.

Using Packer to Build VMware-Compatible Vagrant Boxes

Packer is a tool for creating machine images, also written by HashiCorp, the authors of Vagrant. It can build machine images for almost any type of environment, including Amazon AWS, Docker, Google Compute Engine, KVM, Vagrant, VMware, Xen, and more.

We used Packer’s built-in VMware builder and Vagrant post-processor to create the Vagrant boxes for users to run on their local desktops/laptops via VMware Fusion or Workstation.

Note: This required each user to install Vagrant’s for-purchase VMware plugin. In our local usage of Vagrant boxes we found that the VMware virtualization providers delivered far better I/O performance and stability than the free Oracle VirtualBox provider. In short, the for-purchase Vagrant-VMware plugin was worth every penny!
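
For context, here is a minimal sketch of what such a Packer template might look like, with a vmware-iso builder feeding a vagrant post-processor. The ISO URL, checksum, credentials, and output name are placeholders rather than our actual configuration.

  {
    "builders": [
      {
        "type": "vmware-iso",
        "iso_url": "http://example.com/path/to/os-installer.iso",
        "iso_checksum": "PLACEHOLDER_CHECKSUM",
        "iso_checksum_type": "sha256",
        "ssh_username": "vagrant",
        "ssh_password": "vagrant",
        "shutdown_command": "sudo shutdown -P now"
      }
    ],
    "post-processors": [
      {
        "type": "vagrant",
        "output": "our-product-{{.Provider}}.box"
      }
    ]
  }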

Running VMware Workstation VMs Nested in ESXi

One of the hurdles I came across in integrating Vagrant box builds into our existing build system is that Packer’s VMware builder needs to spin up a VM using Workstation or Fusion in order to configure the Vagrant box. Given that our builds were already running in static VMs, this meant that we needed to be able to run Workstation VMs nested within an ESXi VM with a Linux guest OS!

This sort of VM nesting was somewhat complicated to set up in the days of vSphere 5.0, but it has become a lot simpler with vSphere 5.1+. One just needs to make sure that the ESXi VMs are running with “Virtual Hardware Version 9” or newer, and that “Hardware assisted virtualization” is enabled for the VM within the vSphere web client.

Here’s what the correct configuration for supporting nested VMs looks like:

[Screenshot: vSphere web client VM settings showing “Hardware assisted virtualization” enabled]
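
For reference, that checkbox corresponds, as far as I can tell, to the following entry in the VM’s .vmx file, which can be handy for verifying the setting outside of the GUI:

  vhv.enable = "TRUE"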

Packer’s Built-in Remote vSphere Hypervisor Builder

One question that an informed user of Packer may correctly ask is: “Why not use Packer’s built-in Remote vSphere Hypervisor Builder and create the VM directly on ESXi? Wouldn’t this remove the need for running nested VMs?”

I agree that this would be a better solution in theory. There are several reasons why I chose to go with nested VMs instead:

  1. The “Remote vSphere Hypervisor Builder” requires manually running an “esxcli” command on your ESXi boxes to enable some sort of “GuestIP hack”. Doing this type of configuration on our production ESXi cluster seemed sketchy to me.
  2. The “Remote vSphere Hypervisor Builder” doesn’t work through vSphere, but instead SSHes directly into your ESXi boxes as a privileged user in order to create the VM. The login credentials for that privileged ESXi/SSH user must be kept in the Packer build script or some other part of our build system. Again, this seemed less than ideal to me.
  3. As far as I can tell from the docs, the “Remote vSphere Hypervisor Builder” only works with the “vmware-iso” builder and not the “vmware-vmx” builder. This would’ve painted us into a corner as we had plans to switch from the “vmware-iso” builder to the “vmware-vmx” builder once it had become available.
  4. The “Remote vSphere Hypervisor Builder” was not available when I implemented our nested VM solution because we were early adopters of Packer. It was easier to stick with a working solution that we already had 😛

Automating the Install of VMware Workstation via Puppet

One other mechanical piece I’ll share is how we automated the installation of VMware Workstation 10.0 into our static build VMs. Since all of the build VM configuration is done via Puppet, we could automate the installation of Workstation 10 with the following bit of Puppet code:

  # Install VMware Workstation 10 from the installer bundle under /mnt/devops
  $vmware_installer = '/mnt/devops/software/vmware/VMware-Workstation-Full-10.0.0-1295980.x86_64.bundle'
  # Installer flags: accept the EULAs and show only the required prompts
  $vmware_installer_options = '--eulas-agreed --required'

  exec { 'Install VMware Workstation 10':
    command => "${vmware_installer} ${vmware_installer_options}",
    # The installer creates this file, so its presence keeps the exec idempotent
    creates => '/usr/lib/vmware/config',
    user    => 'root',
    # The /mnt/devops mount must be in place and kernel headers installed first
    require => [Mount['/mnt/devops'], Package['kernel-default-devel']],
  }