Testing Ansible Roles with Test Kitchen

Recently while attending DevOps Days Austin 2015, I participated in a breakout session focused on how to test code for configuration management tools like Puppet, Chef, and Ansible. Having started to use Ansible to manage our infrastructure at Delphix I was searching for a way to automate the testing of our configuration management code across a variety of platforms, including Ubuntu, CentOS, RHEL, and Delphix’s custom Illumos-based OS, DelphixOS. Dealing with testing across all of those platforms is a seemingly daunting task to say the least!

Intro to Test Kitchen

The conversation in that breakout session introduced me to Test Kitchen (GitHub), a tool that I’ve been very impressed by and have had quite a bit of fun writing tests for. Test Kitchen is a tool for automated testing of configuration management code written for tools like Ansible. It automates the process of spinning up test VMs, running your configuration management tool against those VMs, executing verification tests against those VMs, and then tearing down the test VMs.

What’s makes Test Kitchen so powerful and useful is its modular design:

Using Test Kitchen

After learning about Test Kitchen at the DevOps Days conference, I did some more research and stumbled across the following presentation which was instrumental in getting started with Test Kitchen and Ansible: Testing Ansible Roles with Test Kitchen, Serverspec and RSpec (SlideShare).

In summary one needs to add three files to their Ansible role to begin using Test Kitchen:

  • A “.kitchen.yml” file at the top-level. This file describes:
    • The driver to use for VM provisioning. Ex: Vagrant, AWS, Docker, etc.
    • The provisioner to use. Ex: Puppet, Chef, Ansible.
    • A list of 1 or more operating to test against. Ex: Ubuntu 12.04, Ubuntu 14.04, CentOS 6.5, or even a custom VM image specified by URL.
    • A list of test suites to run.
  • A “test/integration/test-suite-name/test-suite-name.yml” file which contains the Ansible playbook to be applied.
  • One or more test files in “test/integration/test-suite-name/test-driver-name/”. For example, when using the BATS test-runner to run a test suite named “default”: “test/integration/default/bats/my-test.bats”.

Example Code

A full example of Test Kitchen w/Ansible is available via the delphix.package-caching-proxy Ansible role in Delphix’s GitHub repo. Here are direct links to the aforementioned files/directories:682240

Running Test Kitchen

Using Test Kitchen couldn’t be easier. From the directory that contains your “.kitchen.yml” file, just run “kitchen test” to automatically create your VMs, configure them, and run tests against them:

$ kitchen test
-----> Starting Kitchen (v1.4.1)
-----> Cleaning up any prior instances of 
-----> Destroying ...
 Finished destroying  (0m0.00s).
-----> Testing 
-----> Creating ...
 Bringing machine 'default' up with 'virtualbox' provider...
 ==> default: Importing base box 'opscode-ubuntu-14.04'...
==> default: Matching MAC address for NAT networking...
 ==> default: Setting the name of the VM: kitchen-ansible-package-caching-proxy-default-ubuntu-1404_default_1435180384440_80322
 ==> default: Clearing any previously set network interfaces...
 ==> default: Preparing network interfaces based on configuration...
 default: Adapter 1: nat
 ==> default: Forwarding ports...
 default: 22 => 2222 (adapter 1)
 ==> default: Booting VM...
 ==> default: Waiting for machine to boot. This may take a few minutes...

..  ...

-----> Running bats test suite
 ✓ Accessing the apt-cacher-ng vhost should load the configuration page for Apt-Cacher-NG
 ✓ Hitting the apt-cacher proxy on the proxy port should succeed
 ✓ The previous command that hit ftp.debian.org should have placed some files in the cache
 ✓ Accessing the devpi server on port 3141 should return a valid JSON response
 ✓ Accessing the devpi server via the nginx vhost should return a valid JSON response
 ✓ Downloading a Python package via our PyPI proxy should succeed
 ✓ We should still be able to install Python packages when the devpi contianer's backend is broken
 ✓ The vhost for the docker registry should be available
 ✓ The docker registry's /_ping url should return valid JSON
 ✓ The docker registry's /v1/_ping url should return valid JSON
 ✓ The front-end serer's root url should return http 204
 ✓ The front-end server's /_status location should return statistics from our web server
 ✓ Accessing http://www.google.com through our proxy should always return a cache miss
 ✓ Downloading a file that is not in the cache should result in a cache miss
 ✓ Downloading a file that is in the cache should result in a cache hit
 ✓ Setting the header 'X-Refresh: true' should result in a bypass of the cache
 ✓ Trying to purge when it's not in the cache should return 404
 ✓ Downloading the file again after purging from the cache should yield a cache miss
 ✓ The yum repo's vhost should return HTTP 200

 19 tests, 0 failures
 Finished verifying  (1m52.26s).
-----> Kitchen is finished. (1m52.49s)

And there you have it, one command to automate your entire VM testing workflow!

Next Steps

Giving individual developers on our team the ability to quickly run a suite of automated tests is a big win, but that’s only the first step. The workflow we’re planning is to have Jenkins also run these automated Ansible tests every time someone pushes to our git repo. If those tests succeed we can automatically trigger a run of Ansible against our production inventory. If, on the other hand, the Jenkins job which runs the tests is failing (red), we can use that to prevent Ansible from running against our production inventory. This would be a big win for validating infrastructure changes before pushing them to production.

ansible_logo_black_square

Advertisements

Ansible Role for Package Hosting & Caching

The Operations Team at Delphix has recently published an Ansible role called delphix.package-caching-proxy. We are using this role internally to host binary packages (ex. RPMs, Python “pip” packages), as well as to locally cache external packages that our infrastructure and build process depends upon.

This role provides:

It also provides the hooks for monitoring the front-end Nginx server through collectd. More details here.

Why Is this Useful?

This sort of infrastructure can be useful in a variety of situations, for example:

  • When your organization has remote offices/employees whose productivity would benefit from having fast, local access to large binaries like ISOs, OVAs, or OS packages.
  • When your dev process depends on external dependencies from services that are susceptible to outages, ex. NPM or PyPI.
  • When your dev process depends on third-party artifacts that are pinned to certain versions and you want a local copy of those pinned dependencies in case those specific versions become unavailable in the future.

Sample Usage

While there are a verity of configuration options for this role, the default configuration can be deployed with an Ansible playbook as simple as the following:

---
- hosts: all
  roles:
    - delphix.package-caching-proxy

Underneath the Covers

This role works by deploying a front-end Nginx webserver to do HTTP caching, and also configures several Nginx server blocks (analogous to Apache vhosts) which delegate to Docker containers for the apps that run the Docker Registry, the PyPI server, etc.

Downloading, Source Code, and Additional Documentation/Examples

This role is hosted in Ansible Galaxy at https://galaxy.ansible.com/list#/roles/3008, and the source code is available on GitHub at: https://github.com/delphix/ansible-package-caching-proxy.

Additional documentation and examples are available in the README.md file in the GitHub repo, at: https://github.com/delphix/ansible-package-caching-proxy/blob/master/README.md

Acknowledgements

Shoutouts to some deserving folks:

  • My former Development Infrastructure Engineering Team at VMware, who proved this idea out by implementing a similar set of caching proxy servers for our global remote offices in order to improve developer productivity.
  • The folks who conceived the Snakes on a Plane Docker Global Hack Day project.

How Should I Get Application Configuration into my Docker Containers?

A commonly asked question by folks that are deploying their first Docker containers into production is, “How should I get application configuration into my Docker container?” The configuration in question could be settings like the number of worker processes for a web service to run with, JVM max heap size, or the connection string for a database. The reality is that there are several standard ways to do this, each with their own pros and cons.

The ~3.5 Ways to Send Configuration to your Dockerized Apps

1. Baking the Configuration into the Container

Baking your application’s configuration into a Docker image is perhaps the easiest pattern to understand. Basically one can use commands within the “Dockerfile” to drop configuration files into the right places via the Dockerfile’s COPY directive, or modify those configuration files at image build time with “sed” or ”echo” via the RUN command.

If there’s a container available on the Docker Hub Registry that does everything you want save for one or two config settings, you could fork that “Dockerfile” on GitHub, make modifications to the “Dockerfile” in your GitHub fork to drop in whatever configuration changes you want, then add it as a new container on the Docker Hub Registry.

That is what I did for:

Pros:

Cons:

  • Since the configuration is baked into the image, any future configuration changes will require additional modifications to the image’s build file (its “Dockerfile”) and a new build of the container image itself.

2a. Setting the Application Configuration Dynamically via Environment Variables

This is a commonly-used pattern for images on the Docker Hub Registry. For an example, see the environment variables “POSTGRES_USER” and “POSTGRES_PASSWORD” for the official PostgreSQL Docker image.

Basically, when you “docker run” you will pass in pre-defined environment variables like so: "docker run -e SETTING1=foo -e SETTING2=bar ... <image name>". From there, the container’s entry point (startup script) will look for those environment variables, and “sed” or “echo” them into whatever relevant config files the application uses before actually starting the app.

It’s worth mentioning that the container’s entry point script should contain reasonable defaults for each of those environment variables if the invoker does not pass those environment variables in, so that the container will always be able to start successfully. 

Pros:

  • This approach makes your container more dynamic in terms of configuration. 

Cons:

  • You are sacrificing dev/prod parity because now folks can configure the container to behave differently in dev & prod.
  • Some configurations are too complex to model with simple key/value pairs, for example an nginx/apache virtual host configuration.

2b. Setting the Application Configuration Dynamically via Environment Variables

This is a similar idea to using environment variables to pass in configuration, but instead the container’s startup script will reach out to a key-value (KV) store on the network like Consul or etcd to get configuration parameters.

This makes it possible to do more complex configurations than is possible with simple environment variables, because the KV store can have a hierarchical structure of many levels. It’s worth noting that widely-used tooling exists for grabbing the values from the KV store substituting them into your config files. Tools like confd even allow for automatic app-reloading upon changes to the KV configuration. This allows you to make your app’s configuration truly dynamic!

See:

Pros:

  • This approach makes your container more dynamic in terms of configuration.
  • The KV store allows for more complex & dynamic configuration information 

Cons:

  • This introduces an external dependency of the KV store, which must be highly-available.
  • You are sacrificing dev/prod parity because now folks can configure the container to behave differently in dev & prod.

3. Map the Config Files in Directly via Docker Volumes

Docker Volumes allow you to map any file/directory from the host OS into a container, like so: “docker run -v <source path>:<dest path> ...”

Therefore if the config file(s) for your containerized app happened to be available on the filesystem of the base OS, then you could map that config file (or dir) into the container. Ex:

“docker run -v /home/dan/my_statsd_config.conf:/etc/statsd.conf hopsoft/graphite-statsd”

Pros:

  • You don’t have to modify the container to get arbitrary configurations in.

Cons:

  • You lose dev/prod parity because now your app’s config can be anything
  • If you’re doing this in production, now you have to get that external config file onto the base OS for sharing into the container (a configuration management tool like Ansible, Chef, or Puppet comes in handy here)

Conclusion

As you can see, there are many potential ways to get application configuration to your Dockerized apps, each with their own trade-offs. Which way is best? It really depends on how much dynamism you require and whether or not you want the extra burden of managing dependencies like a KV store.

Ansible vs Puppet – Hands-On with Ansible

This is part 2/2 in a series. For part #1 see: Ansible vs Puppet – An Overview of the Solutions.

Notes & Findings From Going Hands-On with Ansible

After playing with Ansible for a week to Ansible-ize Graphite/Grafana (via Docker) and Jenkins (via an Ansible Galaxy role), here are my notes about Ansible:

  • “Batteries Included” and OSS Module Quality
    • While Ansible does include more modules out of the box, the “batteries included” claim is misleading. IMO an Ansible shop will have to rely heavily upon Ansible Galaxy to find community-created modules (Ex: for installing Jenkins, dockerd, or ntp), just as a Puppet shop would have to rely upon PuppetForge.
    • The quality and quantity of the modules on Ansible Galaxy is about on par with what is available at PuppetForge. Just as with PuppetForge, there are multiple implementations for any given module (ex: nginx, ntp, jenkins), each with their own quirks, strengths, and deficiencies.
    • Perhaps this is a deficiency of all of the configuration management systems. Ultimately a shop’s familiarity with Python or Ruby may add some preference here.
  • Package Installations
    • Coming from Puppet-land this seemed worthy of pointing out: Ansible does not abstract an OS’s package manager the same way that Puppet does with the “package” resource. Users explicitly call out the package manager to be used. Ex: the “apt” module or “yum” module. One can see that Ansible provides a tad bit less abstraction. FWIW a package installed via “pip” or “gem” in Puppet still requires explicit naming of the package provider. Not saying that either is better or worse here. Just a noticable difference to an Ansible newbie.
  • Programming Language Constructs
  • Noop Mode
  • Agent-less
    • Ansible’s agent-less, SSH-based push workflow actually was notably easier to deal with than a Puppetmaster, slave agents, SSL certs, etc.
  • Learning Curve
    • If I use my imagination and pretend that I was starting to use a configuration management tool for the first time, I perceive that I’d have an easier time picking up Ansible. Even though I’m not a fan of YAML by any stretch of the imagination, Ansible playbooks are a bit easier to write & understand than Puppet manifests.

Conclusions

After three years of using Puppet at VMware and Virtual Instruments, the thought of not continuing to use the market leader in configuration management tools seemed like a radical idea when it was first suggested to me. After spending several weeks researching Ansible and using it hands-on, I came to the conclusion that Ansible is a perfectly viable alternative to Puppet. I tend to agree with Lyft’s conclusion that if you have a centralized Ops team in change of deployments then they can own a Puppet codebase. On the other hand if you want more wide-spread ownership of your configuration management scripts, a tool with a shallower learning curve like Ansible is a better choice.

Ansible vs Puppet – An Overview of the Solutions

This is part 1/2 in a series. For part #2 see: Ansible vs Puppet – Hands-On with Ansible.

Having recently joined Delphix in a DevOps role, I was tasked with increasing the use of a configuration management tool within the R&D infrastructure. Having used Puppet for the past 3 years at VMware and Virtual Instruments I was inclined to deploy another Puppet-based infrastructure at Delphix, but before doing so I thought I’d take the time to research the new competitors in this landscape.

Opinions From Colleagues

I started by reaching out to three former colleagues and asking for their experiences with alternatives to Puppet, keeping in mind that two of the three are seasoned Chef experts. Rather than giving me a blanket recommendation, they responded with questions of their own:

  • How does the application stack look? Is is a typical web server, app, and db stack or more complex?
  • Are you going to be running the nodes in-house, cloud, or hybrid? What type of hypervisors are involved?
  • What is the future growth? (ie. number of nodes/environments)
  • How much in-depth will the eng team be involved in writing recipes, manifests, or playbooks?

After describing Delphix’s infrastructure and use-cases (in-house VMware; not a SaaS company/product but there are some services living in AWS; less than 100 nodes; I expect the Engineering team to be involved in writing manifests/cookbooks), I received the following recommendations:

Colleague #1: “Each CM tool will have its learning curve but I believe the decision will come down to what will the engineers adapt to the easiest. I would tend to lean towards Ansible for the mere fact thats it’s the easiest to implement. There are no agents to be installed and works well with vSphere/Packer. You will most likely have the engineers better off on it better since they know Python.”

Colleague #2: “I would say that if you don’t have a Puppet champion (someone that knows it well and is pushing to have it running) it is probably not worth for this. Chef would be a good solution but it has a serious learning curve and from your details it is not worth either. I would go with something simple (at least in this phase); I think Ansible would be a good fit for this stage. Maybe even with some Docker here or there ;). There will be no state or complicated cluster configs, but you seem to not need that. You don’t have to install any agent and just need SSH.“

The third colleague was also in agreement that I should take a good look at Ansible before starting with deployment of Puppet.

Researching Ansible on My Own

Having gotten similar recommendations from 3/3 of my former colleagues, I spent some time looking into Ansible. Here are some major differences between Ansible & Puppet that I gathered from my research:

  • Server Nodes
    • Puppet infrastructure generally contains 1 (or more) “puppetmaster” servers, along with a special agent package installed on each client node. (Surprisingly, the Puppetization of the puppetmaster server itself is an issue that does not have a well-defined solution)
    • Ansible has neither a special master server, nor special agent executables to install. The executor can be any machine with a list (inventory) of the nodes to contact, the Ansible playbooks, and the proper SSH keys/credentials in order to connect to the nodes.
  • Push vs Pull
    • Puppet nodes have special client software and periodically check into a puppet master server to “pull” resource definitions.
    • Ansible follows a “push” workflow. The machine where Ansible runs from SSH’s into the client machines and uses SSH to copy files, remotely install packages, etc. The client machine VM requires no special setup outside of a working installation of Python 2.5+.
  • Resources & Ordering
    • Puppet: Resources defined in a Puppet manifest are not applied in order of their appearance (ex: top->bottom) which is confusing for people who come from conventional programming languages (C, Java, etc). Instead resources are applied randomly, unless explicit resource ordering is used. Ex: “before”, ”require”, or chaining arrows.
    • Ansible: The playbooks are applied top-to-bottom, as they appear in the file. This is more intuitive for developers coming from other languages. An example playbook that can be read top-down: https://gist.github.com/phips/aa1b6df697b8124f1338
  • Resource Dependency Graphs
    • Internally, Puppet internally creates a directed graph of all of the resources to be defined in a system along with the order they should be applied in. This is a robust way of representing the resources to be applied and Puppet can even generate a graph file so that one can visualize everything that Puppet manages. On the down side building this graph is susceptible to “duplicate resource definition” errors (ex: multiple definitions of a given package, user, file, etc). Also, conflicting rules from a large collection of 3rdparty modules can lead to circular dependencies.
    • Since Ansible is basically a thin-wrapper for executing commands over SSH, there is no resource dependency graph built internally. One could view this as a weakness as compared with Puppet’s design but it also means that these “duplicate resource” errors are completely avoided. The simpler design lends itself to new users understanding the flow of the playbook more easily.
  • Batteries Included vs DIY
  • Language Extensibility
  • Syntax
  • Template Language
    • Puppet templates are based upon Ruby’s ERB.
    • Ansible templates are based upon Jinja2, which is a superset of Django’s templating language. Most R&D orgs will have more experience with Django.
  • DevOps Tool Support

Complexity & Learning Curve

I wanted to call this out even though it’s already been mentioned several times above. Throughout my conversations with my colleagues and in my research, it seemed that Ansible was a winner in terms of ease of adoption. In terms of client/server setup, resource ordering, “batteries included”, Python vs Ruby, and Jinja vs ERB, I got the impression that I’d have an easier time getting teammates in my R&D org to get a working understanding of Ansible.

Supporting this sentiment was a post I found about the Taxi startup Lyft abandoning Puppet because the dev team had difficulty learning Puppet. From “Moving Away from Puppet”:

“[The Puppet code base] was large, unwieldy and complex, especially for our core application. Our DevOps team was getting accustomed to the Puppet infrastructure; however, Lyft is strongly rooted in the concept of ‘If you build it you run it’. The DevOps team felt that the Puppet infrastructure was too difficult to pick up quickly and would be impossible to introduce to our developers as the tool they’d use to manage their own services.”

Getting Hands-On

Having done my research on Puppet vs Ansible, I was now ready to dive in and implement some sample projects. How did my experience with Ansible turn out? Read on in Part #2 of this series: Ansible vs Puppet – Hands-On with Ansible.

Running Docker Containers on Windows & Mac via Vagrant

If you ever wanted to run a Docker container on a non-Linux platform (ex: Windows or Mac), here’s a “Vagrantfile” which will allow you to do that quickly and easily with Vagrant.

The Vagrantfile

For the purposes of this post, suppose that we want to run the “tutum/wordpress” Docker container on our Mac. That WordPress container comes with everything needed for a fully-functioning WordPress CMS installation, including MySQL and WordPress’s other dependencies.

VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|

  config.vm.box = "puphpet/ubuntu1404-x64"

  config.vm.provision "docker" do |d|
    d.run "tutum/wordpress", args: "-p '80:80'"
  end

  config.vm.network "forwarded_port", guest: 80, host: 8080

end

Explanation

  • This “Vagrantfile” will download the “puphpet/ubuntu1404-x64” Vagrant box which is a widely-used VirtualBox/VMware image of Ubuntu 14.04.
  • Once that Vagrant box is downloaded and the VM has booted, Vagrant will run the “docker” provisioner. The “docker” provisioner will the download and run the “tutum/wordpress” Docker container, with the “docker run” argument to expose port 80 of the container to port 80 of the Ubuntu 14.04 OS.
  • The final line of our “Vagrantfile” tells Vagrant to expose port 80 of the Ubuntu guest OS to port 8080 of our host OS (i.e., Windows or Mac OS). When we access http://localhost:8080 from our host OS, that TCP traffic will be transparently forwarded to port 80 of the guest OS which will then transparently forward the traffic to port 80 of the container. Neat!

Results

After running “vagrant up” the necessary Vagrant box and Docker container downloads will start automatically:

==> default: Waiting for machine to boot. This may take a few minutes
...
==> default: Machine booted and ready!
==> default: Forwarding ports...
 default: -- 80 => 8080
 default: -- 22 => 2222
==> default: Configuring network adapters within the VM...
==> default: Waiting for HGFS kernel module to load...
==> default: Enabling and configuring shared folders...
 default: -- /Users/tehranian/Downloads/boxes/docker-wordpress: /vagrant
==> default: Running provisioner: docker...
 default: Installing Docker (latest) onto machine...
 default: Configuring Docker to autostart containers...
==> default: Starting Docker containers...
==> default: -- Container: tutum/wordpress

Once our “vagrant up” has completed, we can access the WordPress app that is running within the Docker container by pointing our web browser to http://localhost:8080. This takes us to the WordPress setup wizard where we can finish the installation and configuration of WordPress:

Wordpress Setup Wizard

 

Voila! A Docker container running quickly and easily on your Mac or Windows PC!

Testing Puppet Code with Vagrant

At Virtual Instruments we use Vagrant boxes to locally test our Puppet changes before pushing those changes into production. Here are some details about how we do this.

Puppet Support in Vagrant

Vagrant has built-in support for using Puppet as a machine provisioner, either by contacting a Puppet master to receive modules and manifests or by running “puppet apply” with a local set of modules and manifests (aka. masterless Puppet). We chose to use masterless Puppet with Vagrant in our test environment due to its simplicity of setup.

Starting with a Box for the Base OS

Before we can use Puppet to provision our machine, we need to have a base OS available with Puppet installed. At Virtual Instruments our R&D infrastructure is standardized on Ubuntu 12.04 which means that we want our Vagrant base box to be a otherwise minimal installation of Ubuntu 12.04 with Puppet also installed. Luckily this is a very common configuration and there are pre-made Vagrant boxes available for download at VagrantCloud.com. We’re going to use the box named “puppetlabs/ubuntu-12.04-64-puppet“.

If you are using a different OS you can search the Vagrant Cloud site for a Vagrant base box that matches the OS of your choice. See: https://vagrantcloud.com/discover/featured

If you can find a base box for your OS but not a base box for that OS which has Puppet pre-installed, you can use one of @mitchellh‘s nifty Puppet-bootstrap scripts with a Vagrant Shell Provisioner to get Puppet installed into your base box. See the README included in that repo for details: https://github.com/hashicorp/puppet-bootstrap/blob/master/README.md

The Vagrantfile

Having found a suitable base box, one can use the following “Vagrantfile” to start that box and invoke Puppet to provision the machine.

VAGRANTFILE_API_VERSION = "2"

# set the following hostname to a name that Puppet will match against. ex:
# "vi-cron9.lab.vi.local"
MY_HOSTNAME = "vi-nginx-proxy9.lab.vi.local"


Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  # from: https://vagrantcloud.com/search?utf8=✓&sort=&provider=&q=puppetlabs+12.04
  config.vm.box = "puppetlabs/ubuntu-12.04-64-puppet"
  config.vm.hostname = MY_HOSTNAME

  # needed to load hiera data for puppet
  config.vm.synced_folder "hieradata", "/data/puppet-production/hieradata"

  # Vagrant/Puppet docs:
  #   http://docs.vagrantup.com/v2/provisioning/puppet_apply.html
  config.vm.provision :puppet do |puppet|
    puppet.facter = {
      "is_vagrant_vm" => "true"
    }
    puppet.hiera_config_path = "hiera.yaml"
    puppet.manifest_file  = "site.pp"
    puppet.manifests_path = "manifests"
    puppet.module_path = "modules"
    # puppet.options = "--verbose --debug"
  end

end

Breaking Down the Vagrantfile

Setting Our Hostname

# Set the following hostname to a name that Puppet will match against. ex:
# "vi-cron9.lab.vi.local"
MY_HOSTNAME = "vi-nginx-proxy9.lab.vi.local"

Puppet determines which resources to apply based on the hostname of our VM. For ease of use, our “Vagrantfile” has a variable called “MY_HOSTNAME” defined at the top of the file which allows users to easily define which node they want to provision locally.

Defining Which Box to Use

# From: https://vagrantcloud.com/search?utf8=✓&sort=&provider=&q=puppetlabs+12.04
config.vm.box = "puppetlabs/ubuntu-12.04-64-puppet"

The value for “config.vm.box” is the name of the box we found on vagrantcloud.com. This allows Vagrant to automatically download the base VM image from the Vagrant Cloud service.

Puppet-Specific Configurations

  # Needed to load Hiera data for Puppet
  config.vm.synced_folder "hieradata", "/data/puppet-production/hieradata"

  # Vagrant/Puppet docs:
  #   http://docs.vagrantup.com/v2/provisioning/puppet_apply.html
  config.vm.provision :puppet do |puppet|
    puppet.facter = {
      "is_vagrant_vm" => "true"
    }
    puppet.hiera_config_path = "hiera.yaml"
    puppet.manifest_file  = "site.pp"
    puppet.manifests_path = "manifests"
    puppet.module_path = "modules"
    # puppet.options = "--verbose --debug"
  end

Here we are setting up the configuration of the Puppet provisioner. See the full documentation for Vagrant’s masterless Puppet provisioner at: https://docs.vagrantup.com/v2/provisioning/puppet_apply.html

Basically this code:

  • Sets up a shared folder to make our Hiera data available to the guest OS
  • Set a custom Facter fact called “is_vagrant_vm” to “true“. This fact can then be used by our manifests for edge-cases around running VMs locally (like routing collectd/SAR data to a non-production Graphite server to avoid pollution of the production Graphite server)
  • Tells the Puppet provisioner where the root Puppet manifest file is and where necessary Puppet modules can be found.

Conclusion

Vagrant is a powerful tool for testing Puppet code changes locally. With a simple “vagrant up” one can fully provision a VM from scratch. One can also use the “vagrant provision” command to locally test incremental updates to Puppet code as it is iteratively being developed, or to test changes to long-running mutable VMs.