VMware AppCatalyst First Impressions

As previously mentioned in my DockerCon 2015 Wrap-Up, one of the more practical announcements from last week’s DockerCon was VMware’s release of a free variant of Fusion called AppCatalyst. The availability of AppCatalyst, along with a corresponding Vagrant plugin written by Fabio Rapposelli, gives developers a less buggy and more performant alternative to Oracle’s VirtualBox as their VM provider for Vagrant.

Here is the announcement itself, along with William Lam‘s excellent guide to getting started with AppCatalyst and the AppCatalyst Vagrant plugin.

Taking AppCatalyst for a Test Drive

One of the first things I did when I returned to work after DockerCon was to download AppCatalyst and its Vagrant plugin, and take them for a spin. By and large, it works as advertised. Getting the VMware Project Photon VM running in Vagrant per William’s guide was a cinch.

Having gotten that Photon VM working, I immediately turned my attention to getting an arbitrary “vmware_desktop” Vagrant box from HashiCorp’s Atlas working. Atlas is HashiCorp’s commercial service, but they make a large collection of community-contributed Vagrant boxes for various infrastructure platforms freely available. I figured that I should be able to use Vagrant to automatically download one of the “vmware_desktop” boxes from Atlas and then spin it up locally with AppCatalyst using only a single command: “vagrant up”.

In practice I hit an issue, for which Fabio was quick to provide a workaround: https://github.com/vmware/vagrant-vmware-appcatalyst/issues/5

The crux of the issue is that AppCatalyst is geared towards provisioning Linux VMs and not other OS types, e.g. Windows. This is quite understandable, as VMware would not want to cannibalize sales of Fusion to folks who buy it in order to run Windows on their Macs. Unfortunately, this OS-identification logic comes from the “guestos” setting in the box’s .VMX file, and many of the “vmware_desktop” boxes on Atlas do not use a value for that VMX key that AppCatalyst will accept. As Fabio suggested, the workaround is to override that setting from the VMX file with a value that AppCatalyst will accept.
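For illustration, here’s a rough sketch of what that override can look like in a “Vagrantfile”. This assumes the AppCatalyst provider accepts VMX overrides via the same “v.vmx” convention as Vagrant’s other VMware providers, and the box name below is a placeholder; see the GitHub issue above for the authoritative details:

VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  # Hypothetical "vmware_desktop" box from Atlas
  config.vm.box = "someuser/some-ubuntu-vmware-box"

  config.vm.provider "vmware_appcatalyst" do |v|
    # Override the box's "guestos" VMX key with a Linux value that
    # AppCatalyst accepts, e.g. 64-bit Ubuntu:
    v.vmx["guestos"] = "ubuntu-64"
  end
end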

A Tip for Starting the AppCatalyst Daemon Automatically

Another minor issue I hit when trying AppCatalyst for the first time was that I’d forgotten to manually start the AppCatalyst daemon, “/opt/vmware/appcatalyst/bin/appcatalyst-daemon”. D’oh!

Because I found it annoying to launch a separate terminal window to start this daemon every time I wanted to interact with AppCatalyst, I followed through on a co-worker’s suggestion to automate the starting of this process on my Mac via launchd. (Thanks Dan K!)

Here’s how I did it:

$ cat >~/Library/LaunchAgents/com.vmware.appcatalyst.daemon.plist <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
  <dict>
    <key>Label</key>
    <string>com.vmware.appcatalyst.daemon</string>
    <key>Program</key>
    <string>/opt/vmware/appcatalyst/bin/appcatalyst-daemon</string>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
    <key>StandardOutPath</key>
    <string>/tmp/appcatalyst.log</string>
    <key>StandardErrorPath</key>
    <string>/tmp/appcatalyst.log</string>
  </dict>
</plist>
EOF
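If you’d rather not log out and back in, the agent can also be loaded immediately with a standard launchctl command:

$ launchctl load ~/Library/LaunchAgents/com.vmware.appcatalyst.daemon.plist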

After loading the agent (or logging out and back in), the AppCatalyst daemon should be running and its log file will be at “/tmp/appcatalyst.log”. For example:

$ tail -f /tmp/appcatalyst.log
2015/06/30 09:41:03 DEFAULT_VM_PATH=/Users/dtehranian/Documents/AppCatalyst
2015/06/30 09:41:03 DEFAULT_PARENT_VM_PATH=/opt/vmware/appcatalyst/photonvm/photon.vmx
2015/06/30 09:41:03 DEFAULT_LOG_PATH=/Users/dtehranian/Library/Logs/VMware
2015/06/30 09:41:03 PORT=8080
2015/06/30 09:41:03 Swagger path: /opt/vmware/appcatalyst/bin/swagger
2015/06/30 09:41:03 appcatalyst daemon started.

Since AppCatalyst is still in Tech Preview, I’m hoping VMware adds this sort of auto-start functionality for the daemon before the final release of the software.

Conclusion

If you or your development team uses VirtualBox as your VM provider for Vagrant, go try out AppCatalyst. It’s based on the significantly better technical core of Fusion, and if it grows in popularity maybe one day it will become the default provider for Vagrant! 🙂


Testing Ansible Roles with Test Kitchen

Recently, while attending DevOps Days Austin 2015, I participated in a breakout session focused on how to test code for configuration management tools like Puppet, Chef, and Ansible. Having started to use Ansible to manage our infrastructure at Delphix, I was searching for a way to automate the testing of our configuration management code across a variety of platforms, including Ubuntu, CentOS, RHEL, and Delphix’s custom Illumos-based OS, DelphixOS. Testing across all of those platforms is a daunting task, to say the least!

Intro to Test Kitchen

The conversation in that breakout session introduced me to Test Kitchen (GitHub), a tool that I’ve been very impressed by and have had quite a bit of fun writing tests for. Test Kitchen is a tool for automated testing of configuration management code written for tools like Ansible. It automates the process of spinning up test VMs, running your configuration management tool against those VMs, executing verification tests against those VMs, and then tearing down the test VMs.

What makes Test Kitchen so powerful and useful is its modular design: the drivers that provision test VMs, the provisioners that apply your configuration management tool, and the test runners that verify the results are all pluggable components.

Using Test Kitchen

After learning about Test Kitchen at the DevOps Days conference, I did some more research and stumbled across the following presentation which was instrumental in getting started with Test Kitchen and Ansible: Testing Ansible Roles with Test Kitchen, Serverspec and RSpec (SlideShare).

In summary, one needs to add three files to an Ansible role to begin using Test Kitchen (a minimal example follows the list):

  • A “.kitchen.yml” file at the top-level. This file describes:
    • The driver to use for VM provisioning. Ex: Vagrant, AWS, Docker, etc.
    • The provisioner to use. Ex: Puppet, Chef, Ansible.
    • A list of one or more operating systems to test against. Ex: Ubuntu 12.04, Ubuntu 14.04, CentOS 6.5, or even a custom VM image specified by URL.
    • A list of test suites to run.
  • A “test/integration/test-suite-name/test-suite-name.yml” file which contains the Ansible playbook to be applied.
  • One or more test files in “test/integration/test-suite-name/test-driver-name/”. For example, when using the BATS test-runner to run a test suite named “default”: “test/integration/default/bats/my-test.bats”.
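For reference, here’s a minimal sketch of what such a “.kitchen.yml” might look like for an Ansible role. The “ansible_playbook” provisioner comes from the kitchen-ansible gem, and the platform and suite names are illustrative:

---
driver:
  name: vagrant

provisioner:
  name: ansible_playbook
  hosts: all

platforms:
  - name: ubuntu-12.04
  - name: ubuntu-14.04
  - name: centos-6.5

suites:
  - name: default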

Example Code

A full example of Test Kitchen with Ansible is available via the delphix.package-caching-proxy Ansible role in Delphix’s GitHub repo, which contains all of the aforementioned files and directories.

Running Test Kitchen

Using Test Kitchen couldn’t be easier. From the directory that contains your “.kitchen.yml” file, just run “kitchen test” to automatically create your VMs, configure them, and run tests against them:

$ kitchen test
-----> Starting Kitchen (v1.4.1)
-----> Cleaning up any prior instances of <default-ubuntu-1404>
-----> Destroying <default-ubuntu-1404>...
       Finished destroying <default-ubuntu-1404> (0m0.00s).
-----> Testing <default-ubuntu-1404>
-----> Creating <default-ubuntu-1404>...
 Bringing machine 'default' up with 'virtualbox' provider...
 ==> default: Importing base box 'opscode-ubuntu-14.04'...
==> default: Matching MAC address for NAT networking...
 ==> default: Setting the name of the VM: kitchen-ansible-package-caching-proxy-default-ubuntu-1404_default_1435180384440_80322
 ==> default: Clearing any previously set network interfaces...
 ==> default: Preparing network interfaces based on configuration...
 default: Adapter 1: nat
 ==> default: Forwarding ports...
 default: 22 => 2222 (adapter 1)
 ==> default: Booting VM...
 ==> default: Waiting for machine to boot. This may take a few minutes...

...

-----> Running bats test suite
 ✓ Accessing the apt-cacher-ng vhost should load the configuration page for Apt-Cacher-NG
 ✓ Hitting the apt-cacher proxy on the proxy port should succeed
 ✓ The previous command that hit ftp.debian.org should have placed some files in the cache
 ✓ Accessing the devpi server on port 3141 should return a valid JSON response
 ✓ Accessing the devpi server via the nginx vhost should return a valid JSON response
 ✓ Downloading a Python package via our PyPI proxy should succeed
 ✓ We should still be able to install Python packages when the devpi container's backend is broken
 ✓ The vhost for the docker registry should be available
 ✓ The docker registry's /_ping url should return valid JSON
 ✓ The docker registry's /v1/_ping url should return valid JSON
 ✓ The front-end server's root url should return http 204
 ✓ The front-end server's /_status location should return statistics from our web server
 ✓ Accessing http://www.google.com through our proxy should always return a cache miss
 ✓ Downloading a file that is not in the cache should result in a cache miss
 ✓ Downloading a file that is in the cache should result in a cache hit
 ✓ Setting the header 'X-Refresh: true' should result in a bypass of the cache
 ✓ Trying to purge when it's not in the cache should return 404
 ✓ Downloading the file again after purging from the cache should yield a cache miss
 ✓ The yum repo's vhost should return HTTP 200

 19 tests, 0 failures
 Finished verifying  (1m52.26s).
-----> Kitchen is finished. (1m52.49s)

And there you have it, one command to automate your entire VM testing workflow!
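As an aside, while iterating on code it can be handy to run the phases of that workflow individually:

$ kitchen converge    # create the test VMs and run Ansible against them
$ kitchen verify      # run the test suites against the converged VMs
$ kitchen destroy     # tear down the test VMs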

Next Steps

Giving individual developers on our team the ability to quickly run a suite of automated tests is a big win, but that’s only the first step. The workflow we’re planning is to have Jenkins also run these automated Ansible tests every time someone pushes to our git repo. If those tests succeed we can automatically trigger a run of Ansible against our production inventory. If, on the other hand, the Jenkins job which runs the tests is failing (red), we can use that to prevent Ansible from running against our production inventory. This would be a big win for validating infrastructure changes before pushing them to production.


Ansible vs Puppet – An Overview of the Solutions

This is part 1/2 in a series. For part #2 see: Ansible vs Puppet – Hands-On with Ansible.

Having recently joined Delphix in a DevOps role, I was tasked with increasing the use of a configuration management tool within the R&D infrastructure. Having used Puppet for the past three years at VMware and Virtual Instruments, I was inclined to deploy another Puppet-based infrastructure at Delphix, but before doing so I thought I’d take the time to research the newer competitors in this landscape.

Opinions From Colleagues

I started by reaching out to three former colleagues and asking for their experiences with alternatives to Puppet, keeping in mind that two of the three are seasoned Chef experts. Rather than giving me a blanket recommendation, they responded with questions of their own:

  • How does the application stack look? Is it a typical web server, app, and db stack or something more complex?
  • Are you going to be running the nodes in-house, cloud, or hybrid? What type of hypervisors are involved?
  • What is the future growth? (i.e., number of nodes/environments)
  • How deeply will the eng team be involved in writing recipes, manifests, or playbooks?

After describing Delphix’s infrastructure and use-cases (in-house VMware; not a SaaS company/product but there are some services living in AWS; less than 100 nodes; I expect the Engineering team to be involved in writing manifests/cookbooks), I received the following recommendations:

Colleague #1: “Each CM tool will have its learning curve, but I believe the decision will come down to what the engineers will adapt to most easily. I would tend to lean towards Ansible for the mere fact that it’s the easiest to implement. There are no agents to be installed and it works well with vSphere/Packer. Your engineers will most likely be better off on it since they know Python.”

Colleague #2: “I would say that if you don’t have a Puppet champion (someone that knows it well and is pushing to have it running) it is probably not worth it for this. Chef would be a good solution, but it has a serious learning curve and from your details it is not worth it either. I would go with something simple (at least in this phase); I think Ansible would be a good fit for this stage. Maybe even with some Docker here or there ;). There will be no state or complicated cluster configs, but you seem to not need that. You don’t have to install any agent and just need SSH.”

The third colleague was also in agreement that I should take a good look at Ansible before starting with deployment of Puppet.

Researching Ansible on My Own

Having gotten similar recommendations from 3/3 of my former colleagues, I spent some time looking into Ansible. Here are some major differences between Ansible & Puppet that I gathered from my research:

  • Server Nodes
    • Puppet infrastructure generally contains 1 (or more) “puppetmaster” servers, along with a special agent package installed on each client node. (Surprisingly, the Puppetization of the puppetmaster server itself is an issue that does not have a well-defined solution)
    • Ansible has neither a special master server, nor special agent executables to install. The executor can be any machine with a list (inventory) of the nodes to contact, the Ansible playbooks, and the proper SSH keys/credentials in order to connect to the nodes.
  • Push vs Pull
    • Puppet nodes have special client software and periodically check into a puppet master server to “pull” resource definitions.
    • Ansible follows a “push” workflow. The machine where Ansible runs SSHes into the client machines and uses SSH to copy files, remotely install packages, etc. The client machines require no special setup beyond a working installation of Python 2.5+.
  • Resources & Ordering
    • Puppet: Resources defined in a Puppet manifest are not applied in the order of their appearance (ex: top->bottom), which is confusing for people who come from conventional programming languages (C, Java, etc). Instead, resources are applied in an arbitrary order unless explicit resource ordering is used, ex: “before”, “require”, or chaining arrows. (A short illustration of this contrast follows the list.)
    • Ansible: Playbooks are applied top-to-bottom, as they appear in the file. This is more intuitive for developers coming from other languages. An example playbook that can be read top-down: https://gist.github.com/phips/aa1b6df697b8124f1338
  • Resource Dependency Graphs
    • Internally, Puppet creates a directed graph of all of the resources defined in a system, along with the order they should be applied in. This is a robust way of representing the resources to be applied, and Puppet can even generate a graph file so that one can visualize everything that Puppet manages. On the downside, building this graph is susceptible to “duplicate resource definition” errors (ex: multiple definitions of a given package, user, file, etc). Also, conflicting rules from a large collection of 3rd-party modules can lead to circular dependencies.
    • Since Ansible is basically a thin wrapper for executing commands over SSH, no resource dependency graph is built internally. One could view this as a weakness compared with Puppet’s design, but it also means that “duplicate resource” errors are completely avoided. The simpler design lends itself to new users understanding the flow of a playbook more easily.
  • Batteries Included vs DIY
  • Language Extensibility
  • Syntax
  • Template Language
    • Puppet templates are based upon Ruby’s ERB.
    • Ansible templates are based upon Jinja2, which is modeled on Django’s templating language. Most R&D orgs will have more experience with Django.
  • DevOps Tool Support
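To make the “Resources & Ordering” contrast concrete, here is a minimal illustration; the resources and tasks are my own examples:

# Puppet: without the explicit "require", the relative ordering of
# these two resources is not guaranteed.
package { 'nginx':
  ensure => installed,
}
service { 'nginx':
  ensure  => running,
  require => Package['nginx'],
}

# Ansible: tasks run top-to-bottom, exactly as written.
- name: Install nginx
  apt: name=nginx state=present

- name: Start nginx
  service: name=nginx state=started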

Complexity & Learning Curve

I wanted to call this out even though it’s already been mentioned several times above. Throughout my conversations with my colleagues and in my research, it seemed that Ansible was the winner in terms of ease of adoption. In terms of client/server setup, resource ordering, “batteries included”, Python vs Ruby, and Jinja vs ERB, I got the impression that I’d have an easier time getting teammates in my R&D org to a working understanding of Ansible.

Supporting this sentiment was a post I found about the ride-sharing startup Lyft abandoning Puppet because its dev team had difficulty learning Puppet. From “Moving Away from Puppet”:

“[The Puppet code base] was large, unwieldy and complex, especially for our core application. Our DevOps team was getting accustomed to the Puppet infrastructure; however, Lyft is strongly rooted in the concept of ‘If you build it you run it’. The DevOps team felt that the Puppet infrastructure was too difficult to pick up quickly and would be impossible to introduce to our developers as the tool they’d use to manage their own services.”

Getting Hands-On

Having done my research on Puppet vs Ansible, I was now ready to dive in and implement some sample projects. How did my experience with Ansible turn out? Read on in Part #2 of this series: Ansible vs Puppet – Hands-On with Ansible.

Running Docker Containers on Windows & Mac via Vagrant

If you ever wanted to run a Docker container on a non-Linux platform (ex: Windows or Mac), here’s a “Vagrantfile” which will allow you to do that quickly and easily with Vagrant.

The Vagrantfile

For the purposes of this post, suppose that we want to run the “tutum/wordpress” Docker container on our Mac. That WordPress container comes with everything needed for a fully-functioning WordPress CMS installation, including MySQL and WordPress’s other dependencies.

VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|

  config.vm.box = "puphpet/ubuntu1404-x64"

  config.vm.provision "docker" do |d|
    d.run "tutum/wordpress", args: "-p '80:80'"
  end

  config.vm.network "forwarded_port", guest: 80, host: 8080

end

Explanation

  • This “Vagrantfile” will download the “puphpet/ubuntu1404-x64” Vagrant box, a widely used VirtualBox/VMware image of Ubuntu 14.04.
  • Once that Vagrant box is downloaded and the VM has booted, Vagrant will run the “docker” provisioner. The “docker” provisioner will then download and run the “tutum/wordpress” Docker container, passing a “docker run” argument to expose port 80 of the container on port 80 of the Ubuntu 14.04 guest OS.
  • The final line of our “Vagrantfile” tells Vagrant to expose port 80 of the Ubuntu guest OS to port 8080 of our host OS (i.e., Windows or Mac OS). When we access http://localhost:8080 from our host OS, that TCP traffic will be transparently forwarded to port 80 of the guest OS which will then transparently forward the traffic to port 80 of the container. Neat!

Results

After running “vagrant up” the necessary Vagrant box and Docker container downloads will start automatically:

==> default: Waiting for machine to boot. This may take a few minutes
...
==> default: Machine booted and ready!
==> default: Forwarding ports...
 default: -- 80 => 8080
 default: -- 22 => 2222
==> default: Configuring network adapters within the VM...
==> default: Waiting for HGFS kernel module to load...
==> default: Enabling and configuring shared folders...
 default: -- /Users/tehranian/Downloads/boxes/docker-wordpress: /vagrant
==> default: Running provisioner: docker...
 default: Installing Docker (latest) onto machine...
 default: Configuring Docker to autostart containers...
==> default: Starting Docker containers...
==> default: -- Container: tutum/wordpress
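To double-check that the container is actually running inside the guest OS, we can use “vagrant ssh” to invoke “docker ps” within the VM:

$ vagrant ssh -c "docker ps"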

Once our “vagrant up” has completed, we can access the WordPress app that is running within the Docker container by pointing our web browser to http://localhost:8080. This takes us to the WordPress setup wizard where we can finish the installation and configuration of WordPress:

[Screenshot: the WordPress setup wizard]

Voila! A Docker container running quickly and easily on your Mac or Windows PC!

Testing Puppet Code with Vagrant

At Virtual Instruments we use Vagrant boxes to locally test our Puppet changes before pushing those changes into production. Here are some details about how we do this.

Puppet Support in Vagrant

Vagrant has built-in support for using Puppet as a machine provisioner, either by contacting a Puppet master to receive modules and manifests or by running “puppet apply” with a local set of modules and manifests (aka. masterless Puppet). We chose to use masterless Puppet with Vagrant in our test environment due to its simplicity of setup.

Starting with a Box for the Base OS

Before we can use Puppet to provision our machine, we need a base OS image with Puppet installed. At Virtual Instruments our R&D infrastructure is standardized on Ubuntu 12.04, which means that we want our Vagrant base box to be an otherwise-minimal installation of Ubuntu 12.04 with Puppet pre-installed. Luckily this is a very common configuration, and there are pre-made Vagrant boxes available for download at VagrantCloud.com. We’re going to use the box named “puppetlabs/ubuntu-12.04-64-puppet”.

If you are using a different OS you can search the Vagrant Cloud site for a Vagrant base box that matches the OS of your choice. See: https://vagrantcloud.com/discover/featured

If you can find a base box for your OS but not a base box for that OS which has Puppet pre-installed, you can use one of @mitchellh‘s nifty Puppet-bootstrap scripts with a Vagrant Shell Provisioner to get Puppet installed into your base box. See the README included in that repo for details: https://github.com/hashicorp/puppet-bootstrap/blob/master/README.md
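As a rough sketch, wiring one of those bootstrap scripts into a “Vagrantfile” looks something like the following. The base box name is a placeholder, and the script path should be checked against the repo for your guest OS:

VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  # Hypothetical base box without Puppet pre-installed
  config.vm.box = "my-org/ubuntu-12.04-base"

  # Vagrant's shell provisioner accepts a remote script URL
  config.vm.provision "shell",
    path: "https://raw.githubusercontent.com/hashicorp/puppet-bootstrap/master/ubuntu.sh"
end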

The Vagrantfile

Having found a suitable base box, one can use the following “Vagrantfile” to start that box and invoke Puppet to provision the machine.

VAGRANTFILE_API_VERSION = "2"

# set the following hostname to a name that Puppet will match against. ex:
# "vi-cron9.lab.vi.local"
MY_HOSTNAME = "vi-nginx-proxy9.lab.vi.local"


Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  # from: https://vagrantcloud.com/search?utf8=✓&sort=&provider=&q=puppetlabs+12.04
  config.vm.box = "puppetlabs/ubuntu-12.04-64-puppet"
  config.vm.hostname = MY_HOSTNAME

  # needed to load hiera data for puppet
  config.vm.synced_folder "hieradata", "/data/puppet-production/hieradata"

  # Vagrant/Puppet docs:
  #   http://docs.vagrantup.com/v2/provisioning/puppet_apply.html
  config.vm.provision :puppet do |puppet|
    puppet.facter = {
      "is_vagrant_vm" => "true"
    }
    puppet.hiera_config_path = "hiera.yaml"
    puppet.manifest_file  = "site.pp"
    puppet.manifests_path = "manifests"
    puppet.module_path = "modules"
    # puppet.options = "--verbose --debug"
  end

end

Breaking Down the Vagrantfile

Setting Our Hostname

# Set the following hostname to a name that Puppet will match against. ex:
# "vi-cron9.lab.vi.local"
MY_HOSTNAME = "vi-nginx-proxy9.lab.vi.local"

Puppet determines which resources to apply based on the hostname of our VM. For ease of use, our “Vagrantfile” has a variable called “MY_HOSTNAME” defined at the top of the file which allows users to easily define which node they want to provision locally.

Defining Which Box to Use

# From: https://vagrantcloud.com/search?utf8=✓&sort=&provider=&q=puppetlabs+12.04
config.vm.box = "puppetlabs/ubuntu-12.04-64-puppet"

The value for “config.vm.box” is the name of the box we found on vagrantcloud.com. This allows Vagrant to automatically download the base VM image from the Vagrant Cloud service.

Puppet-Specific Configurations

  # Needed to load Hiera data for Puppet
  config.vm.synced_folder "hieradata", "/data/puppet-production/hieradata"

  # Vagrant/Puppet docs:
  #   http://docs.vagrantup.com/v2/provisioning/puppet_apply.html
  config.vm.provision :puppet do |puppet|
    puppet.facter = {
      "is_vagrant_vm" => "true"
    }
    puppet.hiera_config_path = "hiera.yaml"
    puppet.manifest_file  = "site.pp"
    puppet.manifests_path = "manifests"
    puppet.module_path = "modules"
    # puppet.options = "--verbose --debug"
  end

Here we are setting up the configuration of the Puppet provisioner. See the full documentation for Vagrant’s masterless Puppet provisioner at: https://docs.vagrantup.com/v2/provisioning/puppet_apply.html

Basically this code:

  • Sets up a shared folder to make our Hiera data available to the guest OS
  • Sets a custom Facter fact called “is_vagrant_vm” to “true”. This fact can then be used by our manifests for edge-cases around running VMs locally, like routing collectd/SAR data to a non-production Graphite server to avoid polluting the production Graphite server (a hypothetical example follows this list)
  • Tells the Puppet provisioner where the root Puppet manifest file is and where the necessary Puppet modules can be found
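As an illustration of how such a fact might be consumed, here is a hypothetical manifest snippet; the “collectd::graphite” class and the hostnames are made up for this example:

# Route metrics away from production when running in a local Vagrant VM.
# Facts set via the Vagrant provisioner arrive as strings, hence the
# comparison against 'true'.
$graphite_host = $::is_vagrant_vm ? {
  'true'  => 'graphite-dev.lab.vi.local',  # hypothetical dev server
  default => 'graphite.lab.vi.local',      # hypothetical production server
}

class { 'collectd::graphite':
  host => $graphite_host,
}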

Conclusion

Vagrant is a powerful tool for testing Puppet code changes locally. With a simple “vagrant up” one can fully provision a VM from scratch. One can also use the “vagrant provision” command to locally test incremental updates to Puppet code as it is iteratively being developed, or to test changes to long-running mutable VMs.

Building Vagrant Boxes with Nested VMs using Packer

In “Improving Developer Productivity with Vagrant” I discussed the productivity benefits gained from using Vagrant in our software development tool chain. Here are some more details about the mechanics of how we created those Vagrant boxes as part of every build of our product.

Using Packer to Build VMware-Compatible Vagrant Boxes

Packer is a tool for creating machine images which was also written by HashiCorp, the authors of Vagrant. It can build machine images for almost any type of environment, including Amazon AWS, Docker, Google Compute Engine, KVM, Vagrant, VMware, Xen, and more.

We used Packer’s built-in VMware builder and Vagrant post-processor to create the Vagrant boxes for users to run on their local desktops/laptops via VMware Fusion or Workstation.
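For reference, here’s a stripped-down sketch of what such a Packer template looks like. The ISO URL, checksum, credentials, and output name are placeholders, and a real template would also need the OS installer’s boot/preseed configuration:

{
  "builders": [
    {
      "type": "vmware-iso",
      "iso_url": "http://devnull.vi.local/isos/sles-11-sp3-x86_64.iso",
      "iso_checksum_type": "md5",
      "iso_checksum": "0123456789abcdef0123456789abcdef",
      "ssh_username": "vagrant",
      "ssh_password": "vagrant",
      "shutdown_command": "sudo /sbin/halt"
    }
  ],
  "post-processors": [
    {
      "type": "vagrant",
      "output": "portal_appliance.vmware.{{.Provider}}.box"
    }
  ]
}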

Note: This required each user to install Vagrant’s for-purchase VMware plugin. In our usage of running Vagrant boxes locally we noted that the VMware virtualization providers delivered far better IO performance and stability than the free Oracle VirtualBox provider. In short, the for-purchase Vagrant-VMware plugin was worth every penny!

Running VMware Workstation VMs Nested in ESXi

One of the hurdles I came across in integrating the building of the Vagrant boxes into our existing build system is that Packer’s VMware builder needs to spin up a VM using Workstation or Fusion in order to perform configuration of the Vagrant box. Given that our builds were already running in static VMs, this meant that we needed to be able to run Workstation VMs nested within an ESXi VM with a Linux guest OS!

This sort of VM-nesting was somewhat complicated to set up in the days of vSphere 5.0, but with vSphere 5.1+ it has become a lot simpler. One just needs to make sure that the ESXi VMs are running with “Virtual Hardware Version 9” or newer, and that “Hardware assisted virtualization” is enabled for the VM within the vSphere web client.

Here’s what the correct configuration for supporting nested VMs looks like:

[Screenshot: the VM’s CPU settings in the vSphere web client, with “Hardware assisted virtualization” enabled]

Packer’s Built-in Remote vSphere Hypervisor Builder

One question that an informed user of Packer may correctly ask is: “Why not use Packer’s built-in Remote vSphere Hypervisor Builder and create the VM directly on ESXi? Wouldn’t this remove the need for running nested VMs?”

I agree that this would be a better solution in theory. There are several reasons why I chose to go with nested VMs instead:

  1. The “Remote vSphere Hypervisor Builder” requires manually running an “esxcli” command on your ESXi boxes to enable some sort of “GuestIP hack”. Doing this type of configuration on our production ESXi cluster seemed sketchy to me.
  2. The “Remote vSphere Hypervisor Builder” doesn’t work through vSphere, but instead SSHes directly into your ESXi boxes as a privileged user in order to create the VM. The login credentials for that privileged ESXi/SSH user must be kept in the Packer build script or some other area of the build system. Again, this seemed less than ideal to me.
  3. As far as I can tell from the docs, the “Remote vSphere Hypervisor Builder” only works with the “vmware-iso” builder and not the “vmware-vmx” builder. This would’ve painted us into a corner as we had plans to switch from the “vmware-iso” builder to the “vmware-vmx” builder once it had become available.
  4. The “Remote vSphere Hypervisor Builder” was not available when I implemented our nested VM solution because we were early adopters of Packer. It was easier to stick with a working solution that we already had 😛

Automating the Install of VMware Workstation via Puppet

One other mechanical piece I’ll share is how we automated the installation of VMware Workstation 10.0 into our static build VMs. Since all of the build VM configuration is done via Puppet, we could automate the installation of Workstation 10 with the following bit of Puppet code:

# Install VMware Workstation 10
$vmware_installer = '/mnt/devops/software/vmware/VMware-Workstation-Full-10.0.0-1295980.x86_64.bundle'
$vmware_installer_options = '--eulas-agreed --required'

exec { 'Install VMware Workstation 10':
  command => "${vmware_installer} ${vmware_installer_options}",
  # The "creates" parameter makes this exec idempotent: the installer only
  # runs if the file it would create does not already exist.
  creates => '/usr/lib/vmware/config',
  user    => 'root',
  require => [Mount['/mnt/devops'], Package['kernel-default-devel']],
}

Improving Developer Productivity with Vagrant

As part of improving developer productivity at Virtual Instruments during development of VirtualWisdom 4.0, I introduced Vagrant to the development team. At the time, the product was being re-architected from a monolithic Java app into a service-oriented architecture (SOA). Without Vagrant, the challenge for a given Java developer working on any one of the Java services was that there was no integration environment available in which to test the service they were working on. In other words, a developer could run their respective Java service locally, but without the other co-requisite services and databases they couldn’t do anything useful with it.

How Not To Solve

We could have documented a long set of instructions in a wiki, detailing how to set up and run each one of the Java services locally, along with instructions on how to set up and run each of the databases manually, but there would be several problems with this approach:

  1. Following such instructions would be a very manual, time-consuming, and mistake-prone process. The total time on such efforts would be multiplied by the size of the R&D team as each developer would have to duplicate this effort on their own.
  2. Such instructions would be a “living document”, continually changing over time. This means that if Jack followed the instructions on Day X, the instructions that Jane followed on Day X+Y could potentially differ, leading to two very different integration environments.
  3. All of our developers were running Mac OS or Windows laptops, but the product environment was SuSE Linux Enterprise Server 11 (SLES 11). Regardless of how complete our instructions on how to setup the environment could be, there would still be the issue of consistency of environment. If developers were to test their Java services in hand-crafted environments that were not identical to the actual environment that QA tested in or that the customer ran the product in, then we would be sure to hit issues where functionality would work in one developer’s environment, but not in QA or in the customer’s environment! (i.e., “It worked on my box!”)

A Better Approach


Turning our integration environment into a portable Vagrant box (a virtual machine) solved all of these issues. The Vagrant box was an easily distributable artifact generated by our build process that contained fully configured instances of all of the Java services and databases that comprised our product. Developers could download the Vagrant box and get it running in minutes. The process for running the Vagrant box was so simple that even managers and directors could download a “Vagrantfile” and “vagrant up” to get a recent build running locally on their laptops. Finally, the Vagrant box generated by our build process utilized the identical SLES 11 environment that QA and customers would be running with, so developers would not be running into issues related to differences in environment.

I will write a follow-up post about how we use Packer in our build process to create the Vagrant box, but for now I’ll provide some details about our Vagrant box workflow.

The “Vagrantfile”

Here’s a partial sample of our “Vagrantfile” where I’d like to call a few things out:

VAGRANTFILE_API_VERSION = "2"  # Do not modify

VM_NUM_CPUS = "4"
VM_RAM_MB = "4096"
VM_SHOW_CONSOLE = false


Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|

  # Box name & URL to download from
  config.vm.box = "vw-21609-431"
  config.vm.box_url = "http://devnull.vi.local/builds/aruba-images-master/aruba-images-21609/vagrant/vmware/portal_appliance.vmware.21609-431.box"

...

  # Create a private network interface, /dev/eth1. This allows host-only access
  # to the machine using a specific IP. The host OS is available to the guest
  # at the "192.168.33.1" IP address.
  config.vm.network :private_network, ip: "192.168.33.10"

  config.vm.provider "vmware_fusion" do |v|
    v.gui = VM_SHOW_CONSOLE
    v.vmx["memsize"]  = VM_RAM_MB
    v.vmx["numvcpus"] = VM_NUM_CPUS
  end

...

end

Keep in mind that the “Vagrantfile” is executable Ruby code, so there are virtually limitless possibilities for what one can accomplish depending on your needs and desired workflow.

Private Networking and Our “services.conf”

The workflow used by the developers of our Java services is to run the service that they are modifying via the IDE in their host OS (ex: Eclipse or IntelliJ), and to have all other services and databases running within the Vagrant box (the guest OS). In order to facilitate communication between the host OS and the guest OS, we direct the “Vagrantfile” to create a private network with static IP addresses for the host and guest. Here our host OS will have the IP “192.168.33.1” while the guest will be available at “192.168.33.10”:

  # Create a private network interface, /dev/eth1. This allows host-only access
  # to the machine using a specific IP. The host OS is available to the guest
  # at the "192.168.33.1" IP address.
  config.vm.network :private_network, ip: "192.168.33.10"

With private-network connectivity in place, we modified our Java services to read the location of their peer services from a hierarchy of configuration files. For example, when a Java service initializes, it reads the following hierarchy of configuration files to determine how to connect to the other services:

  • /etc/vi/services.conf (the default settings)
  • /vagrant/services.conf
  • ~/services.conf (highest precedence)

Sample contents for these “services.conf” files:

# /vagrant/services.conf

com.vi.ServiceA=192.168.33.1
com.vi.ServiceB=localhost
com.vi.ServiceC=192.168.33.10

The “services.conf” hierarchy allows a developer to direct the service running in their IDE/host OS to connect to the Java services running within the Vagrant box/guest OS (via “~/services.conf”), as needed. It also allows the developer to configure the services within the Vagrant box/guest OS to connect to Java services running on the host OS, via the “/vagrant/services.conf” file.

One clarification: the “/vagrant/services.conf” file actually lives on the host OS, in the working directory of the “Vagrantfile” that the developer downloads. It appears as “/vagrant/services.conf” via the default shared folder provided by Vagrant. Having the “/vagrant/services.conf” file live on the host OS is especially convenient, as it allows for easy editing, and more importantly it provides persistence of the developer’s configuration when tearing down and re-initializing newer versions of our Vagrant box.

Easy Downloading with “box_url”

As part of our workflow I found it to be easiest to have users not download the Vagrant .box file directly, but instead to download the small (~3KB) “Vagrantfile” which in turn contains the URL for the .box file. When the user runs “vagrant up” from the cwd of this “Vagrantfile”, Vagrant will automatically detect that the Vagrant box of the respective name is not in the local library and start to download the Vagrant box from the URL listed in the “Vagrantfile”.

  # Box name & URL to download from
  config.vm.box = "vw-21609-431"
  config.vm.box_url = "http://devnull.vi.local/builds/aruba-images-master/aruba-images-21609/vagrant/vmware/portal_appliance.vmware.21609-431.box"

More details are available in the Vagrant docs: http://docs.vagrantup.com/v2/vagrantfile/machine_settings.html

Note: Earlier this year the authors of Vagrant released a SaaS service for box distribution called Vagrant Cloud. You may want to look into using this, along with the newer functionality of Vagrant box versioning. We are not using the Vagrant Cloud SaaS service yet, as our solution pre-dates the availability of this service and there hasn’t been sufficient motivation to change our workflow.

VM Hardware Customization

In our “Vagrantfile” I wanted to make it dead-simple for people to be able to modify the hardware resources. At VI some developers had very new laptops with lots of RAM while others had older laptops. Putting the following Ruby variables at the top of the “Vagrantfile” made it easy for someone that knows absolutely nothing about Ruby to edit the hardware configuration of their Vagrant box:

VM_NUM_CPUS = "4"
VM_RAM_MB = "4096"
VM_SHOW_CONSOLE = false

...

  config.vm.provider "vmware_fusion" do |v|
    v.gui = VM_SHOW_CONSOLE
    v.vmx["memsize"]  = VM_RAM_MB
    v.vmx["numvcpus"] = VM_NUM_CPUS
  end

Conclusion

In developing an SOA application, having a Vagrant box in which developers can integrate the services that are under development has been an enormous boon for developer productivity. Downloading and running a Vagrant box is orders of magnitude faster than configuring and starting services by hand. The Vagrant box also solves the problem of “consistency of environment”, allowing developers to run their code in an environment that closely matches the QA/customer environment. In the post-mortem analysis of our VirtualWisdom 4.0 release, having Vagrant boxes for developer integration of our Java services was identified as one of the big “wins” of the release. As the Director of Engineering said, “Without the developer productivity gains from Vagrant, we would not have been able to ship VirtualWisdom 4.0 when we did.”