Building Docker Images within Docker Containers via Jenkins

If you’re like me and you’ve Dockerized your build process by running your Jenkins builds from within dynamically provisioned Docker containers, where do you turn next? You may want the creation of your Docker images themselves to also happen within Docker containers. In other words, running Docker nested within Docker (DinD).


I’ve recently published a Docker image to facilitate building other Docker images from within Jenkins/Docker slave containers. Details at:

Why would one want to build Docker images nested within Docker containers?

  1. For consistency. If you’re building your JARs, RPMs, etc., from within Docker containers, it makes sense to use the same high-level process for building other artifacts such as Docker images.
  2. For Docker version freedom. As I mentioned in a previous post, the Jenkins/Docker plugin can be finicky with regards to compatibility with the version of Docker that you are running on your base OS. In other words, Jenkins/Docker plugin 0.7 will not work with Docker 1.2+, so if you really need a feature from a newer version of Docker when building your images, you either have to wait for a fix from the Jenkins plugin author, or you can run Docker-nested-in-Docker with the Jenkins plugin-compatible Docker 1.1.x on the host and a newer version of Docker nested within the container. Yes, this actually works!
  3. This:

Docker + Jenkins: Dynamically Provisioning SLES 11 Build Containers

TL;DR

Using the Jenkins Docker Plugin, we can dynamically spin up SLES 11 build slaves on demand to run our builds. One of the hurdles to getting there was creating a SLES 11 Docker base image, since there are no SLES 11 container images available at the Docker Hub Registry. We used SUSE’s Kiwi imaging tool to create a base SLES 11 Docker image for ourselves, and then layered our build environment and Jenkins build slave support on top of it. After configuring Jenkins’ Docker plugin to use our home-grown SLES image, we were off and running with our containerized SLES builds!

Jenkins/Docker Plugin

The path to Docker-izing our build slaves started with stumbling across this Docker Plugin for Jenkins: https://wiki.jenkins-ci.org/display/JENKINS/Docker+Plugin. This plugin allows one to use Docker to dynamically provision a build slave, run a single build, and then tear down that slave, optionally saving it. This is very similar in workflow to the build VM provisioning system that I created while working in VMware’s Release Engineering team, but much lighter weight. Compared to VMs, Docker containers can be spun up in milliseconds instead of in a few minutes, and Docker containers are much lighter on hardware resources.

The above link to the Jenkins wiki provides details about how to configure your environment as well as how to configure your container images. Some high-level notes:

  • Your base OS needs to have Docker listening on a TCP port. By default, Docker only listens on a Unix socket. (A minimal sketch of this configuration follows this list.)
  • The container needs to run “sshd” for Jenkins to connect to it. I suspect that once the container is provisioned, Jenkins just treats it as a plain-old SSH slave.
  • In my testing, the Docker/Jenkins plugin was not able to connect via SSH to the containers it provisioned when using Docker 1.2.0. After trial and error, I found that the current version of the Jenkins plugin (0.6) works well with Docker 1.0-1.1.2, but not with Docker 1.2.0+. I used Puppet to make sure that our Ubuntu build server base VMs only had Docker 1.1.2 installed. Ex:
      # VW-10576: install docker on the ubuntu master/slaves
      # * Have Docker listen on a TCP port per instructions at:
      #   https://wiki.jenkins-ci.org/display/JENKINS/Docker+Plugin
      # * Use Docker 1.1.2 and not anything newer. At the time of writing this
      #   comment, Docker 1.2.0+ does not work with the Jenkins/Docker
      #   plugin (the port for sshd fails to map to an external port).
      class { 'docker':
        tcp_bind => 'tcp://0.0.0.0:4243',
        version  => '1.1.2',
      }
  • There is a sample Docker/Jenkins slave based on “ubuntu:latest” available at: https://registry.hub.docker.com/u/evarga/jenkins-slave/. I would recommend getting that working as a proof-of-concept before venturing into building your own custom build slave containers. It’s helpful to be familiar with the “Dockerfile” for that image as well: https://registry.hub.docker.com/u/evarga/jenkins-slave/dockerfile/
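If you’re not using the Puppet module shown above, a minimal manual sketch of the same Docker daemon configuration on Ubuntu looks roughly like this (assuming your Docker package reads /etc/default/docker; the service name may be “docker” or “docker.io” depending on the package):

# /etc/default/docker -- listen on TCP port 4243 in addition to the Unix socket
DOCKER_OPTS="-H tcp://0.0.0.0:4243 -H unix:///var/run/docker.sock"

# restart the daemon and verify the remote API is reachable over TCP
sudo service docker restart
curl http://localhost:4243/version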

Once you have the Docker Plugin installed, you need to go to your Jenkins “System Configuration” page and add your Docker host as a new cloud provider. In my proof-of-concept case, this is an Ubuntu 12.04 VM running Docker 1.1.2, listening on port 4243, configured to use the “evarga/jenkins-slave” image, and providing the “docker-slave” label, to which I can then restrict my Jenkins build jobs. The Jenkins configuration looks like this:

Jenkins' "System Configuration" for a Docker host

Jenkins’ “System Configuration” for a Docker host

I then configured a job named “docker-test” to use that “docker-slave” label and run a shell script with basic commands like “ps -eafwww”, “cat /etc/issue”, and “java -version”. Running that job, I see that it successfully spins up a container of “evarga/jenkins-slave” and runs my little script. Note the hostname at the top of the log, and output of “ps” in the screenshot below:


A proof-of-concept of spinning up a Docker container on demand
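For reference, the build step of that “docker-test” job was nothing more than a small shell script along these lines:

# build step of the "docker-test" job: a few sanity checks inside the container
ps -eafwww
cat /etc/issue
java -version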

 

Creating Our SLES 11 Base Image

Having built up the confidence that we can spin up other people’s containers on-demand, we now turned to creating our SLES 11 Docker build image. For reasons that I can only assume are licensing issues, SLES 11 does not have a base image up on the Docker Hub Registry in the same vein as the images that Ubuntu, Fedora, CentOS, and others have available.

Luckily I stumbled upon the following blog post: http://flavio.castelli.name/2014/05/06/building-docker-containers-with-kiwi/

At Virtual Instruments we were already using Kiwi to build the OVAs of our build VMs, so following that blog post and getting Kiwi to generate a tarball that could be consumed by “docker import” wasn’t much extra work. This worked well for the next proof-of-concept phase, but ultimately we decided to go down another path.

Rather than have Kiwi generate fully configured build images for us, we decided it’d be best to follow the conventions of the “Docker Way” and have Kiwi generate a SLES 11 base image which we could then use with a “FROM” statement in a “Dockerfile”, installing the build environment via the Dockerfile. One of the advantages of this is that we only have to use Kiwi to generate the base image the first time. After that we can stay in Docker-land to build the subsequent images. Additionally, having a shared base image among all of our build image tags should allow for space savings, as Docker optimizes the layering of filesystems over a common base image.
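As a rough sketch of that flow (the tarball and image names here are hypothetical), the one-time Kiwi import and the subsequent Dockerfile-based builds look something like this:

# one-time: import the Kiwi-generated root filesystem tarball as our SLES 11 base image
cat sles11-base.tar | docker import - vi-docker.lab.vi.local/sles11-base

# from then on we stay in Docker-land: build-environment images are built from a
# Dockerfile that starts with "FROM vi-docker.lab.vi.local/sles11-base"
docker build -t vi-docker.lab.vi.local/pa-dev-env-master .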

Configuring the Image for Use with Jenkins

Taking a SLES 11 image with our build environment installed and getting it to work with the Jenkins Docker plugin took a little bit of work, mainly spent trying to configure “sshd” correctly. Below is the “Dockerfile” that builds upon a SLES image with our build environment installed and prepares it for use with Jenkins:

# This Dockerfile is used to build an image containing basic
# configuration to be used as a Jenkins slave build node.

FROM vi-docker.lab.vi.local/pa-dev-env-master
MAINTAINER Dan Tehranian <REDACTED@virtualinstruments.com>


# Add user & group "jenkins" to the image and set its password
RUN groupadd jenkins
RUN useradd -m -g jenkins -s /bin/bash jenkins
RUN echo "jenkins:jenkins" | chpasswd


# Having "sshd" running in the container is a requirement of the Jenkins/Docker
# plugin. See: https://wiki.jenkins-ci.org/display/JENKINS/Docker+Plugin

# Create the ssh host keys needed for sshd
RUN ssh-keygen -A

# Fix sshd's configuration for use within the container. See VW-10576 for details.
RUN sed -i -e 's/^UsePAM .*/UsePAM no/' /etc/ssh/sshd_config
RUN sed -i -e 's/^PasswordAuthentication .*/PasswordAuthentication yes/' /etc/ssh/sshd_config

# Expose the standard SSH port
EXPOSE 22

# Start the ssh daemon
CMD ["/usr/sbin/sshd -D"]

Running a Maven Build Inside of a SLES 11 Docker Container

Having created this new image and pushed it to our internal docker repo, we can now go back to Jenkins’ “System Configuration” page and add a new image to our Docker cloud provider. Creating a new Jenkins “Maven Job” which utilizes this new SLES 11 image and running a build, we can see our SLES 11 container getting spun up, code getting checked out from our internal git repo, and Maven being invoked:


Hooray! A successful Maven build inside of a Docker container!


Output from the Maven Build that was run in the container. LGTM!

 

Wins

There are a whole slew of benefits to a system like this:

  • We don’t have to run & support SLES 11 VMs in our infrastructure alongside the easier-to-manage Ubuntu VMs. We can just run Ubuntu 12.04 VMs as the base OS and spin up SLES slaves as needed. This makes testing of our Puppet repository a lot easier as this gives us a homogeneous OS environment!
  • We can have portable and separate build environment images for each of our branches. Ex: legacy product branches can continue to have old versions of the JDK and third party libraries that are updated only when needed, but our mainline development can have a build image with tools that are updated independently.
    • This is significantly better than the “toolchain repository” solution that we had at VMware, where several hundred gigabytes of binaries were checked into a monolithic Perforce repo.
  • Thanks to Docker image tags, we can tag the build image at each GA release and keep that build environment saved. This makes reproducing builds significantly easier!
  • Having a Docker image of the build environment allows our developers to do local builds via their IDEs, if they so choose. Using Vagrant’s Docker provider, developers can spin up a Docker container of the build environment for their respective branch on their local machines, regardless of their host OS – Windows, Mac, or Linux. This allows developers to build RPMs with the same libraries and tools that the build system would! (See the sketch just below this list.)
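As a rough illustration of that last point (the mount path and Maven goals are hypothetical), a developer could run a build inside the same image the build system uses with something like:

# run a Maven build inside the build-environment container, mounting the local checkout
docker run --rm -v "$PWD":/workspace -w /workspace \
    vi-docker.lab.vi.local/pa-dev-env-master mvn clean package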

A Local Caching Proxy for “pypi.python.org” via Docker

TL;DR

If your infrastructure automation installs packages from PyPI (the Python Package Index) via “pip” or similar tools, you can save yourself from annoying “pypi.python.org timeout” errors by running a local caching proxy of the PyPI service. After trying several of these services, I found “devpi” to be the most resilient. It’s available both as a Python package and as a Docker container that you can run in your data center.

Problem

If you have infrastructure automation that tries to install packages from PyPI then you’ve undoubtedly encountered availability issues with the PyPI web service, hosted at “pypi.python.org”. Example email alerts that we see from our Puppet infrastructure look like:

Tue Sep 02 23:47:23 -0700 2014 /Stage[main]/Jenkins::Dev_jenkins_slave/Package[jenkins-check-for-success]
(err): Could not evaluate: Could not get latest version: 
Timeout while contacting pypi.python.org: execution expired

One could choose to ignore these sorts of connectivity issues since they are transient, but there are quite a few negative consequences to that:

  • If you’re spinning up new machines on-demand and they require a Python package as part of their configuration, then your ability to consistently spin up these machines successfully has become compromised by a dependency which is completely out of your own control.
  • If your infrastructure automation is configured to send email alerts on these types of errors, you’ll be getting un-actionable emails that add to the noise of email alerts that you get from your infrastructure. This effectively makes your alerting system less valuable, as your team will be trained to ignore its email alerts.
  • As you scale your infrastructure to hundreds or thousands of nodes, you’ll be receiving a lot of alerts about connectivity issues with “pypi.python.org” throughout the day and night. When “pypi.python.org” goes down hard for a prolonged period of time, you’ll end up with all of the nodes in your infrastructure simultaneously bombarding you with alerts that they cannot contact it.

Solution

The solution for this problem is to run a local caching proxy for “pypi.python.org” within your data center. The Python community has developed proxy packages like pypiserver, chishop, devpi, and others specifically for this use case. After extensive research and trying several of them out, I’ve found devpi to be the most resilient as well as the most actively developed as of this writing.

One can either install devpi as a Python package (see instructions on their website) or via a Docker container. Since our infrastructure has been making the move to “All Docker Everything” I’ll write up the steps I took to setup the Docker container running devpi and how I configured our clients to use it.

Devpi Server Installation

Here’s some sample Puppet code for how to download & run the “scrapinghub/devpi” container with an nginx proxy in front of it. (I discussed why having an nginx proxy in front is advantageous in Private Docker Registry w/Nginx Proxy for Stats Collection.)

You’ll want to change “DEVPI_PASSWORD” and the hostname for the Nginx vhost below.

# devpi server & nginx configuration
docker::image { 'scrapinghub/devpi': }
docker::run { 'devpi':
    image => 'scrapinghub/devpi',
    ports => ['3141:3141',],
    use_name => true,
    env => ['DEVPI_PASSWORD=1234',],
}

nginx::resource::upstream { 'pypi_app':
    members => ['localhost:3141',],
}
nginx::resource::vhost { 'vi-pypi.lab.vi.local':
    proxy => 'http://pypi_app',
}
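Before pointing any clients at it, a quick sanity check that the proxy is answering (assuming the vhost name above) is to fetch a package’s simple index through it:

# the first request is fetched from pypi.python.org and cached; later requests are served locally
curl http://vi-pypi.lab.vi.local/root/public/+simple/argparse/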

Once your container is running you can run “docker logs” to see what it is up to. You can see the “devpi” proxy saving your bacon when “pypi.python.org” occasionally becomes unavailable via log statements like this:

172.17.42.1 - - [22/Jul/2014 22:12:09] "GET /root/public/+simple/argparse/ HTTP/1.0" 200 4316
2014-07-22 22:12:09,251 [INFO ] requests.packages.urllib3.connectionpool: Resetting dropped connection: pypi.python.org
2014-07-22 22:12:09,301 [INFO ] devpi_server.filestore: cache-streaming: https://pypi.python.org/packages/source/a/argparse/argparse-1.2.1.tar.gz, target root/pypi/+f/2fb/ef8cb61e506c706957ab6e135840c/argparse-1.2.1.tar.gz
2014-07-22 22:12:09,301 [INFO ] devpi_server.filestore: starting file iteration: root/pypi/+f/2fb/ef8cb61e506c706957ab6e135840c/argparse-1.2.1.tar.gz (size 69297)

Python Client Configuration

On the client side we need to configure both “pip” and “easy_install” to use the devpi container we just instantiated. This requires creating a special configuration file for each of those Python package managers. The configuration file tells those package managers to use your devpi proxy server for their package index URL.

You’ll want to change the URL to point to the hostname you use within your own infrastructure.

# ~/.pip/pip.conf

[global]
index-url = http://vi-pypi.lab.vi.local/root/public/

# ~/.pydistutils.cfg

[easy_install]
index_url = http://vi-pypi.lab.vi.local/root/public/
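With those files in place, ordinary installs are served by the local proxy. For a one-off test you can also point “pip” at it explicitly (the package name is just an example):

# served through the local devpi proxy via ~/.pip/pip.conf
pip install requests

# or, without the config file, pass the index URL explicitly
pip install -i http://vi-pypi.lab.vi.local/root/public/ requests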

But Wait There’s More – Uploading Your Own Python Packages

One of the additional benefits to running a local PyPI proxy is that it becomes a distribution point for your private Python packages. Instead of clumsily checking out SCM repos full of your own custom Python scripts to each machine in your infrastructure, you can install your Python scripts as first-order Python packages, the same way you would install packages from PyPI. This lets you properly version your packages and define dependency requirements between your packages.

Creating a “setup.py” file for each of your Python projects is outside the scope of this post, but details can be found online. Once your Python project has its “setup.py” file, uploading your versioned package to your local devpi instance requires just a few simple commands. From our Jenkins job which publishes a new version of a Python package upon a git push:

# from the cwd of "setup.py"
devpi use http://vi-pypi.lab.vi.local/root/public/
devpi login root --password 1234 
devpi upload
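Once uploaded, the package can be installed on any node from the same local index, just like anything mirrored from PyPI (the package name below is hypothetical):

# the private package is now served by the same devpi index as the PyPI cache
pip install vi-build-tools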

More details at: http://doc.devpi.net/latest/quickstart-releaseprocess.html

Conclusion

By running a local caching proxy of “pypi.python.org” we’re able to improve the reliability of our infrastructure because we are no longer beholden to the availability of an external dependency. We also get the added benefit of having a proper Python package distribution point, which allows us to have better development & deployment practices. Finally, this local caching proxy provides better performance for installing packages, as local network copies are significantly faster than downloading from an external website.

RELENG 2014 Wrap-Up

I was a speaker at the 2nd International Workshop on Release Engineering @ Google HQ last week.  I enjoyed meeting up with other release engineers and sharing ideas with them.  Here are some talks I enjoyed:

Finally, my own presentation went quite well.  There were a lot of people in the audience that came up to chat with me afterwards, and it seemed that the message really resonated with the audience (confirmation bias, much? : ).  From the sounds of it, I may have an opportunity to give an extended version of this talk at Google and VMware in the near future, which is great because there were several ROI examples and software industry anecdotes around quality and time-to-market that I had to cut out to meet the time requirements.

Here are the slides I used: