A commonly asked question from folks deploying their first Docker containers into production is, “How should I get application configuration into my Docker container?” The configuration in question could be settings like the number of worker processes for a web service, the JVM max heap size, or the connection string for a database. The reality is that there are several standard ways to do this, each with its own pros and cons.
The ~3.5 Ways to Send Configuration to your Dockerized Apps
1. Baking the Configuration into the Container
Baking your application’s configuration into a Docker image is perhaps the easiest pattern to understand. Basically, you use the Dockerfile’s COPY directive to drop configuration files into the right places, or modify those configuration files at image build time with “sed” or “echo” via the RUN directive.
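As a rough sketch, a Dockerfile along these lines might look like the following (the base image name, file paths, and “sed” expression are illustrative, not from a real project):

    # Dockerfile -- bake configuration into the image at build time
    FROM some/base-image

    # Drop our own config file into place via COPY...
    COPY my-app.conf /etc/myapp/my-app.conf

    # ...and/or tweak a setting in an existing file via RUN + sed
    RUN sed -i 's/^workers=.*/workers=8/' /etc/myapp/my-app.conf

Every container started from the resulting image then carries exactly this configuration.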
If there’s an image available on the Docker Hub Registry that does everything you want save for one or two config settings, you can fork its “Dockerfile” on GitHub, modify the “Dockerfile” in your fork to drop in whatever configuration changes you want, then publish the result as a new image on the Docker Hub Registry.
That is what I did for:
- https://github.com/tehranian/docker-atlassian-jira
- https://registry.hub.docker.com/u/tehranian/docker-atlassian-jira
Pros:
- You get parity between development & production environments because both use the same configuration, which is static within the container image.
Cons:
- Since the configuration is baked into the image, any future configuration changes will require additional modifications to the image’s build file (its “Dockerfile”) and a new build of the container image itself.
2a. Setting the Application Configuration Dynamically via Environment Variables
This is a commonly-used pattern for images on the Docker Hub Registry. For an example, see the environment variables “POSTGRES_USER” and “POSTGRES_PASSWORD” for the official PostgreSQL Docker image.
Basically, when you “docker run” you will pass in pre-defined environment variables, like so: "docker run -e SETTING1=foo -e SETTING2=bar ... <image name>". From there, the container’s entry point (startup script) will look for those environment variables and “sed” or “echo” them into whatever relevant config files the application uses before actually starting the app.
It’s worth mentioning that the container’s entry point script should contain reasonable defaults for each of those environment variables if the invoker does not pass those environment variables in, so that the container will always be able to start successfully.
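To make that concrete, a minimal entry point script under these assumptions might look like this (the variable names, config path, and app command are made up for illustration):

    #!/bin/sh
    # docker-entrypoint.sh -- render env vars into the app's config, then start it

    # Reasonable defaults in case the invoker passed nothing
    : "${SETTING1:=foo}"
    : "${SETTING2:=bar}"

    # "sed"/"echo" the values into the config file the app reads
    sed -i "s/^setting1=.*/setting1=${SETTING1}/" /etc/myapp/app.conf
    echo "setting2=${SETTING2}" >> /etc/myapp/app.conf

    # Hand off to the real application so it runs as PID 1 and receives signals
    exec myapp --config /etc/myapp/app.conf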
Pros:
- This approach makes your container more dynamic in terms of configuration.
Cons:
- You are sacrificing dev/prod parity because now folks can configure the container to behave differently in dev & prod.
- Some configurations are too complex to model with simple key/value pairs, for example an nginx/apache virtual host configuration.
2b. Setting the Application Configuration Dynamically via a Key-Value Store
This is a similar idea to using environment variables to pass in configuration, but instead the container’s startup script will reach out to a key-value (KV) store on the network like Consul or etcd to get configuration parameters.
This makes it possible to express more complex configurations than simple environment variables allow, because the KV store can have a hierarchical structure of many levels. It’s worth noting that widely-used tooling exists for grabbing values from the KV store and substituting them into your config files. Tools like confd even allow for automatic app-reloading upon changes to the KV configuration. This allows you to make your app’s configuration truly dynamic! (A minimal sketch follows the links below.)
See:
- https://github.com/kelseyhightower/confd
- https://github.com/hashicorp/consul-template
- https://github.com/hashicorp/envconsul
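To give a flavor of how this fits together, here is a minimal confd sketch (the key names, paths, and commands are hypothetical): a template resource definition that watches keys in etcd, plus a Go template that renders them into the app’s config file.

    # /etc/confd/conf.d/myapp.toml -- template resource definition
    [template]
    src = "myapp.conf.tmpl"
    dest = "/etc/myapp/app.conf"
    keys = ["/myapp/workers", "/myapp/db/url"]
    reload_cmd = "kill -HUP $(cat /var/run/myapp.pid)"

    # /etc/confd/templates/myapp.conf.tmpl -- rendered from the KV store
    workers={{getv "/myapp/workers"}}
    db_url={{getv "/myapp/db/url"}}

Running something like "confd -backend etcd -node http://etcd.internal:2379 -watch" inside the container then re-renders the config (and reloads the app) whenever those keys change.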
Pros:
- This approach makes your container more dynamic in terms of configuration.
- The KV store allows for more complex & dynamic configuration information.
Cons:
- This introduces an external dependency of the KV store, which must be highly-available.
- You are sacrificing dev/prod parity because now folks can configure the container to behave differently in dev & prod.
3. Map the Config Files in Directly via Docker Volumes
Docker Volumes allow you to map any file/directory from the host OS into a container, like so: “docker run -v <source path>:<dest path> ...”
Therefore, if the config file(s) for your containerized app happen to be available on the filesystem of the host OS, you can map that config file (or dir) into the container. Ex:
“docker run -v /home/dan/my_statsd_config.conf:/etc/statsd.conf hopsoft/graphite-statsd”
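A handy variation: append “:ro” to the mapping to mount the file read-only, so the containerized app cannot modify the host’s copy:

    docker run -v /home/dan/my_statsd_config.conf:/etc/statsd.conf:ro hopsoft/graphite-statsd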
Pros:
- You don’t have to modify the container to get arbitrary configurations in.
Cons:
- You lose dev/prod parity because now your app’s config can be anything.
- If you’re doing this in production, you now have to get that external config file onto the host OS for sharing into the container (a configuration management tool like Ansible, Chef, or Puppet comes in handy here).
Conclusion
As you can see, there are many potential ways to get application configuration to your Dockerized apps, each with their own trade-offs. Which way is best? It really depends on how much dynamism you require and whether or not you want the extra burden of managing dependencies like a KV store.
Nice article, Dan. Love that you boiled it down and included the pros and cons.
Alternatively, you can also use Tiller (http://github.com/markround/tiller), which provides a standardized, easy-to-use way of generating configuration files. Many containers already use it, as it can generate configuration files from a variety of sources: environment variables, an HTTP webservice, a ZooKeeper cluster, or even just plain YAML files on disk. I wrote it specifically so that you can build a single Docker container but ship configuration files for different environments (or let users specify parameters at run time). As it has a plugin architecture, you could say it sort of combines points 1 & 2 in your examples above.
I think you’re taking “dev/prod parity” a little too far. Transient configurables like names and addresses don’t have to be the same between dev and prod; the more important idea is that the tools and infrastructure are as close as possible. Moreover, you do not want to be baking sensitive information like the password to your production database into a container image. Otherwise, you will have to lock down the custody chain for the image, which is almost certainly going to create more problems than it solves.
Agree 100%. The app will always have its own DB and passwords, and you will have to deal with that anyway; that kind of parity is only for demos and Hello World apps.
I appreciate that most software is configured by configuration files located somewhere distinctly separate from the application (e.g. /etc/), but what if the configuration is in a database that is modified after the container starts?
Nice and clear post, thank you.
Nice article.
But where are you going to put sensitive data, such as passwords, SSH keys, …?
Nice and concise. I note that the case where those variables might be secret is not covered in the pros and cons. It does convey that the problem of managing variable configuration has not gone away with the magic container bullet. A really practical way to get started with the actual use case though, thank you.
Similar to Mark Dastmalchi-Round’s project (which I just found out about), you can use my project: https://github.com/autonomy/alterant.git.
I’m still new to Docker but not to separation of environments. It can sometimes simply come down to how your environments are configured, e.g.:
1. your dev environment domain is dev.domain.com, your prod is prod.domain.com
2. your /etc/resolv.conf has search for those domains in each environment.
3. your apps now just request db connections to “db”, which resolves to db.prod.domain.com in production and db.dev.domain.com in dev.
This way the configs are the same but where they resolve is different. Some other things can be done this way too, but it may not work for everything.
CNAME->CNAME->HOST is another trick, e.g.:
app-db.global.domain.com -> cluster-db-1.au1.domain.com -> db-1.hosts.domain.com
If the app-db needs to change, change the pointer to a different cluster. If the apps using this db have to move, you move cluster-db-1 to point to the new db-X back end, all without changing the application.
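In zone-file terms that chain might look something like this (names illustrative):

    ; indirection via a chain of CNAMEs
    app-db.global.domain.com.     CNAME  cluster-db-1.au1.domain.com.
    cluster-db-1.au1.domain.com.  CNAME  db-1.hosts.domain.com.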
Is there a 4th option? Is it possible to put your configuration settings or files into another “File” container and just link the application container to the configuration container?
I can come up with another option, which would be to fetch configuration files when starting the container and put them where the app being deployed expects them. I’m just about to go and try that. My intention is to use some tool that provides an HTTP API for getting configuration files (I’m thinking of Spring Cloud Config Server, since it supports Git as the base source of configuration and can provide whole files, not just key-value pairs, so certificates can also come from there; or it can use Vault and access them the same way, but the first password needs to come in somehow).
Then I can fetch them within the “bootstrap.sh” script I’ll call as the image entrypoint. So: no configuration in the image or on the host, and no configuration in the compose file or in env properties files supplied to “docker-compose up”. But I haven’t seen this approach much on the internet so far.
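A sketch of that bootstrap script might be (the config server URL and paths are placeholders):

    #!/bin/sh
    # bootstrap.sh -- fetch config at container start, then launch the app
    curl -fsSL http://config-server.internal/myapp/default/app.conf \
      -o /etc/myapp/app.conf
    exec myapp --config /etc/myapp/app.conf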
Missing from the cons of 2a: environment variables are only set for the default Docker user. Sometimes you need multiple users in the same container, and those users will not see the environment variables.
Thanks for the article.
This article answers the question of how and where to read the config values from. But I have a different question: how do we load config values into the KV store dynamically in the first place?
Nice post, however you failed to mention a major con to using environment variables to pass information into the build process, specifically regarding usernames and passwords. Depending on the container/application type being built (common with web-based containers), any environment variables that get passed in will be accessible and publicly visible! For example, passing USER_NAME=someuser to a php-apache build: if you were to inspect the PHP info, you would see USER_NAME someuser listed in the environment variables. So if you’ve developed the Dockerfile for the build and want to make it easy for others to configure or re-purpose with their own usernames and passwords, I would recommend using build arguments instead. They work almost identically to how you’re describing the use of environment variables, and there is no risk of them being publicly displayed. And if you wanted, you could use arguments to pass values to environment variables as well (if you design your Dockerfile to utilize them in that manner).
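For example, a hypothetical Dockerfile using build arguments might be:

    # Dockerfile -- accept a username at build time via ARG (illustrative)
    FROM php:apache
    ARG user_name=someuser
    RUN useradd "${user_name}"

    # built with: docker build --build-arg user_name=alice .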
Fully agree. The easiest way to set a configuration file with specific settings, if the build/Dockerfile is not yours or is not easy to extend, is definitely a volume map. If you know a file should exist at a specific location in a container being built, you can map a volume with your own file for pretty much anything, which makes modifying existing projects to your own needs that much easier. Another bonus in Docker’s favor, if you ask me.
Hi, nice article! It’s really useful. Thanks.