Switching to docker

Historically our server environment has been pretty basic and manual. The main focus in the company has been to build awesome copters, so managing the infrastructure for delivering services such as web, forum and wiki has been low priority. The services have been up and running most of the time, which after all is the most important property, but making changes has required a fair amount of manual work and has also been associated with some unknowns and risks. When we don’t feel safe, or when a procedure is not sufficiently simple, we tend to avoid doing it. The end result has been that we have not updated our web as often as we would have liked.

During the summer we are trying to catch up and clean out some of the technical debt we have left behind – including the infrastructure for the web. In this post I will outline what we are doing.

Goals

So, we want to simplify updates to our web, but what does that mean to us?

It should be dead simple to set up a development environment (or test environment) that has exactly the same behaviour as the production environment.
We want to know that when we make a change in a development environment, the exact same change will go into production – without any surprises.
After developing a new feature in your development environment, it should be ultra simple to deploy the change to production. In fact, it should be so simple that anyone can do it, and so predictable that it is boring.
We want to make sure our backups are working. Too many people did not discover that their backup procedures were buggy until their server crashed and had to be restored.

We want to move towards Continuous Delivery for all our systems and development, and these goals would be baby steps in the right direction.

Implementation

We decided that the first step would be to fix our web site, wiki and forum. The web is based on WordPress, the wiki on DokuWiki and the forum on phpBB, and they are all running on Apache with a MySQL database server on a single VPS. We wanted to stay on the VPS for now but simplify the process from development to production. We picked my new favourite tool: Docker.

Docker

Docker is by far the coolest tool I have used in the last couple of years. I think it will fundamentally change the way we use and manage development, test and production environments, not only for the web or backend systems, but probably also for embedded systems. For anyone who wants to move in the direction of continuous delivery, Docker is a must to try out.

So, what is Docker? If you don’t know, take a look at https://www.docker.com/

The typical workflow when creating new functionality could be:

1. Make some changes to the codebase and commit the source code to your repository.
2. Build and run automated tests.
3. Build a Docker image from the source code.
4. Deploy the image in a test environment and run more tests.
5. Deploy the image to production.

Preferably steps 2 to 5 are automated and executed by a server.

In the Docker world, images are stored in a registry so that they are easily retrievable from any server for deployment. One part of the Docker ecosystem is the public registry called Docker Hub, where people can upload images for others to use. There is a lot of good stuff to use, especially the official images created by Docker for standard applications such as Apache, MySQL, PHP and so on. These images are a perfect starting point for your own images. In the workflow above we would push the image to the registry in step 3 and pull it in steps 4 and 5.
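The build, push, pull and run part of the workflow could be scripted roughly as follows. This is a minimal sketch and not our actual build script: the function names, the REGISTRY variable and the choice of container name are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helpers; REGISTRY is an assumed address for the image registry.
REGISTRY="${REGISTRY:-url.to.registry:5000}"

# Compose the full image reference, e.g. url.to.registry:5000/int-web:3
image_ref() {
    printf '%s/%s:%s\n' "$REGISTRY" "$1" "$2"
}

# Build the image from a source directory and push it to the registry.
build_and_push() {
    name=$1 tag=$2 src_dir=$3
    ref=$(image_ref "$name" "$tag")
    docker build -t "$ref" "$src_dir" && docker push "$ref"
}

# On the target server: pull the image and start a container from it.
pull_and_run() {
    name=$1 tag=$2
    ref=$(image_ref "$name" "$tag")
    docker pull "$ref" && docker run -d --name "$name" "$ref"
}
```

A build server would call something like `build_and_push int-web 3 src`, and the test or production server `pull_and_run int-web 3`.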

Private registry

It is possible to push your images to the public Docker Hub but our images will contain information that we don’t want to share such as code and SSL certificates, so we needed a private registry. You can pay for that service in the cloud, but we decided to go for our own private registry as a start.

There is an official docker image that contains a registry that we used. Your registry requires some configuration, and you could create your own image from the official image + configuration. But, then where do you store the image? Luckily it is possible to run the registry image and pass the configuration as parameters at start up. The command to start the registry ended up something like this:

docker run -d --name=registry -p ${privateInterfaceIp}:5000:5000 -v ${sslCertDir}:/go/src/github.com/docker/distribution/certs -v ${configDir}:/registry -v ${registryStoragePath}:/var/registry registry:2.0 /registry/conf.yml

The default docker configuration is to use https for communication so we give the registry the path to our ssl certificate, the path to a configuration file and finally somewhere to store the files in our file system. All these paths are mapped into the file system of the container with the -v switch.

The configuration file conf.yml is located in the configDir and the important parts are:

version: 0.1
log:
  level: debug
  fields:
    service: registry
    environment: development
storage:
    filesystem:
        rootdirectory: /var/registry
    maintenance:
        uploadpurging:
            enabled: false
http:
    addr: :5000
    secret: someSecretString
    debug:
        addr: localhost:5001
    tls:
        certificate: /go/src/github.com/docker/distribution/certs/my_cert.fullchain
        key: /go/src/github.com/docker/distribution/certs/my_cert.key

The file my_cert.fullchain must contain not only your public certificate, but also the full chain down to some trusted entity.

Note: this is a very basic setup. You probably want to make changes for production.
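Once the registry container is running, a quick sanity check is to request the API base path /v2/, which a v2 registry answers with 200 OK when no authentication is configured. The sketch below assumes curl is installed and that the certificate is trusted by the machine running the check; the function name is made up.

```shell
#!/bin/sh
# Return success if the registry answers 200 on its API base endpoint.
# host_port is something like url.to.registry:5000 (placeholder address).
registry_is_up() {
    host_port=$1
    status=$(curl --silent --output /dev/null --write-out '%{http_code}' \
        "https://${host_port}/v2/") || return 1
    [ "$status" = "200" ]
}
```

For example, `registry_is_up url.to.registry:5000 && echo "registry OK"`.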

Code vs data

A nice property of Docker is that it is easy to separate code from data; they should basically go into separate containers. When you add functionality to your system, you create a new image, from which you create a container in production. These functional containers have a fairly short lifecycle: they only live until the next deploy. To create a functional image, just build your image from some base image with the server you need and add your code on top of that. Simply put, your image will contain both your server and application code, for instance Apache + WordPress with our tweaks.

When it comes to data there are a number of ways to handle it with docker and I will not go into a discussion of pros and cons with different solutions. We decided to store data in the filesystem of data containers, and let those containers live for a long time in the production environment. The data containers are linked to the server containers to give them access to the data.

In our applications data comes in two flavors: SQL database data and files in the filesystem. The database containers are based on the official mysql images while filesystem data containers are based on the debian image.

Backups

To get the data out of the data containers for backups, all we have to do is fire up another container and link it to the data container. Now we can use the new container to extract the data and copy it to a safe location.

docker run --rm --volumes-from web-data -v ${PWD}:/backup debian cp -r /var/www/html/wp-content/gallery /backup

will start a debian container and mount the volumes from the “web-data” data container in its filesystem, /var/www/html/wp-content/gallery in this case. We also mount the current directory on the /backup directory in the container. Finally we copy the files from /var/www/html/wp-content/gallery (in the data container) to /backup, so they end up in our local filesystem. When the copy is done the container exits and is automatically removed.
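For regular backups it can be handy to write a timestamped archive instead of copying the raw files. The following is only a sketch of that idea: the function name, the archive naming scheme and the /srv/backup path in the usage example are assumptions, not part of our setup.

```shell
#!/bin/sh
# Archive one directory from a data container into a timestamped tar.gz
# in backup_dir on the host. Prints the archive file name on success.
backup_volume() {
    data_container=$1   # e.g. web-data
    parent_dir=$2       # e.g. /var/www/html
    sub_dir=$3          # e.g. wp-content/gallery
    backup_dir=$4       # absolute host path that receives the archive
    archive="${data_container}-$(date +%Y%m%d-%H%M%S).tar.gz"
    # Send docker's own output to stderr so stdout only carries the file name.
    docker run --rm --volumes-from "$data_container" -v "${backup_dir}:/backup" \
        debian tar -czf "/backup/${archive}" -C "$parent_dir" "$sub_dir" >&2 \
        && echo "$archive"
}
```

It could be called as `backup_volume web-data /var/www/html wp-content/gallery /srv/backup`.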

Creating data containers

We need data containers to run our servers in development and test. Since we don’t have enormous amounts of data for these applications we simply create them from the latest backup. This gives us two advantages: first, we can develop and test on real data, and second, we continuously test our backups.
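Picking the latest backup and unpacking it into a fresh data container could look something like the sketch below. It assumes the backups are tar.gz archives whose names contain a sortable timestamp and whose paths are relative to the volume directory; both function names and the /srv/backup/wiki path in the usage example are hypothetical.

```shell
#!/bin/sh
# Print the newest archive in a directory. Relies on file names that sort
# chronologically, since globs expand in sorted order.
latest_backup() {
    latest=
    for f in "$1"/*.tar.gz; do
        [ -e "$f" ] && latest=$f
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}

# Create a data container exposing volume_dir and unpack the archive into it.
# The archive must be given as an absolute host path.
create_data_container() {
    name=$1 volume_dir=$2 archive=$3
    docker create --name="$name" -v "$volume_dir" debian /bin/true \
        && docker run --rm --volumes-from="$name" \
            -v "$(dirname "$archive"):/dump" \
            debian tar -zxf "/dump/$(basename "$archive")" -C "$volume_dir"
}
```

For the wiki this might be `create_data_container wiki-data /var/wiki "$(latest_backup /srv/backup/wiki)"`.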

Development and production

We want to have a development environment that is as close to the production environment as possible. Our solution is to run the development environment on the same image that is used to build the production image. The base image must contain everything needed except the application, in our case Apache and PHP with appropriate modules and extensions. Currently the Dockerfile for the base image looks like this:

FROM php:5.6-apache
RUN a2enmod rewrite
# install the PHP extensions we need
RUN apt-get update && apt-get install -y libpng12-dev libjpeg-dev && rm -rf /var/lib/apt/lists/* \
	&& docker-php-ext-configure gd --with-png-dir=/usr --with-jpeg-dir=/usr \
	&& docker-php-ext-install gd
RUN docker-php-ext-install mysqli
RUN docker-php-ext-install exif
RUN docker-php-ext-install mbstring
RUN docker-php-ext-install gettext
RUN docker-php-ext-install sockets
RUN docker-php-ext-install zip
RUN echo "date.timezone = Europe/Berlin" > /usr/local/etc/php/php.ini
CMD ["apache2-foreground"]

An example will clarify the concept.

Suppose we have the source code for our wiki in the “src/wiki” directory. Then we can start a container on the development machine with

docker run --rm --volumes-from=wiki-data -v ${PWD}/src/wiki:/var/www/html -p 80:80 url.to.registry:5000/int-web-base:3

and the Dockerfile used to build the production image contains

FROM url.to.registry:5000/int-web-base:3
COPY wiki/ /var/www/html/
CMD ["apache2-foreground"]

For development the source files in src/wiki are mounted in the container and can be edited with your favourite editor, while in production they are copied to the image. In both cases the environment around the application is identical.

If we tagged the production image “url.to.registry:5000/int-web:3” we would run it with

docker run --rm --volumes-from=wiki-data -p 80:80 url.to.registry:5000/int-web:3

WordPress

WordPress is an old beast and some of the code (in my opinion) is pretty nasty. My guess is that this is mostly due to legacy and compatibility reasons. Anyhow, this is something that has to be managed when working with it.

In a typical WP site, only a small portion of the codebase is site-specific, namely the theme. The rest of the code is WP itself and plugins. We only wanted to store the theme in the code repo and pull the rest in at build time. I found out that other people have had the same problem and that they use Composer to solve it (https://roots.io/using-composer-with-wordpress/). Nice solution! Now the code is managed.

The next problem is the data in the file system. WP writes files to some dirs in the wp-content directory, side by side with source code dirs and code pulled in with Composer. There is no way of configuring these paths, but Docker to the rescue! We simply created a Docker data container with volumes exposed at the appropriate paths and mounted it on the functional container. WP writes files to wp-content/gallery and wp-content/uploads, so these directories must be writable.

.
|-- Dockerfile
|-- composer.json
|-- composer.lock
|-- composer.phar
|-- index.php
|-- vendor
|-- wp
|-- wp-config.php
`-- wp-content
    |-- gallery
    |-- ngg_styles
    |-- plugins
    |-- themes
    |   `-- bitcraze
    `-- uploads

To create the data container for the filesystem and populate it with data from a tar.gz archive:

docker create --name=web-data -v /var/www/html/wp-content/uploads -v /var/www/html/wp-content/gallery debian /bin/true
docker run --rm --volumes-from=web-data -v ${PWD}/dump:/dump debian tar -zxvf /dump/wp-content.tar.gz -C /var/www/html
docker run --rm --volumes-from=web-data debian chown -R www-data:www-data /var/www/html/wp-content/uploads /var/www/html/wp-content/gallery

To create the database container and populate it with data from an SQL file:

docker run -d --name web-db -e MYSQL_ROOT_PASSWORD=ourSecretPassword -e MYSQL_DATABASE=wordpress mysql
docker run -it --link web-db:mysql -v ${PWD}/dump:/dump --rm mysql sh -c 'exec mysql -h"$MYSQL_PORT_3306_TCP_ADDR" -P"$MYSQL_PORT_3306_TCP_PORT" -uroot -p"$MYSQL_ENV_MYSQL_ROOT_PASSWORD" --database="$MYSQL_ENV_MYSQL_DATABASE" < /dump/db-dump.sql'
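The reverse operation, dumping the database for a backup, can be done in the same style. This is a sketch under the assumption that the database container was started as shown above, so the linked MYSQL_* environment variables are available; the function name is made up. Note that it runs without -it, since a TTY could mangle the redirected dump.

```shell
#!/bin/sh
# Dump the database of a running MySQL container to a file on the host.
dump_database() {
    db_container=$1   # e.g. web-db
    out_file=$2       # e.g. dump/db-dump.sql
    # The single-quoted command is expanded inside the temporary container,
    # where the link provides the MYSQL_* variables.
    docker run --rm --link "${db_container}:mysql" mysql sh -c \
        'exec mysqldump -h"$MYSQL_PORT_3306_TCP_ADDR" -P"$MYSQL_PORT_3306_TCP_PORT" -uroot -p"$MYSQL_ENV_MYSQL_ROOT_PASSWORD" "$MYSQL_ENV_MYSQL_DATABASE"' \
        > "$out_file"
}
```

For example, `dump_database web-db dump/db-dump.sql` would write a file that the import command above can read back.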

Note that the wp-config.php file goes into the root in this setup.

Now we can start it all with

docker run --rm --volume=${PWD}/src/:/var/www/html/ --volumes-from=web-data --link=web-db:mysql -p "80:80" url.to.registry:5000/int-web-base:3

Wiki

DokuWiki is a bit more configurable and all wiki data could easily be moved to a separate directory (I used /var/wiki) by setting

$conf['savedir'] = '/var/wiki/data';

in conf/local.php. User data is moved by adding the following to inc/preload.php

$config_cascade['plainauth.users']['default'] = '/var/wiki/users/users.auth.php';
$config_cascade['acl']['default'] = '/var/wiki/users/acl.auth.php';

Create the data container

docker create --name wiki-data -v /var/wiki debian /bin/true

When data has been copied to the data container, start the server with

docker run --rm --volumes-from=wiki-data -v ${PWD}/src/wiki:/var/www/html -p 80:80 url.to.registry:5000/int-web-base:3

Forum

phpBB did not hold any surprises. The directories that needed to be mapped to the data filesystem container are cache, files and store. The database container is created in a similar way as for WordPress.

Reverse proxy

The three web applications are running on separate servers and to collect the traffic and terminate the https layer, we use a reverse proxy. The proxy is also built as a docker image, but I will not describe that here.

More interesting is how to start all docker containers and link them together. The simplest way is to use docker-compose and that is what we do. The docker-compose file we ended up with contains

proxy:
  image: url.to.registry:5000/int-proxy:2
  links:
    - web
    - wiki
    - forum
  ports:
    - "192.168.100.1:80:80"
    - "192.168.100.1:443:443"
  restart: always

web:
  image: url.to.registry:5000/int-web:3
  volumes_from:
    - web-data
  external_links:
    - web-db:mysql
  restart: always

wiki:
  image: url.to.registry:5000/int-wiki:1
  volumes_from:
    - wiki-data
  restart: always

forum:
  image: url.to.registry:5000/int-forum:1
  volumes_from:
    - forum-data
  external_links:
    - forum-db:mysql
  restart: always

To start them all, simply run

docker-compose up -d
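Since the compose file refers to data and database containers that are created outside of it (through volumes_from and external_links), a deploy script may want to check that they exist before starting anything. The following is a rough sketch of such a check, not the script we actually use; the function names are assumptions, and `docker inspect` is used here only for its exit status.

```shell
#!/bin/sh
# Containers referenced by volumes_from and external_links in the compose file.
REQUIRED_CONTAINERS="web-data web-db wiki-data forum-data forum-db"

# Print the names of required containers that cannot be found on this host.
missing_containers() {
    for c in $REQUIRED_CONTAINERS; do
        docker inspect "$c" >/dev/null 2>&1 || echo "$c"
    done
}

# Pull the images named in the compose file and (re)start the services,
# but only if every data and database container is already in place.
deploy() {
    missing=$(missing_containers)
    if [ -n "$missing" ]; then
        echo "missing containers: $missing" >&2
        return 1
    fi
    docker-compose pull && docker-compose up -d
}
```

Running `deploy` in the directory of the compose file would then either start everything or report what is missing.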

Conclusions

All this work and the result for the end user is – TA-DA – exactly the same site! Hopefully we will update our web apps more often in the future, with full confidence that we are in control all the time. We will be happier Bitcraze developers, and after all that is really important.