I wanted to write a continuous integration pipeline for the open-source project Duplicati. Previously, only the project maintainer released binaries, in a semi-automated way, using bash scripts that were tightly coupled to his own OS setup. PRs for the project were built and tested using an inefficient Travis script. The main goals for improving this situation were:
- Reduce the total running time of the compiling and unit testing
- Remove any dependencies from the host OS of the pipeline
- Reduce unnecessary build steps
- Reduce unnecessary network traffic generated by the Travis VM’s
- Automate releasing builds
The last point in particular, automating releases as so-called nightly builds, helps to shorten the feedback loop of development, usage, issue reports and development again. Ideally, an issue is fixed in a nightly build and users can test or use the fix right away.
The final build pipeline that came out can be seen here:
Eliminating overhead from preconfigured virtual machines
The first real issue with our build and test jobs on Travis was the enormous overhead of generating a build environment for each job. We used a default image called ‘csharp’ provided by Travis. It took about 8 minutes to set up a VM with a C# environment before the build script could even start. Personally, I think this is a serious issue in Travis’ container set-up for the C# language flavour, and it should be fixed. So instead of using the C# flavoured virtual machine, I opted for the one tagged as ‘minimal’.
With a minimal environment, the time needed to get to the build script dropped considerably, to around 30 seconds. My first approach to installing a Mono environment was, in retrospect, a rather dumb idea: just apt-get all the packages required for building with Mono. That soon came to an end when I realized that the minimal image is based on Ubuntu 14.04 and that many required tools, like msbuild, are simply not available there.
The minimal image does, however, come with Docker installed. My next approach was therefore to pull a recent Mono Docker image and execute the build scripts from within Docker containers.
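A minimal sketch of that approach, assuming the official `mono` image from Docker Hub and a hypothetical build script at `ci/build.sh` (neither the image tag nor the script path is taken from the actual Duplicati pipeline):

```shell
set -eu

# Assumption: the official "mono" image; pin a specific tag in practice.
MONO_IMAGE="${MONO_IMAGE:-mono:latest}"

# Run a command inside a throwaway Mono container, with the current
# directory mounted at /work. With DRY_RUN=1 the docker command line is
# printed instead of executed, which helps when checking the wiring.
run_in_mono() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "docker run --rm -v $PWD:/work -w /work $MONO_IMAGE $*"
  else
    docker run --rm -v "$PWD:/work" -w /work "$MONO_IMAGE" "$@"
  fi
}
```

Calling `run_in_mono bash ci/build.sh` would then run the build script inside the container, so the host only needs Docker and no Mono packages.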
Using Docker has another nice benefit: it makes it possible to execute the entire CI pipeline locally, although locally each job is executed sequentially, whereas on Travis some run in parallel. A further benefit is that we are less dependent on Travis: the pipeline can easily be migrated to, for example, Jenkins. Apparently this technique is often referred to as DinD, “docker inside docker”.
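Running the pipeline locally can be as simple as calling the stages one after another. The sketch below assumes three hypothetical stage names and a placeholder `run_stage` function; the real pipeline's stages and scripts may differ:

```shell
set -eu

# Hypothetical stage names, in execution order.
STAGES="build unittest package"

# Placeholder: replace the echo with the call to the real stage script.
run_stage() {
  echo "running stage: $1"
}

# Run all stages sequentially. An optional first argument names the
# stage to start from; earlier stages are skipped.
run_pipeline() {
  start=${1:-}
  started=0
  if [ -z "$start" ]; then started=1; fi
  for stage in $STAGES; do
    if [ "$stage" = "$start" ]; then started=1; fi
    if [ "$started" = "1" ]; then run_stage "$stage"; fi
  done
  if [ "$started" = "0" ]; then
    echo "unknown stage: $start" >&2
    return 1
  fi
}
```

`run_pipeline` runs everything from the first stage, while `run_pipeline unittest` skips the build stage and starts at the unit tests.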
After dockerizing the build and test scripts, the next show stopper was that every job following the build job had to rebuild everything.
Each job, as it is started in a separate Travis VM, is by default stateless. The stateless property forces you to design your pipeline in a clean way and separate each build step from others, but we still want to transfer data from one stage to the next.
Reading through various GitHub issues, I stumbled upon the option of using a cache in Travis. Usually this is used to carry over dependencies, such as an npm or gradle cache. A cache is set up per branch. It is important to note that for a cache to be shared among jobs, their environment variables have to be identical. This wasn’t very clear from the documentation, and it took me some time to realize that changing the environment variables per job results in each job having its own cache, which is not what we want.
Almost immutable caches
With the build cache shared among jobs, things looked considerably better, and it was time to solve the next issue: multiple jobs running against the same cache can invalidate or ruin it. For example, the test runner for unit tests appears to write output. Even if the outputs of the tests were disjoint from each other, having one cache for all jobs is not a good idea.
Treating the cache provided by the build stage as read-only and immutable, we copy it with rsync --delete to a dedicated test working directory. The pipeline ended up with distinct caches for each stage, with no two jobs ever writing to the same cache. Not sharing working directories between stages, and in particular not sharing caches, implies:
- no more resetting or removing previously generated files
- in local mode, it is possible to restart from a certain stage in the pipeline without having to go through all the previous stages again
- easier debugging (although it remains bash)
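The copy step described above can be sketched as a small helper. The function name and the directory paths in the usage note are hypothetical; only the use of rsync with --delete comes from the pipeline itself:

```shell
set -eu

# Mirror a read-only source cache into a stage's private working directory.
# $1 = source cache directory, $2 = working directory for this stage.
mirror_cache() {
  mkdir -p "$2"
  # -a preserves permissions and timestamps; --delete removes files in the
  # target that no longer exist in the source. The trailing slashes make
  # rsync copy the directory contents rather than the directory itself.
  rsync -a --delete "$1/" "$2/"
}
```

For example, `mirror_cache "$HOME/cache/build" "$HOME/work/unittest"` gives the unit test stage its own copy to write into, and any leftovers from an earlier run are removed because they are absent from the source.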
In the end, I did opt to alter caches that are supposed to be read-only, by removing files that are no longer needed in upcoming stages. This change was necessary to prevent long waiting times with Travis’ caching or, even worse, Travis timing out while trying to store a cache.
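Such a pruning step could look roughly like the sketch below. The function name and the file patterns in the usage note are assumptions for illustration; which files are safe to drop depends on what the later stages actually need:

```shell
set -eu

# Delete files matching the given name patterns from a cache directory.
# $1 = cache directory, remaining arguments = glob patterns for file names.
prune_cache() {
  cache_dir=$1
  shift
  for pattern in "$@"; do
    # Quote the pattern so find, not the shell, expands the glob.
    find "$cache_dir" -type f -name "$pattern" -exec rm -f {} +
  done
}
```

A call such as `prune_cache "$HOME/cache/build" '*.pdb'` would shrink the cache before Travis archives it, at the cost of the cache no longer being strictly immutable.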
Another not so nice issue with Travis is that parallel jobs cannot have more than one output directory. Due to the way Travis caching works, multiple jobs that run in parallel within a stage are forced to share a single output: a cache is archived per job, and only the last cache archived survives to the next stage. A better way, or at least one that would be convenient for this pipeline, would be to cache not only per job but also per output directory in the job. Parallel jobs could then write their output to separate directories that are stored in separate caches.
Other miscellaneous info
I think a pipeline that builds and delivers software should treat its Git repository as read-only and should not alter it. Therefore, all logic to commit, tag and push was removed from the pipeline.
With qemu static binaries, it is relatively easy to build cross-platform Docker images.