Jenkins Build Slaves On A Budget

About half a year ago our team started working on a project with a micro-service architecture, which meant we had a lot of little applications to build as part of our delivery pipeline. One of the reasons we opted for this architecture was to gain the ability to replace a single component without having to rebuild the whole system, enabling a faster feedback loop by releasing small changes to small parts of the system.

But there was one problem. Each application build was CPU-intensive: it included fetching source, installing dependencies, unit testing, code coverage, integration testing, acceptance testing, packaging, publishing to repositories, and deploying to target environments. And nope, we don’t have a build farm!

Our goal was for each application build to finish in less than a minute, which was fine when there was only one build, but failed horribly when lots of changes triggered multiple downstream builds, often up to 15 builds at the same time with only 4 executors (4 CPUs) available on the build master. Scaling up wasn’t going to take us far, so we had to scale out and distribute the build system earlier than we normally would have with past monolithic stonehenge projects.

We considered the idea of using the cloud, either an existing cloud CI solution or Amazon EC2, but we had to rule this out in the end due to the extra cost, source code restrictions, and network latency. One developer then suggested using the developer machines as build slaves, each one having 8 CPUs, an SSD, and lots of RAM and disk space: plenty of firepower lying around under-utilised most of the time.

So we gave it a go and it worked out really well. We ended up with an additional 7×4 = 28 build executors (7 developer machines with 4 executors each), and it’s not unusual to have those 15 applications built at the same time and finished within a minute. Here’s our setup:

  • Each build slave has to self-register to the master because developer machines only have dynamic IP addresses, so they can’t be pre-configured on the build master.
    This is where the Jenkins Swarm Plugin comes in handy, allowing each slave to join the master without the master needing to know any of the slaves beforehand.
  • Each build slave has to re-register when the machine is rebooted; we use upstart to do this.
  • Each build slave runs as its own user, separate from the developer’s user account. This allows a clean separation between user workspaces.
  • Each build slave is provisioned using Ansible, which is handy when there are more build slaves to add in the future, or when updating multiple build slaves in one go.
  • Each build slave is allocated 50% of the available CPUs on the machine to reduce any possibility of interrupting the developer’s work.
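
To make the 50% rule and the self-registration step a bit more concrete, here is a rough Python sketch. It is not our actual upstart or Ansible configuration: the master URL, the jar filename and the function names are all made up for illustration, and the `-master` and `-executors` options are the Swarm client options as I remember them, so check them against the client version you use.

```python
import math
import os


def executors_for(cpu_count, share=0.5):
    """Executors to offer: a share of the machine's CPUs, but at least one."""
    return max(1, math.floor(cpu_count * share))


def swarm_command(master_url, cpu_count, jar="swarm-client.jar"):
    """Build the argument list for launching the Swarm client.

    The jar name and option names are assumptions; nothing is executed here.
    """
    return [
        "java", "-jar", jar,
        "-master", master_url,  # hypothetical master URL passed in by caller
        "-executors", str(executors_for(cpu_count)),
    ]


if __name__ == "__main__":
    cpus = os.cpu_count() or 1
    # Print the command an init job (upstart in our case) could run at boot.
    print(" ".join(swarm_command("http://jenkins.example.com:8080", cpus)))
```

With 8 CPUs per machine, `executors_for(8)` gives 4 executors, which across 7 machines adds up to the 28 executors mentioned above.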

So there, build slaves on a budget :).

I think it’s easy to overlook the fact that developer/tester machines are often under-utilised, and that they could serve an additional purpose as Jenkins build slaves, a reasonable alternative before looking at costlier solutions.

OSDC 2011

I went to Canberra this week to attend the Open Source Developers Conference 2011 and also to give a talk titled Continuous Delivery Using Jenkins. OSDC ran for 3 days, and was held at the Australian National University.

OSDC 2011 was very well organised, many thanks to the organisers: Evan Leybourn, Gavin Jackson, and the volunteer squad. It was an interesting grass-roots conference with lots of passionate open source geeks, and I definitely learned a lot.

Slides from my talk:

Update (24/11/2011): and the video of the talk: