Human-Readable Ansible Playbook Log Output Using Callback Plugin

One problem I’ve had with Ansible playbooks since the early 0.x days is the verbose log output. Jsonified by default, it’s hard to read, and pretty much impossible for a human to review when stdout or stderr contains tens or hundreds of lines combined into one lengthy string.

Here’s what it looks like:

changed: [gennou.local] => {"changed": true, "cmd": "/tmp/",
"delta": "0:00:00.019164", "end": "2014-03-30 21:05:33.994066", "rc": 0,
"start": "2014-03-30 21:05:33.974902", "stderr": "", "stdout": "gazillion
texts here with lots of \n in between gazillion texts here with
lots of \n in between gazillion texts here with lots of \n
in between gazillion texts here with lots \n in between"}

When the --verbose flag is set, I believe the intention is for a human to eventually review the verbose log output. And whenever a human did review the log, that person never failed to tell me that the jsonified message was impossible to read, to which I replied, “They will fix it someday.”

Well, Ansible is now at version 1.x and the problem is still there.

So, while we continue waiting, the workaround I use for now is to set up an Ansible callback plugin that listens to some task events and then logs the result in a human-readable format, with each field on its own line and newlines kept as-is.

Here’s how I set it up:

  1. Set callback plugins directory in Ansible configuration file (ansible.cfg file):
    callback_plugins = path/to/callback_plugins/
  2. Create a callback plugin file at path/to/callback_plugins/ directory, I call mine .
    Here’s the callback plugin gist:
  3. Run ansible-playbook command:
    ansible-playbook -i hosts playbook.yml

And the log output looks like this:


2014-03-30 21:05:33.974902

2014-03-30 21:05:33.994066


gazillion texts here with lots of
in between
gazillion texts here with lots of
in between
gazillion texts here with lots of
in between
gazillion texts here with lots of
in between

Now that’s more readable.
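
A plugin along these lines could look roughly like the sketch below. It is not the gist referenced above, only an illustration that assumes the Ansible 1.x callback convention of a plain CallbackModule class with runner_on_* hooks; newer Ansible versions use a different callback API. The human_log helper and the FIELDS list are made-up names, and the field order is an arbitrary choice.

```python
# Illustrative field list and order; not taken from the original gist.
FIELDS = ["cmd", "start", "end", "delta", "rc", "stdout", "stderr", "msg"]


def human_log(res):
    """Format a task result with one field per block, newlines kept intact."""
    if not isinstance(res, dict):
        return str(res)
    blocks = []
    for field in FIELDS:
        # Skip missing or empty fields, but keep values such as rc == 0.
        if field in res and res[field] not in ("", None):
            blocks.append("\n{0}:\n{1}".format(field, res[field]))
    return "\n".join(blocks)


class CallbackModule(object):
    """Assumed Ansible 1.x style callback: hooks called once per task result."""

    def runner_on_ok(self, host, res):
        print(human_log(res))

    def runner_on_failed(self, host, res, ignore_errors=False):
        print(human_log(res))

    def runner_on_unreachable(self, host, res):
        print(human_log(res))
```

Because the formatting lives in a plain function, it can be tried on a sample result dict without running a playbook.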

You can set the callback plugin on each Ansible project if you want to. But I set mine as part of my CI/Jenkins boxes provisioning, that way all Jenkins jobs that execute an Ansible playbook end up with a readable log output.

Note: I know that some people suggest using debug module to split output into multiple lines. However, having to add register and debug fields all over the tasks would easily clutter the playbook. I find the callback plugin to be a cleaner and simpler solution.

Roombox – Node Knockout 2013

A few weeks ago I participated in Node Knockout 2013 (NKO4), a 48-hour hackathon with 385 teams competing for the top spot in 7 categories (team, solo, innovation, design, utility/fun, completeness, and popularity).

And here’s a video of what I hacked: Roombox, a Roomba vacuum cleaner turned into a boombox using node.js. This demo shows the Roomba playing the Rocky theme, the Beverly Hills Cop theme, Hey Jude (The Beatles), Scar Tissue (Red Hot Chili Peppers), the Super Mario Bros. theme, and the Airwolf theme.


Note: I put the wrong year for The Beatles’ Hey Jude in the video. I wanted to fix it, but it was already 1 am back then and I had to go to work in the morning. Sorry Beatles fans!

The result? Roombox finished 9th in the innovation category, and 14th in the solo category. Not bad for an idea that I improvised on the D-day itself. If there were a solo innovation category, Roombox would’ve finished 1st on that nonexistent leaderboard :).

Comments from some judges and fellow contestants:

Cool hack! I’m also amused by the rickroll fail :)

Hah now I need to get a Roomba. Great hardware project / hack.

This got innovation points for me as it never would have occurred to me to do this. Made me laugh and share with others.

Most out-of-the-world idea on NKO :D

Completely useless but very innovative!

I would have given you 5 stars on innovation, but I once heard a hard drive play Darth Vader’s theme song so there is a precedent.

How does Roombox work? To put it simply, Roombox parses abc notation sheets, maps the music notes to fit the Roomba’s note range, splits each song into 4 segments where each segment is registered to a Roomba slot, then finally instructs the Roomba to play the song. Most of the development effort was spent on finding a suitable music format, and on testing the music sheets because in reality only a few songs sound decent on a vacuum cleaner.
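
The note mapping and segment splitting can be sketched roughly as follows. This is not Roombox’s code (which is node.js); it is a Python illustration that assumes notes are MIDI-style numbers, that the Roomba accepts note numbers 31 to 127, and that each slot holds up to 16 notes. Those limits and all the names below are assumptions made for the example.

```python
ROOMBA_MIN_NOTE = 31   # assumed lowest playable note number
ROOMBA_MAX_NOTE = 127  # assumed highest playable note number
NOTES_PER_SLOT = 16    # assumed capacity of one song slot
MAX_SLOTS = 4          # the post splits each song into 4 segments


def fit_to_range(note):
    """Shift a note by whole octaves until it falls inside the assumed range."""
    while note < ROOMBA_MIN_NOTE:
        note += 12
    while note > ROOMBA_MAX_NOTE:
        note -= 12
    return note


def to_slots(notes):
    """Fit every note into range, then split the song into per-slot segments."""
    fitted = [fit_to_range(n) for n in notes]
    slots = [fitted[i:i + NOTES_PER_SLOT]
             for i in range(0, len(fitted), NOTES_PER_SLOT)]
    # Notes that do not fit in the available slots are dropped in this sketch.
    return slots[:MAX_SLOTS]
```

Shifting by octaves keeps the melody recognisable, which is one simple way to make a tune fit a limited range.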

Here’s a sketch I scribbled after deciding on how I would hack Roombox:

Huge thanks to Mike Harsch for writing mharsch/node-roomba, and Sergi Mansilla for writing sergi/abcnode. And an apology to my wife and brother for suffering through the weekend listening to dozens of horrible songs being tested :p.

Update (08/12/2013):

DBrain told me about DJ Roomba from Parks and Recreation. If iRobot ever upgraded Roomba’s sound system, Roombox code would be totally useful to achieve ‘music player on a moving vacuum cleaner’ a la DJ Roomba.

NodeUp 53: NodeUp Listeners On NodeUp

About a month ago, I joined D-Shaw, Nizar Khalife, Erik Isaksen, and Matt Creager on NodeUp 53 where we discussed the NodeUp podcast and the node.js community from the NodeUp listeners’ point of view, and I also talked a bit about Australia, kangaroos, and node. Thanks to Rod Vagg for pinging me about this particular episode.

Recording the show itself was an interesting experience :). For one, it started at 4am Melbourne EST. I totally missed the two alarms I set up, and was finally woken by my mobile’s push notification alert from dshaw’s tweet telling me to accept the Skype invitation about two minutes before 4. Ran down the stairs, head spun a bit for the first hour lol.

Here’s the transcription of NodeUp 53 thanks to Noah Collins. I made a mistake where I thought I said that Flickr Photo migrated to node.js as davglass tweeted, but I actually said Facebook Photo on the show. It should be Flickr Photo. My bad, I’m sorry folks.

An Old Dryer, A Watts Clever, and A Ninja Blocks

This was another quick weekend hack to fix my old dryer’s busted timer problem (busted timer = having to stay around when it’s time to switch off the dryer).

Step one was to use Watts Clever Easy-off Remote Control Socket which allowed me to switch the power on and off remotely. This product comes with a remote control which saved me from having to get out of the house to get to the garage during winter. But that’s not all…

Step two was to program the socket on a Ninja Blocks, which gave remote control ability via the web. This allowed me to turn off the dryer all the way from my office.

Step three was to write a node.js script that talks to Ninja Blocks which in turn switches the power socket on and off. This script was then executed from a scheduled Jenkins job.

Voila, the old dryer had a new timer, albeit a long-winded one :p.

Monitor Jenkins From The Terminal

Here’s how I’ve been monitoring my Jenkins setup…

A combination of Nestor + watch + Terminator » one view for monitoring failing builds, one view for executor status, and one view for the job queue. A summary of Jenkins status info on a small piece of screen real estate that I can place at the corner of my workspace.

If you want to set up something similar, here are the commands: (assume JENKINS_URL is already set)

  • watch -c "nestor dashboard | grep FAIL"
  • watch nestor executor
  • watch nestor queue

DataGen Workers Optimisation

I released DataGen v0.0.9 during lunch break yesterday. This version includes support for limiting how many workers can run concurrently, which is something that I’ve wanted to add since day one. I finally got the time to do it last weekend, and it turned out to be an easy task thanks to Rod Vagg’s worker-farm module.

Why is this necessary?

The problem with previous versions of DataGen was that when you want to generate 20 data files, then 20 worker processes will be created and run concurrently. It’s obviously not a great idea to have 20 processes fighting over 2 CPUs.

With v0.0.9, you can specify this limit using the new -m/--max-concurrent-workers flag: (if unspecified, it will default to the number of CPUs)

datagen gen -w 20 -m 2

When I first wrote about DataGen last year, I mentioned that I still needed to run some tests to verify my assumption about the optimal number of workers. So here it is one year later…

The first test is on a Linux box with 8 cores, where each data file contains 500,000 segments, each segment contains a segment ID, 6 strings, and 3 dates.

The second test is on an OSX box with 2 cores, where each data file contains 500,000 segments, but this time each segment only contains a segment ID.

As you can see, the performance is almost always best when the number of concurrently running worker processes is limited to the number of available CPUs (8 max concurrent workers on the first chart, and 2 on the second chart).

When you specify 20 workers and your laptop only has 2 CPUs, only 2 workers will generate the data file concurrently at any time, and you can be sure that it will be faster than having 20 workers generating 20 data files at the same time. And that’s why DataGen’s default setting allows as many concurrent workers as the available CPUs.
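
The idea of capping concurrency can be sketched as below. DataGen itself does this with worker processes via worker-farm; this Python illustration uses a thread pool only to keep the example short, and run_workers and PeakCounter are made-up names, not part of DataGen.

```python
import os
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_workers(num_workers, task, max_concurrent=None):
    """Run num_workers tasks, with at most max_concurrent running at once."""
    if max_concurrent is None:
        max_concurrent = os.cpu_count() or 1  # default to the number of CPUs
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(task, range(num_workers)))


class PeakCounter(object):
    """Example task that records how many copies ran at the same time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.running = 0
        self.peak = 0

    def __call__(self, worker_id):
        with self._lock:
            self.running += 1
            self.peak = max(self.peak, self.running)
        time.sleep(0.01)  # stand-in for generating one data file
        with self._lock:
            self.running -= 1
        return worker_id
```

Calling run_workers(20, PeakCounter(), max_concurrent=2) queues all 20 tasks, but the counter’s peak never exceeds 2, which mirrors what -w 20 -m 2 asks for.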

Jenkins Build Slaves On A Budget

About half a year ago our team started working on a project with a micro-service architecture, which meant we had a lot of little applications to build as part of our delivery pipeline. One of the reasons why we opted for this architecture was to gain the ability to replace a single component without having to rebuild the whole system, hence enabling a faster feedback loop by releasing small chunks of changes in small parts of the system.

But there was one problem. Each application build was CPU-intensive: it included fetching source, installing dependencies, unit testing, code coverage, integration testing, acceptance testing, packaging, publishing to repositories, and deploying to target environments. And nope, we didn’t have a build farm!

Our goal was to have each application build finish in less than a minute, which was fine when there was only one build, but failed horribly when there were lots of changes triggering multiple downstream builds, often up to 15 builds at the same time with only 4 executors (4 CPUs) available on the build master. Scaling up wasn’t going to take us far, so we had to scale out and distribute the build system earlier than we normally would with past monolithic stonehenge projects.

We considered the idea of using the cloud, either an existing cloud CI solution or Amazon EC2, but we had to rule this out at the end due to extra cost, source code restriction, and network latency. One developer then suggested the idea of using the developer machines as build slaves, each one having 8 CPUs, SSD, lots of RAM and disk space, plenty of firepower lying around under-utilised most of the time.

So we gave it a go and it worked out really well. We ended up with an additional 7×4 = 28 build executors, and it’s not unusual to have those 15 applications built at the same time and finished within a minute. Here’s our setup:

  • Each build slave has to self-register to the master because developer machines only have dynamic IP addresses, so they can’t be pre-configured on build master.
    This is where Jenkins Swarm Plugin comes in handy, allowing each slave to join the master, and the master doesn’t need to know any of the slaves beforehand.
  • Each build slave has to re-register when the machine is rebooted; we use upstart to do this.
  • Each build slave runs as its own user, separate from the developer’s user account. This allows a clean separation between user workspaces.
  • Each build slave is provisioned using Ansible, always handy when there are more build slaves to add in the future, or to update multiple build slaves in one go.
  • Each build slave is allocated 50% of the available CPUs on the machine to reduce any possibility of interrupting developer’s work.
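
The executor count follows directly from the numbers above: 8 CPUs per developer machine, half of them given to the build slave, across 7 machines. A tiny sketch of that arithmetic, where executors_for is a made-up helper name:

```python
def executors_for(cpus, share=0.5):
    """Executors to allocate on one machine, never fewer than one."""
    return max(1, int(cpus * share))


machines = 7                      # from the 7×4 figure above
per_machine = executors_for(8)    # 8 CPUs at 50% gives 4 executors
total = machines * per_machine    # 28 executors in total
```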

So there, build slaves on a budget :).

I think it’s easy to overlook the fact that developer/tester machines are often under-utilised, and that they can serve an additional purpose as Jenkins build slaves, a reasonable alternative before looking at other costlier solutions.



I attended CITCON 2013 in Sydney last February. This year’s sessions covered more non-technical issues compared to CITCON 2010. Two of the more interesting topics for me were on how devops movement could potentially discourage collaboration, and on how large non-tech companies try and still fail to implement continuous delivery.

Those were some of the problems that I’ve been battling for many years. In an organisation where dev and ops are two separate divisions, devops is often a shortcut for dev to do ops tasks while bypassing any ops involvement. Instead, a better alternative would be for dev and ops teams to collaborate and stop fighting over issues like root access.

As for the second topic, continuous delivery is sometimes not as straightforward as it seems. One major obstacle to continuous delivery implementation is a conservative change management process. No matter how you automate your delivery pipeline along your development, test, and staging environments, it would all be useless if production deployment requires manual approval for the sake of auditing.

Technology is often the easier part, the harder part is on people and policies, on changing a culture, on accepting new ideas.

The best part of CITCON has always been its open space format where ideas/opinions/experiences flow during the discussions. And like most tech conferences, the hallway discussions were not to be missed. The quote-of-the-conf went to Jeffrey Fredrick for pointing out that (my interpretation of what he said) technologists often suck for focusing on the technology to sell to the business (e.g. continuous delivery is awesome), instead of focusing on the business problem and how the technology can solve it (e.g. business problem is time to market, continuous delivery can help).

I also caught up with Michael Neale from CloudBees there (here are his CITCON 2013 notes), along with some familiar faces from CITCON 2010.

Introducing Repoman

Q: How do you clone 30 repositories from your personal GitHub accounts and 150 repositories from your organisation GitHub accounts in just one line?

A: repoman --github-user myuser1,myuser2 --github-org myorg1,myorg2 config && repoman init

Q: How do you execute a set of commands against all repositories in just one line?

A: repoman exec 'git stash && git pull --rebase && git stash apply'

I wrote Repoman back in 2011 and I’ve been using it ever since. It was my solution to resolve the annoyances involved with working on multiple machines, multiple OSes, multiple SCMs, and multiple repositories that depend on each other.

Repoman works against a list of repositories listed in .repoman.json file. You can use repoman config to generate a sample file, or add --github-user / --github-org flags to generate a list of GitHub repositories. This .repoman.json file can be placed in either the user home directory or the current directory (your workspace). The rest of Repoman commands like init, get, exec, etc, can then be run from that workspace directory.

Problem: switching between multiple laptop and desktop machines.

After working with multiple machines for a while, I ended up with some repositories existing on only some of the machines, never on all of them. And when I had to use a different machine, then I had to manually clone the repositories that don’t yet exist on that machine. One by one.

With Repoman, I only needed to maintain a .repoman.json file containing all repositories that I worked on, store it in a remote repository, and clone it over to all machines. From then on, I could simply run repoman init to clone all repositories and repoman get to make sure I had the latest code of all repositories on each machine.

Problem: identifying unfinished changes.

Sometimes I code on the train, on the way to and from work. The thing about coding on the train is that often I had to stop not when I finished a piece of change, but when I arrived at my destination. This resulted in unfinished changes across several repositories on the machine that I used at the time, and I often forgot about those changes until the next time I worked on those repositories again.

With Repoman, I built a habit of running repoman changes to identify unfinished changes before working on anything else.

Problem: working with Git and Subversion repositories.

I had some repositories hosted on GitHub, Gitorious, Bitbucket, and Google Code. This of course meant that I had to switch between Git and Subversion commands.

With Repoman, I only needed to run its simple commands repoman init | get | changes | save | undo, which covers the majority of my coding activities (note: Repoman does not aim to cover all Git and Subversion commands). Those commands are mapped to its Git or Subversion equivalent accordingly.
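
The mapping idea can be pictured as a lookup table from a generic action to an SCM-specific command. The sketch below is in Python rather than Repoman’s own node.js, and the table entries are hypothetical examples chosen for illustration; the commands Repoman actually runs may differ.

```python
# Hypothetical mapping for illustration only; not Repoman's real command table.
COMMANDS = {
    "git": {
        "init": "git clone {url} {name}",
        "get": "git pull --rebase",
        "changes": "git status --short",
    },
    "svn": {
        "init": "svn checkout {url} {name}",
        "get": "svn update",
        "changes": "svn status",
    },
}


def resolve(scm, action, **params):
    """Return the SCM-specific command line for a generic action."""
    try:
        template = COMMANDS[scm][action]
    except KeyError:
        raise ValueError("unsupported: {0} {1}".format(scm, action))
    return template.format(**params)
```

With a table like this, the caller only deals with init, get, or changes, and the repository type decides which underlying command is used.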

Problem: executing a custom command on all repositories.

This used to annoy me so much. I had a number of repositories and from time to time I had to add the same file to all of them, let’s say a .travis.yml file or a .gitignore file.

With Repoman, I just needed to create the file once at /tmp/file, then run repoman exec 'cp /tmp/file . && git add file && git commit -m "Add file" && git pull --rebase && git push'. Voila, all repositories had the new file.
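
Conceptually, an exec-style command boils down to running the same shell command inside each repository directory. A minimal Python sketch of that loop, assuming each repository lives in a subdirectory of the workspace named after it (exec_all is a made-up name, not Repoman’s implementation):

```python
import os
import subprocess


def exec_all(workspace, repo_names, command):
    """Run a shell command in each repository directory; return exit codes."""
    results = {}
    for name in repo_names:
        repo_dir = os.path.join(workspace, name)
        # shell=True so that chained commands using && work as typed.
        results[name] = subprocess.call(command, shell=True, cwd=repo_dir)
    return results
```

Returning the exit code per repository makes it easy to spot which ones failed partway through a chained command.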

Problem: grouping repositories by project.

I often had to switch between projects, where each project consisted of several repositories. When I worked on a particular project, I would like to update its repositories to the latest. Ditto when I moved to the next project.

With Repoman, I created a config file for each project, e.g. .project1.json and .project2.json. Then I symlinked .repoman.json to the project I was working on. Or if I often needed to switch between the projects, I would use Repoman with a custom config file: repoman -c .project1.json get.

Check out the README on GitHub for more usage examples, and npm install -g repoman away!