Refactoring: You Keep Using that Word, I Do Not Think It Means What You Think It Means

“Refactoring,” as a term, is rapidly becoming meaningless. The final nail in the coffin, for me, was this morning when I heard a sales guy say:

I need to refactor the pipeline in Salesforce—it’s got a bit messy and needs tidying up.

When a term has become so generic that sales guys start (mis-)using it, it’s lost all value.

He’s not alone. I’ve lost count of how many times I’ve heard software engineers who should know better say “refactoring” when what they actually mean is “changing.”

Refactoring (actual refactoring, not redesigning, reimplementing, debugging, …) is one of the biggest advances in software development over the last couple of decades. It is the process of improving the design of existing code without changing its behavior. It relies on two key insights:

  • Modifying existing code can be carried out safely only with the safety net of a comprehensive suite of tests.
  • We should never attempt to refactor the code at the same time as modifying its behavior.

In other words, you can modify the behavior of the code, or you can refactor it. You should never attempt to do both at the same time.

Upon reflection, it’s easy to see why this is the case. Imagine that you attempt to modify both the structure of your code and its functionality at the same time, and after doing so one of your tests fails. This might indicate that you made a mistake when modifying its structure. Or it might be an expected result of the change in functionality. It’s difficult, however, to be sure which. The more complicated the change in functionality or structure, the harder it is to be certain.

By doing only one or the other, you avoid this issue entirely and can forge ahead with potentially far-reaching refactorings involving dramatic changes to the code with confidence.
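As a small illustration, here is a sketch of a pure refactoring. Everything in it (`Item`, `totalBefore`, `totalAfter`, the 10% discount rule) is hypothetical and invented for the example. The "after" version extracts and names two concerns without changing what is computed, so any test that passed before should still pass afterwards.

```scala
// Hypothetical example: all names and the discount rule are illustrative only.
object RefactoringSketch {
  final case class Item(price: BigDecimal, quantity: Int)

  // Before: one function that mixes summing and discounting.
  def totalBefore(items: List[Item]): BigDecimal = {
    var sum = BigDecimal(0)
    for (item <- items) sum += item.price * item.quantity
    if (sum > 100) sum * BigDecimal("0.9") else sum
  }

  // After: identical behaviour, with the two concerns extracted and named.
  def subtotal(items: List[Item]): BigDecimal =
    items.map(item => item.price * item.quantity).sum

  def applyDiscount(amount: BigDecimal): BigDecimal =
    if (amount > 100) amount * BigDecimal("0.9") else amount

  def totalAfter(items: List[Item]): BigDecimal =
    applyDiscount(subtotal(items))
}
```

If a test comparing the two versions fails, the only possible culprit is the structural change, which is exactly the property that mixing in a behaviour change would destroy.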

I fear that it’s too late; “refactoring” is used incorrectly so often that it’s probably beyond rescue. But I hope not.

Travis, Docker and Elastic Beanstalk

Elastic Beanstalk now supports Docker, which is a wonderful thing. I’ve just spent an unreasonably long time trying to work out how to get this working nicely with a CI system running on Travis, though, and thought I’d save others the same experience.

My first thought was that I should get Travis to build a Docker image, but this is a non-starter because Travis runs on Ubuntu 12.04 LTS, which requires a kernel upgrade to be able to run Docker. Even if this did work, it wouldn’t be a great solution—downloading a base Ubuntu image after every build would be dreadfully wasteful.

The “aha” moment came when I realised that you don’t need to create a Docker image to deploy to Elastic Beanstalk. Instead, you can have Elastic Beanstalk build the image for you. So all Travis needs to do is create a Zip archive containing a Dockerfile and any build artefacts it references. This archive can be deployed to, say, S3 and Elastic Beanstalk takes things from there.
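The archive itself is nothing special. Below is a minimal sketch of building one, assuming the Dockerfile should sit at the root of the archive. `BundleSketch` and `createBundle` are hypothetical names, and in a real Travis build a plain `zip` command would do the same job; this only shows what goes into the bundle.

```scala
import java.nio.file.{Files, Path}
import java.util.zip.{ZipEntry, ZipOutputStream}

// Hypothetical helper, not part of any Travis or Elastic Beanstalk tooling.
object BundleSketch {
  // Writes each (entryName, file) pair into a Zip archive at `target`.
  // Assumption: the Dockerfile is passed with the entry name "Dockerfile"
  // so that it ends up at the root of the archive.
  def createBundle(target: Path, entries: Seq[(String, Path)]): Path = {
    val zip = new ZipOutputStream(Files.newOutputStream(target))
    try {
      for ((name, file) <- entries) {
        zip.putNextEntry(new ZipEntry(name))
        Files.copy(file, zip) // stream the file's bytes into the current entry
        zip.closeEntry()
      }
    } finally zip.close()
    target
  }
}
```

Uploading the resulting file to S3 and creating an application version from it is left to whatever deployment tooling you already use.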

Very slick.

Pronking for programmers

If you watch wildlife documentaries, you’ve probably seen pronking, or stotting: the strange four-legged jump that gazelles do.

Pronking is Afrikaans for “showing off,” and that’s certainly part of what’s going on, but it’s not quite as simple as that. Gazelles don’t only pronk to show off—they also pronk when they’re being chased by a predator. I always assumed that this was about evading the predator by being unpredictable, but it turns out that there’s a rather surprising alternative explanation.

Pronking slows the animal down and uses up energy that (you might think) would be better spent running away. But it works—animals that pronk are less likely to be caught than those that don’t. So what’s going on?

The theory is that pronking is a way for a gazelle to demonstrate that it’s fit. “Don’t bother chasing me—I’ll be too difficult to catch. Chase my cousin over there, the one with the limp—he’ll be much easier to catch.”

Pronking works because it’s expensive. If every gazelle could pronk, predators would pay no attention. But only fit gazelles can, so it’s an honest signal.

So what does this have to do with programmers? Too many pizzas and donuts mean that quite a few of us won’t be able to literally pronk, but that doesn’t stop us from doing it in our own way…

I think that we pronk when we contribute to open source.

There are lots of reasons to contribute to open source. It’s a great way to improve the tools we all rely on, give back to the community, and learn new skills. But, let’s be honest, it’s also a great way to show off and demonstrate our fitness.

And it works—when I find myself sifting through a pile of CVs, evidence of significant open source contributions is virtually guaranteed to make sure a CV doesn’t find itself on the “rejects” pile. And like pronking, it works because it’s expensive—the time and effort required is concrete evidence that the candidate cares about their craft.

So if you can pronk, perhaps you should?

ScalaMock Status Report

Apart from a little work to make it compatible with the most recent versions of Scala and ScalaTest (thanks to Chua Chee Seng, Duncan Crawford and Chris Birchall), ScalaMock has been moribund over the last 12 months. This is partly because my focus has been on writing Seven Concurrency Models in Seven Weeks, but mostly because of insurmountable issues with Scala’s current macro support.

There is good news on both fronts, however. First, I recently finished the book, so I should now have more time to devote to ScalaMock. And second, Eugene has announced Palladium, which looks like it should provide everything that ScalaMock needs.

So where is ScalaMock now? And where is it (hopefully) going?

Where is ScalaMock today?

ScalaMock 3 supports two types of mock, proxy mocks and macro mocks.

Proxy mocks are not type checked and have a slightly less convenient syntax than macro mocks, but they work with any trait and are fully supported.

Macro mocks are type checked and have a nicer syntax than proxy mocks. They work well with most traits, but fail for traits with “complex” type signatures.

The good news is that macro mocks either work or give a compile error (there should be no situations where a macro mock compiles but gives “odd” results). And if you have a trait for which macro mocks don’t work, you can always use a proxy mock instead.

Where is ScalaMock going?

As soon as Palladium is available, I plan to start working on ScalaMock 4. If Palladium delivers on its promise, ScalaMock 4 should be able to mock any trait, no matter how complex its type. In addition, I expect that it will also support:

  • Improved syntax:


    instead of:

    (mockObject.method _) expects (arguments)
  • Mocking object creation (constructors)
  • Mocking singleton and companion objects (static methods)
  • Mocking final classes and classes with final methods or private constructors

An Open Letter to The Gregory James Group

Dear employees of The Gregory James Group,

Please forgive this indirect and public means of communicating with you, but it appears that all other means of doing so are ineffective.

I have repeatedly asked for you to remove me from your contact database and stop sending me unsolicited e-mails or making unsolicited telephone calls. I would prefer not to expose your flagrant contempt for others’ time publicly, but this is a final throw of the dice in an attempt to get you to stop spamming me.

Below is a representative sample of the e-mails I’ve sent you over the years (yes, years). Sadly, I don’t have transcripts of the telephone conversations, but they are similar in nature and number to the e-mails.

From: Paul Butcher
Subject: Re: Graduate Android Developer (MSc. Software Engineering) – Available Immediately
Date: 10 September 2013 12:23:13 BST
To: “Charley Lucas” <>

See below a selection of e-mails that I’ve sent to various members of your company over the last several years. What, exactly, does it take to get off your e-mail list?

From: Paul Butcher
Subject: Re: Senior C# ASP.NET MVC3 Programmer
Date: 13 August 2013 17:33:55 BST
To: “Roxanne Gordon” <>
Cc: Tom Townsend <>

I have no idea why you think I may be looking for a developer of this type. I most certainly am not, and it’s completely inappropriate for the kind of things that we do here.

I recently had a long exchange with your colleague Tom Townsend (copied) to try to stop him from spamming me with unwanted and inappropriate e-mails. I finally managed to persuade him to stop doing so and remove me from your company’s mailing list. I don’t know how I have ended back on it, but please remove me immediately and do whatever you need to do to ensure that I never find my way onto it again.

I don’t know what it is about Gregory James that makes you think that it’s OK to waste people’s time, but I really wish you would stop.

From: Paul Butcher
Subject: Re: Mobile Application Development
Date: 28 January 2013 10:02:23 GMT
To: “Tom Townsend” <>

I fail to understand how my earlier request that you please not mail me again was open to misinterpretation. I would be grateful if you could accede to my very clearly expressed wishes.

From: Paul Butcher
Subject: Re: iOS Development
Date: 15 January 2013 15:41:08 GMT
To: “Tom Townsend” <>

No. Please do not mail me again

From: Paul Butcher
Subject: Re: iOS Development
Date: 15 January 2013 15:37:03 GMT
To: “Tom Townsend” <>

Please remove me from your mailing list.

From: Paul Butcher
Subject: Re: Ruby on Rails Contract London £350 – £450 Per Day D.O.E
Date: 21 January 2011 11:47:06 GMT
To: “Christopher Barlow” <>

I have no idea how I got onto your mailing list. Please remove me from it immediately and confirm that you have done so by return.

From: Paul Butcher
Subject: Inconsiderate use of e-mail
Date: 1 October 2010 14:12:00 BST
Cc: David Christian <>

Please forgive the wide distribution of this e-mail, but unfortunately your colleague David Christian apparently does not believe in treating people with respect, so I am left with no option.

Somehow, I have ended up on David’s indiscriminate database of people who might be interested in web development contract work. I am not interested in this kind of work, have never asked to be on his database and have no clue who he, or you, are. I have asked him nicely to remove me from his database and he hasn’t had the courtesy to respond or to honour my polite request. I am sure that I don’t need to tell you how this kind of behaviour reflects on your organisation as a whole.

As it happens, I am a software development manager who regularly makes hires through recruitment consultants. This experience guarantees that I will be doing no business with The Gregory James Group.

I would be grateful if you could persuade David not to send me any further unsolicited messages.

From: Paul Butcher
Subject: Re: Contract RoR/PhP Work Remotely
Date: 30 September 2010 13:25:03 BST
To: David Christian <>

I am not looking for work, and have never asked to be on your database. Please remove me from your database, send me an e-mail confirming that have done so, and then send me no further communications.

Folding over a lazy sequence with Clojure reducers

Clojure 1.5 includes reducers. They’re currently experimental, but showing great promise.

One of reducers’ features is that they allow parallel fold operations. Sadly, although fold works when given a lazy sequence, it falls back to a sequential reduce.

There’s no theoretical reason why fold couldn’t work in parallel with a lazy sequence, so I decided to see if I could implement it. It turns out to be very easy. I’ve implemented a function called foldable-seq that takes a lazy sequence and turns it into something that can be folded in parallel. I’ve checked an example program that uses this to count words in a Wikipedia XML dump into GitHub. The code for foldable-seq is here.

On my 4-core MacBook Pro, the example program runs in approximately 40 seconds without foldable-seq and 13 seconds with it.
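To show the general idea in code, here is a rough sketch in Scala rather than Clojure. It is not the foldable-seq implementation, and `ChunkedFoldSketch`, `parallelFold` and `chunkSize` are hypothetical names. It pulls fixed-size chunks off a lazily produced iterator, reduces each chunk on a thread pool, and combines the partial results. One known limitation: it does not bound how far ahead of the workers the sequence is read.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

// Hypothetical sketch of a chunked parallel fold; not the foldable-seq code.
object ChunkedFoldSketch {
  // `reduce` folds one element into a partial result;
  // `combine` merges two partial results; `zero` is the identity for both.
  def parallelFold[A, B](items: Iterator[A], chunkSize: Int)(zero: B)(
      reduce: (B, A) => B,
      combine: (B, B) => B): B = {
    // Each chunk is handed to the thread pool as soon as it has been read.
    val partials = items.grouped(chunkSize).map { chunk =>
      Future(chunk.foldLeft(zero)(reduce))
    }.toList
    Await.result(Future.sequence(partials), Duration.Inf).foldLeft(zero)(combine)
  }

  // Usage: counting words from a lazily produced stream of lines.
  def wordCounts(lines: Iterator[String], chunkSize: Int): Map[String, Int] =
    parallelFold(lines.flatMap(_.split(" ").toList), chunkSize)(Map.empty[String, Int])(
      (counts, word) => counts.updated(word, counts.getOrElse(word, 0) + 1),
      (left, right) => right.foldLeft(left) { case (acc, (word, n)) =>
        acc.updated(word, acc.getOrElse(word, 0) + n)
      })
}
```

The split into a reducing function and a combining function mirrors the shape of a reducers fold: the chunk size controls the granularity of the parallel work.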

Benchmarking Producer/Consumer in Akka, Part 2

Recently, I published some benchmarks of Producer/Consumer in Akka. This post is a followup with more detail.

I have modified my test program (which is available on GitHub) to implement three different basic algorithms:

  • Producer pushes to a bounded queue (i.e. producer blocks when the queue is full)
  • Producer pushes to an unbounded queue, plus flow control to ensure that the queue doesn’t grow too large
  • Consumer pulls

For each of these, I then benchmarked different variants as follows:

push_bal: Producer pushes to a bounded queue, with a BalancingDispatcher

push_rr: Producer pushes to a bounded queue, with a RoundRobinRouter

push_sm: Producer pushes to a bounded queue, with a SmallestMailboxRouter

flow_bal: Producer pushes to an unbounded queue, with a BalancingDispatcher

flow_rr: Producer pushes to an unbounded queue, with a RoundRobinRouter

flow_sm: Producer pushes to an unbounded queue, with a SmallestMailboxRouter

pull_1: Consumer pulls, with a batch size of 1

pull_10: Consumer pulls, with a batch size of 10

pull_20: Consumer pulls, with a batch size of 20

pull_50: Consumer pulls, with a batch size of 50

pull_cached: Consumer pulls, with an intermediate cache

Here is a spreadsheet of the results (including graphs) that I get on my MacBook Pro (Core i7, 4 cores, 2 hyperthreads per core). Each test is run multiple times and the results averaged.

Updated results, incorporating suggestions from Viktor, can be downloaded here.


Note: These conclusions are different from the initial version of this post, following suggestions from Viktor.

The headline is that the difference between the approaches is so small as to be almost irrelevant:

  • For the push to a bounded queue version, BalancingDispatcher clearly outperforms both RoundRobinRouter and SmallestMailboxRouter, especially as the number of workers increases. Strangely, the same difference is not present for the unbounded queue version.
  • The pull approach can be made to perform virtually identically to the push model by batching queries.

Benchmarking Producer/Consumer in Akka

Further to this conversation on the Akka mailing list, I decided to benchmark various different approaches to implementing the producer/consumer pattern.

I wanted to choose a “real” problem, so I decided to count the words on the first 100,000 pages of Wikipedia. The producer parses the Wiki XML dump and the words are counted page-by-page by a pool of consumers.

I implemented three different approaches – producer pushes to a bounded queue, producer pushes to an unbounded queue together with a flow control protocol, and consumer pulls.
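For readers unfamiliar with the first approach, here is a minimal sketch of "producer pushes to a bounded queue" using plain threads and `java.util.concurrent` rather than Akka. It is not the benchmarked implementation; `BoundedQueueSketch`, `countWords` and the `None` end marker are hypothetical choices made for the example.

```scala
import java.util.concurrent.ArrayBlockingQueue
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical sketch using plain threads; the benchmarks use Akka actors.
object BoundedQueueSketch {
  // One producer (the calling thread) and `consumers` worker threads share a
  // queue holding at most `capacity` pages. Returns the total word count.
  def countWords(pages: Seq[String], consumers: Int, capacity: Int): Int = {
    val queue = new ArrayBlockingQueue[Option[String]](capacity)
    val total = new AtomicInteger(0)

    val workers = (1 to consumers).map { _ =>
      val worker = new Thread(new Runnable {
        def run(): Unit = {
          var done = false
          while (!done) {
            queue.take() match { // blocks while the queue is empty
              case Some(page) => total.addAndGet(page.split("\\s+").count(_.nonEmpty))
              case None       => done = true // end-of-input marker
            }
          }
        }
      })
      worker.start()
      worker
    }

    pages.foreach(page => queue.put(Some(page)))   // blocks while the queue is full
    (1 to consumers).foreach(_ => queue.put(None)) // one end marker per consumer
    workers.foreach(_.join())
    total.get
  }
}
```

The blocking `put` is what provides back-pressure here: a fast producer simply stalls until a consumer frees a slot.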

The source code for the different implementations is here, and the results are at the bottom of this message.

Some observations:

  1. I only timed to the nearest second as I see an approx. 3 second variation from run to run with identical parameters. I’m not sure why I see such a large variation—suggestions welcome.
  2. There’s basically no difference between the two “producer pushes” implementations. The “consumer pulls” implementation is much slower, however.
  3. I tried both Dispatcher and BalancingDispatcher, and both RoundRobinRouter and SmallestMailboxRouter, in the producer pushes implementations; the differences were too small to measure.

I’m surprised that there is so much difference between producer pushes and consumer pulls. It’s quite possible that I’ve done something stupid in the implementation—I’d be very grateful for a pointer to what it is.

Here are the results (all on my i7 MacBook Pro—4 cores, 2 hyperthreads per core).

            Bounded            Unbounded          Pull
Consumers   Seconds  Speedup   Seconds  Speedup   Seconds  Speedup
1           100      1.00      103      0.97      156      0.64
2           55       1.82      56       1.79      88       1.14
3           43       2.33      40       2.50      65       1.54
4           39       2.56      38       2.63      55       1.82
5           37       2.70      37       2.70      51       1.96
6           34       2.94      33       3.03      44       2.27
7           33       3.03      33       3.03      47       2.13
8           33       3.03      35       2.86      46       2.17
9           33       3.03      32       3.13      46       2.17
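The speedup figures appear to be computed against a single baseline, the 100 seconds taken by the bounded queue version with one consumer (which is why the one-consumer unbounded and pull runs show speedups below 1). A minimal sketch of that calculation, with `SpeedupSketch` and `speedup` as hypothetical names:

```scala
// Hypothetical helper showing how the Speedup columns can be derived.
object SpeedupSketch {
  // Speedup of a run relative to a baseline run (both times in seconds).
  def speedup(baselineSeconds: Double, seconds: Double): Double =
    baselineSeconds / seconds
}
```

For example, `SpeedupSketch.speedup(100, 55)` is roughly 1.82, matching the bounded queue entry for two consumers.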

ScalaMock3 step-by-step

This post describes ScalaMock 3, which supports Scala 2.10 only. For ScalaMock 2, which supports earlier Scala versions, go here.

This post describes how to set up a project that uses ScalaMock in conjunction with ScalaTest and sbt. The sample code described in this article is available on GitHub.

The example assumes that we’re writing code to control a mechanical turtle, similar to that used by Logo programs. Mocking is useful in this kind of situation because we might want to create tests that function even if we don’t have the hardware to hand, which run more quickly than would be the case if we ran on real hardware, and where we can use mocks to simulate errors or other situations difficult to reproduce on demand.

  1. Create a root directory for your project:
    $ mkdir myproject

    Then create build.sbt in that directory, containing:

    organization := "com.example"

    version := "1.0"

    scalaVersion := "2.10.0"

    scalacOptions ++= Seq("-deprecation", "-unchecked")

    libraryDependencies +=
      "org.scalamock" %% "scalamock-scalatest-support" % "3.0" % "test"
  2. Now we’ve got a project, we need some code to test. Let’s start with a simple trait representing a turtle. Create src/main/scala/Turtle.scala containing:
    package com.example

    trait Turtle {
      def penDown(): Unit
      def penUp(): Unit
      def forward(distance: Double): Unit
      def turn(angle: Double): Unit
      def getPosition: (Double, Double)
      def getAngle: Double
    }
  3. The turtle API is not very convenient: we have no way to move to a specific position. Instead, we need to work out how to get from where we are now to where we want to be by calculating angles and distances. Here’s some code that draws a line from one point to another by doing exactly that. Create src/main/scala/Controller.scala containing:
    package com.example

    import scala.math.{atan2, sqrt}

    class Controller(turtle: Turtle) {
      // Move (pen up) to the start point, then draw to the end point.
      def drawLine(start: (Double, Double), end: (Double, Double)): Unit = {
        moveTo(start)
        val initialAngle = turtle.getAngle
        val deltaPos = delta(start, end)
        turtle.turn(angle(deltaPos) - initialAngle)
        turtle.penDown()
        turtle.forward(distance(deltaPos))
      }

      def delta(pos1: (Double, Double), pos2: (Double, Double)) =
        (pos2._1 - pos1._1, pos2._2 - pos1._2)

      def distance(delta: (Double, Double)) =
        sqrt(delta._1 * delta._1 + delta._2 * delta._2)

      def angle(delta: (Double, Double)) =
        atan2(delta._2, delta._1)

      // Lift the pen, turn towards the target position and move there.
      def moveTo(pos: (Double, Double)): Unit = {
        val initialPos = turtle.getPosition
        val initialAngle = turtle.getAngle
        val deltaPos = delta(initialPos, pos)
        turtle.penUp()
        turtle.turn(angle(deltaPos) - initialAngle)
        turtle.forward(distance(deltaPos))
      }
    }
  4. We can now write a test. We’ll create a mock turtle that pretends to start at the origin (0, 0) and verifies that if we draw a line from (1, 1) to (2, 1) it performs the correct sequence of turns and movements.

    Turtle diagram

    Create src/test/scala/ControllerTest.scala containing:

    package com.example

    import org.scalatest.FunSuite
    import org.scalamock.scalatest.MockFactory
    import scala.math.{Pi, sqrt}

    class ControllerTest extends FunSuite with MockFactory {

      test("draw line") {
        val mockTurtle = mock[Turtle]
        val controller = new Controller(mockTurtle)

        inSequence {
          // Moving to the start point: these three calls may come in any order.
          inAnyOrder {
            (mockTurtle.penUp _).expects()
            (mockTurtle.getPosition _).expects().returning(0.0, 0.0)
            (mockTurtle.getAngle _).expects().returning(0.0)
          }
          (mockTurtle.turn _).expects(~(Pi / 4))
          (mockTurtle.forward _).expects(~sqrt(2.0))
          // Drawing the line itself.
          (mockTurtle.getAngle _).expects().returning(Pi / 4)
          (mockTurtle.turn _).expects(~(-Pi / 4))
          (mockTurtle.penDown _).expects()
          (mockTurtle.forward _).expects(1.0)
        }

        controller.drawLine((1.0, 1.0), (2.0, 1.0))
      }
    }

    This should (hopefully!) be self-explanatory, with one possible exception. The tilde (~) operator represents an epsilon match, useful for taking account of rounding errors when dealing with floating-point values.

  5. Run the tests with sbt test:
    $ sbt
    > test
    [info] ControllerTest:
    [info] - draw line
    [info] Passed: : Total 1, Failed 0, Errors 0, Passed 1, Skipped 0
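To see why the epsilon match in step 4 is useful, here is a small standalone sketch of the idea. It is not ScalaMock's implementation: `EpsilonSketch`, `approxEquals` and the tolerance value are hypothetical, and ScalaMock's actual tolerance may differ.

```scala
// Hypothetical sketch of an epsilon comparison; not ScalaMock's own code.
object EpsilonSketch {
  // Assumed tolerance, chosen for illustration only.
  val Epsilon: Double = 1e-3

  // True when `actual` is within `eps` of `expected`.
  def approxEquals(expected: Double, actual: Double, eps: Double = Epsilon): Boolean =
    math.abs(expected - actual) < eps
}
```

Exact comparison is fragile with floating-point values (`0.1 + 0.2 == 0.3` is false, for instance), whereas `EpsilonSketch.approxEquals(0.3, 0.1 + 0.2)` is true. The same reasoning applies to the angles and distances the controller computes with atan2 and sqrt.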

ScalaMock 3.0-M4 for Scala 2.10.0-RC1

ScalaMock 3.0-M4 (scaladoc) for Scala 2.10.0-RC1 is now released. It supports:

  • Mock functions, traits and classes
  • Both expectation-first and record-then-verify (Mockito-style) mocking
  • ScalaTest and Specs2

To use with sbt and ScalaTest:

libraryDependencies +=
  "org.scalamock" % "scalamock-scalatest-support_2.10.0-RC1" % "3.0-M4"

or for Specs2:

libraryDependencies +=
  "org.scalamock" % "scalamock-specs2-support_2.10.0-RC1" % "3.0-M4"

For background information, see ScalaMock 3.0 Preview Release.

Known limitations (these should all be fixed when Scala adds support for mock types):

  • No support for mocking object creation (constructors)
  • No support for mocking singleton/companion objects
  • No support for mocking final classes or classes with private constructors
  • No support for mocking concrete vars
  • Limited support for overloaded methods
  • No support for mocking Java methods with repeated parameters