Adding tests to legacy applications

I came across this great talk from Jason Swett today; it’s short, quite fun, and really good.

Some key takeaways were

Test the easy things first

Things like the checkout process on a shopping app might be the most valuable places to have tests, but are also likely to be amongst the hardest to test. When you’re adding tests to an application that already exists, there’s a lot of groundwork to do in creating infrastructure just to get the tests running in the first place, so aim to make the first test you write as easy as possible – aim for something like the login flow which is likely to be much easier to test.

Use Sprout Method/Sprout Class approaches

When you have code written without tests, you’ll tend to get quite long, knotty, interrelated methods which know far too much about the rest of the system and which have a lot of dependencies. You’re also going to struggle to know what a method does in many cases. In order to break this up in a sane way, as recommended by Michael Feathers in Working Effectively with Legacy Code and discussed in Refactoring by Martin Fowler, you can use the Sprout Method and Sprout Class approaches. Here, you can work through a complicated method line by line, extracting each line or clump of related lines that do one thing into their own method – maybe even their own class. Extracted in this way each method will have only one or two dependencies at most and therefore becomes possible, sometimes easy, to test.
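
As a rough illustration of the idea (every name here is hypothetical, not taken from the talk or from either book), suppose a discount rule used to be tangled up inside a long total method. Pulling it out into its own small class leaves it with a single dependency, so it can be tested on its own:

```ruby
# Hypothetical legacy class. The discount rule used to live inline in #total.
class LegacyInvoice
  def initialize(line_items, customer_type)
    @line_items = line_items        # array of { price:, quantity: } hashes, prices in pence
    @customer_type = customer_type
  end

  def total
    subtotal = @line_items.sum { |item| item[:price] * item[:quantity] }
    # The extracted class now owns the discount decision.
    subtotal - DiscountPolicy.new(@customer_type).discount_for(subtotal)
  end
end

# The sprouted class: one job, one dependency (the customer type).
class DiscountPolicy
  def initialize(customer_type)
    @customer_type = customer_type
  end

  # Assumed rule for the example: trade customers get 10% off, everyone else nothing.
  def discount_for(subtotal)
    @customer_type == :trade ? subtotal / 10 : 0
  end
end
```

DiscountPolicy can now be exercised directly with a customer type and a number, without building a whole invoice first.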

Characterisation tests

Obviously it’s hard to write a test when you don’t know what the code returns, so writing characterisation tests is a useful technique. Your mindset is a bit different in this phase – you’re not necessarily thinking critically about whether it’s doing the right things, but like a stenographer you’re simply recording what the code does.

Jason will comment out an entire method body and then bring it back line by line, and for each line write a failing test that then passes when the line is uncommented. Doing this when you don’t know what it does is still possible – just write a test asserting something you know is wrong, like asserting a method returns “qwerty”, and then when the test fails see what it returns and then amend the test to expect that thing.
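
Here’s a minimal sketch of what that can look like with Minitest. The legacy_reference method is a made-up stand-in for some legacy code we don’t yet understand; the point is only the workflow of asserting something wrong, reading the failure, and then pinning down the real value:

```ruby
require "minitest/autorun"

# Hypothetical legacy method whose behaviour we do not yet understand.
def legacy_reference(name, id)
  "#{name.to_s.strip.upcase[0, 3]}-#{id.to_s.rjust(5, "0")}"
end

class LegacyReferenceCharacterisationTest < Minitest::Test
  # First pass: assert something deliberately wrong, for example
  #   assert_equal "qwerty", legacy_reference(" widget ", 42)
  # The failure message reports the actual value, which we then record here.
  def test_builds_reference_from_name_and_id
    assert_equal "WID-00042", legacy_reference(" widget ", 42)
  end

  # Each further behaviour we discover gets recorded the same way.
  def test_short_names_are_left_short
    assert_equal "AB-00007", legacy_reference("ab", 7)
  end
end
```

These tests say nothing about whether “WID-00042” is the right answer, only that it’s what the code does today.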

Other stuff

There was a good Q&A at the end. One tip he mentioned for dealing with flapping tests: if the issue is due to a race condition, where something might not always have completed before the test tries to click on the next step in the sequence, he might first write an extra expectation for the page to “have content such-and-such”, looking for something that appears on the page once the action has completed. That expectation will wait a reasonable time before timing out, rather than failing instantly like the click instruction.
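
To show why a waiting expectation helps, here’s a plain-Ruby sketch of the difference between checking once and retrying until a timeout. Both wait_until and FakePage are hypothetical stand-ins written for this example; in a real suite the waiting would come from the test framework’s own content matcher rather than a hand-rolled helper.

```ruby
# Hypothetical helper: poll the block until it returns truthy or the timeout expires.
def wait_until(timeout: 2.0, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Stand-in for a page whose content only appears once a background action finishes.
class FakePage
  def initialize(ready_after:)
    @ready_at = Process.clock_gettime(Process::CLOCK_MONOTONIC) + ready_after
  end

  def has_content?(text)
    text == "Saved" && Process.clock_gettime(Process::CLOCK_MONOTONIC) >= @ready_at
  end
end

page = FakePage.new(ready_after: 0.2)

immediate = page.has_content?("Saved")                 # single check, made too early
waited    = wait_until { page.has_content?("Saved") }  # retried until the content appears
```

The single check races the background action and loses; the retried check succeeds as soon as the content turns up, and only fails if the timeout passes.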

Rails 6 is released! Rspec-rails needs some tweaking to work

David Heinemeier Hansson announced on the Rails blog that Rails 6.0 has been released. It’s one of the chunkiest new releases I can remember, with lots of exciting stuff to get used to: Webpacker by default, support for multiple databases, parallel testing, Trix now fully integrated into the framework as Action Text, and major improvements to existing features.

Rails 6.0: Action Mailbox, Action Text, Multiple DBs, Parallel Testing, Webpacker by default, and Zeitwerk – Dealing with incoming email, composing rich-text content, connecting to multiple databases, parallelizing test runs, integrating JavaScript with love, and rewriting the code loader.

I’ve just installed it and it’s obvious how much thought and care has gone into the new version.

I hit a hurdle running RSpec straight out of the gate, though, with the error

ActionView::Template::Error wrong number of arguments (given 2, expected 1)

but found the issue had been identified and fixed in a patch to the new version 4.0 of rspec-rails


It’s not released yet, but you can add the pre-release version of the rspec-rails gem to your Gemfile from GitHub and run bundle install to get your specs working again:

group :development, :test do
  gem 'rspec-rails',
    git: '',
    tag: 'v4.0.0.beta2'
end

Five Factor Testing by Sarah Mei

This was an excellent read; we spend a fair bit of time wondering what to test and relatively little on why we’re testing. This post changed how I think about testing.

Key points – Sarah describes five reasons for writing automated tests:

  1. Verify the code is working correctly
  2. Prevent future regressions
  3. Document the code’s behaviour
  4. Provide design guidance
  5. Support refactoring

All pretty unexceptionable; the bit that was new to me was that those goals aren’t mutually compatible, and there are always tradeoffs. Tests that thoroughly document the functionality at a low level can make it much harder to refactor or add new functionality, for example. The right mix of tests depends on your individual situation: how important to you and your project is each of the five factors?

Test coverage metrics considered actively harmful

This is a fantastic blog post from James Shore on making sure you have the right tests, tests that will save you from stuff instead of simply trying to push up an abstract number. It’s a great read and thought provoking. For example:

If you’re using test-driven development, don’t measure unit test code coverage. It’s worse than a useless statistic; it will actively lead you astray.

To build up tests in legacy code, don’t worry about overall progress. The issue with legacy code is that, without tests, it’s hard to change safely. The overall coverage isn’t what matters; what matters is whether you’re safe to change the code you’re working on now.

Of course, no advice is the right advice in 100% of situations and he points out some useful counter-examples too. But I loved this pragmatic approach to testing.

So instead, nurture a habit of adding tests as part of working on any code. Whenever a bug is fixed, add a test first. Whenever a class is updated, retrofit tests to it first. Very quickly, the 20% of the code your team works on most often will have tests. The other 80% can wait.

The Ruby Object Model: classic talk by Dave Thomas

At pretty much every Ruby conference I’ve been to in the last few years, there’s been a talk that includes an explanation of the eigenclass or metaclass of a class and how Ruby goes about looking up where to call methods or locate variables. Like water dripping on a rock, over time this has gradually made more and more sense but it’s always been a bit too detached from the normal day to day stuff I do to really trigger my imagination.

But this talk from 2009 by Dave Thomas was amazing, and was more than just another drip. Highly recommended viewing; it’s a fun and engaging exploration of the fundamentals of how Ruby works by one of the community’s foremost members.
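
For anyone who hasn’t met the idea before, here’s a small sketch of my own (not taken from the talk) showing where singleton classes sit in method lookup. The Animal and Dog classes are invented for the example:

```ruby
class Animal
  def speak
    "..."
  end
end

class Dog < Animal
  def speak
    "Woof"
  end
end

rex  = Dog.new
fido = Dog.new

# A method defined on one object lives in that object's singleton class
# (the "eigenclass"), which is searched before Dog itself.
def rex.speak
  "#{super}! (only rex does this)"
end

rex_says  = rex.speak    # found in rex's singleton class; super falls through to Dog#speak
fido_says = fido.speak   # fido has no singleton method, so lookup starts at Dog

# The lookup chain for rex: his singleton class first, then Dog, then Animal, and so on.
chain = rex.singleton_class.ancestors

# "Class methods" are just singleton methods on the class object.
def Dog.create
  new
end

class_level = Dog.singleton_class.instance_methods(false)
```

Seen this way, there’s nothing special about class methods: Dog is an object too, and Dog.create lives in its singleton class in exactly the way rex.speak lives in rex’s.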

What a great team feels like

A good friend and valued colleague N left our team last week. He started soon after me and completely changed the dynamic within the team; we’re reaching the end of the project and all soon to be going our separate ways anyway, but the impact he made will stay with me for a long time.

One of the best things about our team was the differences in outlook, personality and approach we each brought to the table. We had developers working from Slovenia, Portugal, Spain, and remotely in the UK as well as co-located in the office, and forged strong working relationships in this distributed group; everyone with different strengths and backgrounds, but all sharing a passionate belief in doing things well and making the team better.

I’ve been a web developer for 20 years this year. I’ve worked on my own for a lot of that time, and similarly since moving to Ruby and Rails in 2013. One of the things that attracted me to Ruby in the first place was the community and the sense of diversity and acceptance that runs through it. “Keep Ruby Weird” is still alive, I think. And there’s something different and special about developers that choose to work with Ruby and Rails; every language and framework exists for a reason and those reasons resonate with us on a deep personal level. That’s why I think distinctive communities form around them.

Previously, even though I’ve been working in agile teams for the last 13 years, I’ve really been working alone in my own silo most of the time. This project was my first time working fully as part of a modern Rails team, to collaborate with other developers using GitHub, to learn to test all the time, to be pushed out of my comfort zone and learn how to do things better.

I hoped this would be the case before I started, and it has been. But what I didn’t expect – and what has been by far the most important part of the experience here – was this supportive culture that unfolded from N’s first day; that of helping each other, of every problem being a problem shared. The atmosphere before I joined was pretty toxic. Nobody spoke, nobody shared anything. There was just the sound of one of the senior developers sniggering as he reviewed PRs with acid little comments.

And then N came along…

“Share your screen with me”

How often we heard those words from him when one of us was stuck, and it always meant things were about to get better. Whether or not the problem got fixed in the end, we’d explore it and learn more about the codebase and the problem together. People who weren’t so good at testing would learn how tests can help us explore what’s going on, illuminating the problem and leading us to the answer.

Everything changed the day N joined, and I think we’ve built a better team largely because of him. This isn’t a great team yet. But it could be, and thanks to him and the other excellent developers who have since joined, we had a real glimpse of what one looks like.

Great talk on code review

Working with some terrific developers at Kajima for the last few months has helped me to learn loads; it’s been great working in a big Rails team for the first time since moving to Ruby from PHP back in 2013, and one of the biggest changes to working alone on projects is having a peer review process for code before it gets merged to master.

As a developer, you always have this imaginary monkey on your shoulder anyway, whispering to you every time you cut corners. “That’s not what you’re supposed to do…”

Working in a team, that imaginary monkey is replaced by real, warm-blooded human beings and the whispering is replaced by written comments on Github pull requests – suddenly you’re held to account for your decisions and working to meet a commonly agreed bar for what good, acceptable code looks like.

One of my colleagues shared this great talk from RailsConf 2015, well worth watching:

Here are my notes:

We think that code review is for catching bugs – but really it’s not. Few bugs are caught this way, but we are all under the impression it catches more. Nope!
It’s for helping us to get better every day.
Of course, a few bugs get caught, but the real benefits include:
  • Knowledge transfer
  • Increased team awareness
  • Finding alternative solutions
The discipline of discussing code with your peers – the process is more important than the net outcome of the results or changes in the code
Improving the technical communication on your team
Strong code review culture isn’t seniors explaining things to juniors, it has to be a real conversation
Some rules from Thoughtbot
Two key things
1. As an author…provide sufficient context
  • “If content is king then context is god”. Content is the diff. Context is WHY that changed. What’s the problem your changes solve? Insufficient context is one of the chief impediments to good reviewing. So… commit messages that say WHAT are no good; they must explain WHY these lines changed or were added. A link to Jira is not an explanation! Don’t make the reviewer hunt for this context. Also, the thing on the end of the link probably doesn’t explain the context for the change directly.
  • Challenge – at least two paragraphs of context for every change you make!
2. As a reviewer… ask questions rather than making demands
  • ask, don’t tell
  • Negativity bias in written communication compared to verbal. Inflection compensates far more than we realise, and its absence needs to be made up for in writing. The same words you’d say aloud will be interpreted more harshly or critically when written down.
  • Engage in conversation, don’t dictate what to do. Avoid imperatives
  • Imperatives lead to ignoring or arguing (ignoring is worse). Give the author credit for having considered your suggestion and assume best intentions. Frame it as a question! “What about xxxx?”
  • Aim is to foster technical discussion
  • “What do you think about…?”
  • “Did you consider…?”
  • “Can you clarify…?”
Imperatives are particularly harmful when there’s a difference in level – juniors will ALWAYS do what seniors tell them but rarely understand why
  • Why didn’t you just …? (Just is a bad word)
  • Why didn’t you… ? Still puts on defensive, frames negatively (prefer “How about” instead of “why not”)
Socratic method – ask questions to stimulate critical thinking and illuminate potential alternatives
How to handle conflict?
Conflict is good; healthy debate drives quality and leads to learning. But there are two types:
Good – we don’t agree on an issue
Once you reach a minimum bar of quality, we’re now talking about tradeoffs. “Not the way I’d do it” is not in itself a reason to change things.
What if we don’t agree on the process (committing to master, ignoring feedback, etc.)? Get ahead of this by establishing a social contract that everyone has agreed to.
You can review any code any time, just because it got committed doesn’t mean there’s no value to reviewing it.
What to review – if I’m not catching bugs, what am I doing this for?
Timing – ten minutes is a long time to spend on a review. Small changes are easier to review
  • Single responsibility principle – does every object have just one job
  • Naming things
  • Complexity
  • Test coverage
  • Web security
  • Style (but style nitpicks lead to reviewers being perceived negatively!)
But it’s fine for people to review in their own way, bringing their own strengths
“So and so approved it” not doing QA
Adopt a style guide to remove potential for pointless arguments and use Rubocop. If all code passes linting then there will be a lot fewer style comments.
Insist on high quality reviews, but agree to disagree.
Review what’s important to you
Aim to have a strong code review culture – valuable in itself.
  • Better code
  • Better developers
  • Team ownership – this is _our_ code
  • Healthy debate (not silent seething)
The fact reviews are asynchronous may be a benefit; means the PR and diffs have to stand on their own.


Really thrilled that Hireist has been accepted onto YCombinator’s Startup School’s autumn cohort. It’s a sort of free online startup incubator. Last year around 13,000 companies applied. Everyone who applies gets to watch the lectures free, but there’s a selection part where around one in five startups gets accepted into a group of about thirty companies/founders to be mentored by one of about a hundred YC alumni. And that’s what we’ve gotten onto.

So it’s not particularly exclusive – we’re probably one of about 3,000 companies in the mentoring groups this time round. But it’s not a poke in the eye, either!

As a graduate of Bethnal Green Ventures startup incubator, it’s not like the startup world is alien to me. But YC has a deserved reputation; there are talks from Sam Altman and none other than Paul Graham himself. So, I’m delighted we got into the mentoring groups and although there’s not much chance of scoring the $10,000 they’re awarding to a hundred of the startups, getting this sort of content and guidance is awesome.

First lecture today… can’t wait!

Fixing order dependent tests in Minitest

Working on a Rails 5 app which uses Minitest and Capybara, I ended up with some order-dependent test failures that completely derailed things for a few days. In the end my brilliant son helped me to solve the problem and hopefully the key points will help someone else!

So, the app is a SaaS product and uses subdomains to separate tenant accounts. When the failure occurred, it was because @current_account was unset in situations where it should have been set, and we ended up calling methods on nil. Those tests worked fine on their own, but during the whole test run, something was happening to make it mess up; some state or unsuspected dependency leaking from a previous test.

So I remembered Ryan Davis talking about this in his talk during RailsConf 2015, and how you could use minitest-bisect to do the leg work of trying to narrow down which other tests caused this dependency and failure. So, here were the steps I went through over the course of several days in order to get a test suite we could trust once again!

Step 1 – realise you have a problem!

This bit is easy. If your test suite runs green on one run, then throws up lots of red on a second when there have been no changes to the code, you have an order dependent test failure.

Step 2 – identify the test dependencies

Add minitest-bisect to the development group of your Gemfile and run your tests. Here’s the first gotcha, however. The readme and instructions are for a generic Ruby app; if you’re using Rails, you need to pass it the right file names to act on, and finding the right incantation took a little googling!

minitest_bisect --seed 45514 -Itest $(find test -type f -name \*_test.rb)

Run this in a terminal in the app directory, replacing the above seed with the value from your own unsuccessful test run.

This should take a few minutes to run, and will gradually narrow down the set of files, and the order to run them in, that reproduces the failure.

Eventually you should get something like

...many things...

Final reproduction:

Run options: --seed 45514 -n "/^(?:Create account Feature Test#(?:test_0001_Load the right home page)|Manage::JobsControllerTest#(?:test_should_create_new_job|test_should_get_index))$/"

Now you can pass the options displayed to the test runner and reproduce the subset of the problem.

Step 3 – Identify the conditions which cause the failure

Steps 1 and 2 are, sadly, the easy bit. The tricky bit is then working out what is happening and what to do about it! But at least now you have a workable surface area to attack. In this case, just in case anyone else hits this slightly unusual edge case, our app uses the ActsAsTenant gem and requires a subdomain for particular areas to work properly, so we use a setup method with the host! method to set it before the tests that need this.

def setup
  host! ""
end

But the problem was that if prior to those test files running, a feature test was run that used “visit xxx_path”, the host! method in later tests no longer had any effect because the URL had already been set.

feature "View home page" do
  scenario "generic home page" do
    visit root_path
    page.must_have_content "Hireist"
  end
  scenario "home page for account" do
    visit "http://verso.hireist.test"
    page.must_have_content "Verso"
  end
end

Step 4 – Work out the dependency and fix it!

This is where I hit a brick wall, and my son, a talented Ruby developer, stepped in to help. It took some time but eventually, using byebug and pry, we were able to step through the code and dig down into the framework to find out where the domain was being set and why it leaked across test instances. We found that when a feature test runs, it sets the default_url_options to “test.local” and this gets stored in a class variable – hence the persistence across test runs.

Because our integration test had the relevant classes in its ancestry chain, it turned out we could access the options hash inside our tests and reset the URL options before our test runs, so that the host! method would always have the desired effect.

In our test_helper.rb:

def clean_url_environment
  app.routes.default_url_options = {} # clear any host left behind by an earlier feature test
end

And in our setup method in the controller tests:

def setup
  clean_url_environment # reset the URL options first so host! takes effect
  host! ""
end

This was a tough one to fix, and it took sustained effort and more than one pair of eyes to do it. But without the ability to bisect the test runs (thanks to Ryan Davis), we’d probably never have gotten to the root cause.
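
To make the shape of the bug easier to see, here’s a toy reproduction in plain Ruby. It’s a deliberately simplified sketch: UrlConfig and FakeSession are hypothetical stand-ins I’ve invented for the example, and this is not how Rails or Capybara are actually implemented. It only shows how class-level state set by one test can silently change the result of a later one, and how resetting it in setup restores independence.

```ruby
# Hypothetical stand-in for framework configuration held at class level.
class UrlConfig
  @default_url_options = {}

  class << self
    attr_accessor :default_url_options
  end
end

# Hypothetical session: host! only takes effect when no host has been set already.
class FakeSession
  def host!(name)
    UrlConfig.default_url_options[:host] ||= name
  end

  def current_host
    UrlConfig.default_url_options[:host]
  end
end

# A "feature test" that sets the host and never cleans up after itself.
def run_feature_test
  UrlConfig.default_url_options = { host: "test.local" }
end

# A "controller test" that needs its own host; reset_first plays the role of the fix.
def run_controller_test(reset_first:)
  UrlConfig.default_url_options = {} if reset_first
  session = FakeSession.new
  session.host!("tenant.example.test")
  session.current_host
end

run_feature_test
leaked = run_controller_test(reset_first: false)  # still sees the feature test's host

run_feature_test
fixed = run_controller_test(reset_first: true)    # sees the host it asked for
```

Run on its own, the controller test would always get the host it asked for, which is exactly why the failure only shows up for certain test orders.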

How many boy scouts?

Sometimes we find ourselves working on stuff that is less than scintillating, maybe discussing the implementation details of yet another pointless cookie notification bar or whether it would be a bad thing to implement one of those things that highlights random words on your site like “gambling” and pops up ads if you accidentally mouse over them [yes! – Ed.]; at such times it’s sometimes tempting to dream you were instead working in Silicon Valley on some ground-breaking project like self-driving cars.

I have an antidote for that though; suppose you really were working on the software that guides a self-driving car. That would be awesome, right? Even the worst implementation of an AI-controlled vehicle is soon likely to be many times safer for the passengers and the world at large than even the best human driver, and there are really smart people working on the problems right now.

I’ve been reading this amazing essay from Paul Ford, “What is Code?”, and in it he explains why if those rockstar ninja developers exist, they’re probably going to have pretty high standards as to what problems they work on and are therefore probably not going to be sitting in the next cubicle in a typical dev team.

They’re not interviewing at your crappy company for your crappy job. They’re not going to come and rescue your website; they’re not going to make you an app that puts mustaches on photos; they’re not going to listen to you when you offer them the chance to build the next Facebook, because, if they exist, they are busy building the real Facebook.

Very astute, right? But it was this next sentence that made me glad I’m not one of them.

Sometimes they’re thinking about higher mathematics, or how to help a self-driving car manage the ethical choice between running over a squirrel and driving off a cliff. Or they’re riding their bikes, or getting really into pottery.

If you’re the one designing the anti-collision software, you’re the one who has to tell it how to decide in emergency situations. So the squirrel case is pretty easy (once you train it to reliably differentiate between a child in a furry hat and a squirrel, but that’s another matter) – bye bye Mr Squeaks, right?

But how would you suggest to your automaton that it handle the sort of situation that psychologists use to bother otherwise happy people; like, assuming that its brakes have failed and it’s hurtling down a hill, able to steer but not stop, and it has to choose the lesser of two evils.

Do you suggest it should stay on the road and aim, for instance, at the mother and a child in a pushchair on the pedestrian crossing (two people with their lives ahead of them), or steer into the bus queue of twenty older people instead?

But what if there are boy scouts in the bus queue too?

If you’re a rockstar programmer and your next question is, “how many boy scouts?” there may be a job for you in Mountain View. Tell them I suggested you give them a call.

Now, what colour would you like your cookie notification bar, Mr Smithers?