---
title: The One-Method Silent Killer in Your Cucumber Suite
teaser: A regular expression bug in `selector_for`.
tags: web,testing,ruby,good code
author: Jason Morrison
published_on: 2011-11-03
---

There's a single method in your Cucumber suite, right now, that could silently
allow regressions and ambiguity to slip in without you ever knowing.  Guess
which one!  Actually, bonus points to you if you scroll down to the comments and
make a guess before finishing this post.

Did you guess?  If not, here are a few more hints.

It uses regular expressions.

It's a large cascading conditional statement.

It's well-intentioned, and serves as a central lookup and naming convention for
what would otherwise be magic constants.

## "I'll take 'deprecated methods' for a thousand, Alex."

Ready for the reveal?  It's `HtmlSelectorsHelpers#selector_for`.  Yikes!  Now
that I've named names, let's play everyone's favorite game, spot-the-regex-bug:

```ruby
module HtmlSelectorsHelpers
  def selector_for(locator)
    case locator

    # snip a few lines

    when /the "([^"]+)" story/
      story = Story.find_by_title!($1)
      "##{dom_id(story)}"

    # snipped another 100+ lines

    when /^the related link for the "([^"]+)" story$/
      story = Story.find_by_title!($1)
      "a#some_selector_for_story_#{story.id}"
  end
end
```

Spot it?  This one is fairly easy to find - a step definition like:

    When I click on the related link for the "Make some bread" story

would match both regular expressions, because the first regular expression
doesn't match against the beginning `^` or end `$` of the line.  Since the
`case` statement just evaluates from the top down, the first match wins.

How did I find that?  I was adding a new piece of functionality to click on the
related link, wrote that new selector, ran the test, and it failed.  I wrote my
passing implementation.  Still failed.  I tried it out in my browser, where
things worked normally.  It took some sleuthing to figure out that the wrong
selector was being used, and where it was coming from.

## Digging deeper

Some other issues are actually lurking somewhere in the 100+ snipped lines of
regexes.  Let's say you add another selector like this:

    when /^the comment button for the "([^"]+)" story$/
      # ...

If that is listed after the `/the "([^"]+)" story/` matcher, *it will never be
matched*.  If you're not careful, even if you are disciplined about reaching the
red state in <abbr title="Test Driven Development">TDD</abbr> while adding the
new feature, this can lead to subtle regressions later down the line.  Consider
a link or piece of content that gets moved around on the page: if the broad
regexp still matches it, you could end up with a false positive test.

So, what's the general issue here?  When I think about the set of selectors I
implement, I believe it's a design drawback to allow developers to add
ambiguously matching selectors.  It would be better if the `selector_for` method
were to give you early notice of multiple colliding matches.

What might such an implementation look like?  Well, if we used a construct other
than the `case` statement, then `selector_for` could evaluate all possible
matches, and raise an exception if there are multiple matches.

Consider this example:

    def selector_for(locator)
      selectors = {
        "the page" => "html > body",

        /the project dropdown/ => '#project-selection ul',

        /the "([^"]+)" story/ => lambda {
          story = Story.find_by_title!($1)
          "##{dom_id(story)}"
        },

        /^the related link for the "([^"]+)" story$/ => lambda {
          story = Story.find_by_title!($1)
          "a#some_selector_for_story_#{story.id}"
        }
      }

      matches = selectors.map do |selector, result|
        if locator =~ selector
          if result.respond_to?(:call)
            result.call
          else
            result
          end
        end
      end.flatten

      raise "Ambiguous matches!" if matches.size > 1

      matches.first
    end

(This is just an illustration.  I've no idea what the performance impact is
here, or if the above code works.  You could smooth out the interface a bit,
too.)

So - problem solved!  We can identify ambiguous matches.  Party hats all around.

## But wait, there's more

But is this the best we can do?  The root of ambiguity here is regular
expressions, and you, dear astute reader, have recognized that these selectors
are being used in Cucumber steps, and not just within step definition
implementations.  Hopefully, you're climbing into the commenting cockpit, ready
to take aim at the notion of this coupling of front-end implementation to the
big-A Acceptance test suite - a notion that has gotten [so much blog
heat](http://aslakhellesoy.com/post/11055981222/the-training-wheels-came-off)
lately.

In fact, `selector_for` was [removed in cucumber-rails
1.1.0](https://github.com/cucumber/cucumber-rails/commit/f027440965b96b780e84e50dd47203a2838e8d7d).
For new apps, it's a good change.  And we should upgrade our existing apps, and
generally move away from imperative steps if we're using them.  But is there
still something to `selector_for` that's worth keeping around?

Myself, I've liked using `selector_for` in the definitions of acceptance test
steps because it provides a central and well-named lookup for front-end
implementation details.  But I don't like-like it.  Really, we're trading one
set of magic constants (CSS selectors) for another (named selector strings), and
the presence of regexp-based dispatch in the land of plain old Ruby methods
starts to feel a bit out of place. Unnecessary. Even bulky:

```cucumber
    When /^I participate in a discussion related to "([^"]+)"$/ do |story_title|
      click_link(selector_for("the related discussion link for #{story_title}"))
      fill_in 'Comment', with: "What fun discourse"
      click_button 'Create Discussion'
    end
```

Wouldn't it be nice to just extract these to plain old Ruby methods?

```ruby
module HtmlSelectorsHelpers
  def related_discussion_link(story_title)
    story = Story.find_by_title!(story_title)
    "a#some_selector_for_story_#{story.id}"
  end
end

World(HtmlSelectorsHelpers)
```

...

```cucumber
When /^I participate in a discussion related to "([^"]+)"$/ do |story_title|
  click_link(related_discussion_link(story_title))
  fill_in 'Comment', with: "What fun discourse"
  click_button 'Create Discussion'
end
```

Now, we have less clunky dispatch and more cohesive step definitions which
operate at [a single level of
abstraction](http://terryhowe.wordpress.com/2009/10/02/clean-code-functions/).

It's also good because this moves away from encouraging the coupling of Cucumber
steps to implementation details.  It encourages less string matching, and more
plain Ruby method-calling.

But what about our original issue, ambiguity?  It's still possible to overwrite
a method in the `HtmlSelectorsHelpers` module, although I suppose it's less
likely - the real reason I got tripped up at the beginning was that too-general
regular expression.  I think we've done well to mitigate that source of bugs.

## Bringing it on home

The one trouble with this approach is that it's not incremental.  I can't easily
convert all my existing `selector_for` code to use just methods in one fell
swoop.  This is the same pain point felt when talking about removing `web_steps`
from an existing suite - it's messy and time-intensive to change all that
string-matching lookup into method dispatches.  But, worth it?  I'd argue that
it is.

If you're writing Cucumber, do you use `selector_for`, spurn it despite its
availability, or use a new version without the method?  Do you place your
selectors directly in step definitions?  Do you have some other lookup facility?
