There’s a single method in your Cucumber suite, right now, that could silently allow regressions and ambiguity to slip in without you ever knowing. Guess which one! Actually, bonus points to you if you scroll down to the comments and make a guess before finishing this post.
Did you guess? If not, here are a few more hints.
It uses regular expressions.
It’s a large cascading conditional statement.
It’s well-intentioned, and serves as a central lookup and naming convention for what would otherwise be magic constants.
“I’ll take ‘deprecated methods’ for a thousand, Alex.”
Ready for the reveal? It’s HtmlSelectorsHelpers#selector_for
. Yikes! Now
that I’ve named names, let’s play everyone’s favorite game, spot-the-regex-bug:
module HtmlSelectorsHelpers
def selector_for(locator)
case locator
# snip a few lines
when /the "([^"]+)" story/
story = Story.find_by_title!($1)
"##{dom_id(story)}"
# snipped another 100+ lines
when /^the related link for the "([^"]+)" story$/
story = Story.find_by_title!($1)
"a#some_selector_for_story_#{story.id}"
end
end
Spot it? This one is fairly easy to find - a step definition like:
When I click on the related link for the "Make some bread" story
would match both regular expressions, because the first regular expression
doesn’t match against the beginning ^
or end $
of the line. Since the
case
statement just evaluates from the top down, the first match wins.
How did I find that? I was adding a new piece of functionality to click on the related link, wrote that new selector, ran the test, and it failed. I wrote my passing implementation. Still failed. I tried it out in my browser, where things worked normally. It took some sleuthing to figure out that the wrong selector was being used, and where it was coming from.
Digging deeper
Some other issues are actually lurking somewhere in the 100+ snipped lines of regexes. Let’s say you add another selector like this:
when /^the comment button for the "([^"]+)" story$/
# ...
If that is listed after the /the "([^"]+)" story/
matcher, it will never be
matched. If you’re not careful, even if you are disciplined about reaching the
red state in TDD while adding the
new feature, this can lead to subtle regressions later down the line. Consider
a link or piece of content that gets moved around on the page: if the broad
regexp still matches it, you could end up with a false positive test.
So, what’s the general issue here? When I think about the set of selectors I
implement, I believe it’s a design drawback to allow developers to add
ambiguously matching selectors. It would be better if the selector_for
method
were to give you early notice of multiple colliding matches.
What might such an implementation look like? Well, if we used a construct other
than the case
statement, then selector_for
could evaluate all possible
matches, and raise an exception if there are multiple matches.
Consider this example:
def selector_for(locator)
selectors = {
"the page" => "html > body",
/the project dropdown/ => '#project-selection ul',
/the "([^"]+)" story/ => lambda {
story = Story.find_by_title!($1)
"##{dom_id(story)}"
},
/^the related link for the "([^"]+)" story$/ => lambda {
story = Story.find_by_title!($1)
"a#some_selector_for_story_#{story.id}"
}
}
matches = selectors.map do |selector, result|
if locator =~ selector
if result.respond_to?(:call)
result.call
else
result
end
end
end.flatten
raise "Ambiguous matches!" if matches.size > 1
matches.first
end
(This is just an illustration. I’ve no idea what the performance impact is here, or if the above code works. You could smooth out the interface a bit, too.)
So - problem solved! We can identify ambiguous matches. Party hats all around.
But wait, there’s more
But is this the best we can do? The root of ambiguity here is regular expressions, and you, dear astute reader, have recognized that these selectors are being used in Cucumber steps, and not just within step definition implementations. Hopefully, you’re climbing into the commenting cockpit, ready to take aim at the notion of this coupling of front-end implementation to the big-A Acceptance test suite - a notion that has gotten so much blog heat lately.
In fact, selector_for
was removed in cucumber-rails
1.1.0.
For new apps, it’s a good change. And we should upgrade our existing apps, and
generally move away from imperative steps if we’re using them. But is there
still something to selector_for
that’s worth keeping around?
Myself, I’ve liked using selector_for
in the definitions of acceptance test
steps because it provides a central and well-named lookup for front-end
implementation details. But I don’t like-like it. Really, we’re trading one
set of magic constants (CSS selectors) for another (named selector strings), and
the presence of regexp-based dispatch in the land of plain old Ruby methods
starts to feel a bit out of place. Unnecessary. Even bulky:
When /^I participate in a discussion related to "([^"]+)"$/ do |story_title|
click_link(selector_for("the related discussion link for #{story_title}"))
fill_in 'Comment', with: "What fun discourse"
click_button 'Create Discussion'
end
Wouldn’t it be nice to just extract these to plain old Ruby methods?
module HtmlSelectorsHelpers
def related_discussion_link(story_title)
story = Story.find_by_title!(story_title)
"a#some_selector_for_story_#{story.id}"
end
end
World(HtmlSelectorsHelpers)
…
When /^I participate in a discussion related to "([^"]+)"$/ do |story_title|
click_link(related_discussion_link(story_title))
fill_in 'Comment', with: "What fun discourse"
click_button 'Create Discussion'
end
Now, we have less clunky dispatch and more cohesive step definitions which operate at a single level of abstraction.
It’s also good because this moves away from encouraging the coupling of Cucumber steps to implementation details. It encourages less string matching, and more plain Ruby method-calling.
But what about our original issue, ambiguity? It’s still possible to overwrite
a method in the HtmlSelectorsHelpers
module, although I suppose it’s less
likely - the real reason I got tripped up at the beginning was that too-general
regular expression. I think we’ve done well to mitigate that source of bugs.
Bringing it on home
The one trouble with this approach is that it’s not incremental. I can’t easily
convert all my existing selector_for
code to use just methods in one fell
swoop. This is the same pain point felt when talking about removing web_steps
from an existing suite - it’s messy and time-intensive to change all that
string-matching lookup into method dispatches. But, worth it? I’d argue that
it is.
If you’re writing Cucumber, do you use selector_for
, spurn it despite its
availability, or use a new version without the method? Do you place your
selectors directly in step definitions? Do you have some other lookup facility?