Regular Expressions

The following sample text contains keywords, each wrapped in a pair of % characters. Our goal is to extract each of these keywords and collect them in an array.

string = "The %start% and %end% of the content..."

string.scan(/%.*%/)
#=> ["%start% and %end%"]

Unfortunately, it looks like our pattern is grabbing one long string rather than the two distinct matches we were hoping to get.

How can we alter our pattern to fix this issue?

We can use .*? in place of .* in our pattern to make it "non-greedy".

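The .* portion of the pattern matches as many characters as possible while still letting the rest of the pattern match, so it runs all the way to the last % in the string. By contrast, .*? matches as few characters as possible, so it stops at the first % that follows.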

string = "The %start% and %end% of the content..."

string.scan(/%.*?%/)
#=> ["%start%", "%end%"]
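If only the keyword text is wanted, without the surrounding % characters, one option is to wrap the non-greedy part in a capture group. This is a minimal sketch rather than the only approach; the variable name `keywords` is introduced here purely for illustration.

```ruby
string = "The %start% and %end% of the content..."

# When the pattern contains a capture group, scan returns one array
# per match holding the captured text instead of the whole match.
# flatten collapses those one-element arrays into a single flat array.
keywords = string.scan(/%(.*?)%/).flatten
#=> ["start", "end"]
```

The same greedy pitfall applies inside the group: with (.*) instead of (.*?), the capture would again stretch to the last % in the string.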

Each of the repetition quantifiers (+, ?, *, and {n,m}) is greedy by default and can be made non-greedy by adding ? directly after it.
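As a rough illustration, the snippet below compares the greedy and non-greedy forms of +, * and ? against the same input. The sample string and variable names are made up for this sketch; String#[] with a regex returns the first matched portion.

```ruby
text = "aaaa"

greedy_plus = text[/a+/]   #=> "aaaa" (one or more, as many as possible)
lazy_plus   = text[/a+?/]  #=> "a"    (one or more, as few as possible)

greedy_star = text[/a*/]   #=> "aaaa" (zero or more, as many as possible)
lazy_star   = text[/a*?/]  #=> ""     (zero or more, as few as possible)

greedy_opt  = text[/a?/]   #=> "a"    (zero or one, prefers one)
lazy_opt    = text[/a??/]  #=> ""     (zero or one, prefers zero)
```

Note that the non-greedy forms of * and ? are happy to match nothing at all when no later part of the pattern forces them to consume characters.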

Check out the [ruby-doc section on regex repetition][] for more detail on the repetition quantifiers for greedy and non-greedy matching.

[ruby-doc section on regex repetition]: http://ruby-doc.org/core-2.1.0/Regexp.html#class-Regexp-label-Repetition
