The following sample text contains keywords wrapped in a pair of `%`
characters. Our goal is to slice out each of these keywords and collect them
in an array.

    string = "The %start% and %end% of the content..."
    string.scan(/%.*%/)
    #=> ["%start% and %end%"]

Unfortunately, it looks like our pattern is grabbing one long string rather than the two distinct matches we were hoping to get.
How can we alter our pattern to fix this issue?
We can use `.*?` in place of `.*` in our pattern to make it "non-greedy".
The `.*` portion of the pattern will match as many characters as possible
before matching the next part of our pattern, while `.*?` will match as few
as possible.

    string = "The %start% and %end% of the content..."
    string.scan(/%.*?%/)
    #=> ["%start%", "%end%"]

Each of the quantifiers (`+`, `?`, and `*`) can be made non-greedy by adding
`?` after the quantifier.
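
As a minimal sketch of that rule, the snippet below runs each quantifier and its non-greedy form against a made-up sample string. The string `text` and the variable names are illustrative assumptions, not part of the original example.

```ruby
# Hypothetical sample input, chosen so every pattern can match at position 0.
text = "aaa"

# String#[] with a Regexp returns the first matched substring.
greedy_plus = text[/a+/]   #=> "aaa"  (one or more, as many as possible)
lazy_plus   = text[/a+?/]  #=> "a"    (one or more, as few as possible)

greedy_star = text[/a*/]   #=> "aaa"  (zero or more, as many as possible)
lazy_star   = text[/a*?/]  #=> ""     (zero or more, as few as possible)

greedy_opt  = text[/a?/]   #=> "a"    (zero or one, prefers one)
lazy_opt    = text[/a??/]  #=> ""     (zero or one, prefers zero)
```

Note that the non-greedy `*?` and `??` forms are satisfied by an empty match when nothing follows them in the pattern, which is why they are usually paired with a closing delimiter, as the trailing `%` does in the example above.
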
Check out the [ruby-doc section on regex repetition][] for more detail on the repetition quantifiers for greedy and non-greedy matching.
[ruby-doc section on regex repetition]: http://ruby-doc.org/core-2.1.0/Regexp.html#class-Regexp-label-Repetition