Reputation: 2733
I am trying to find the word "the" that has a space before the character "t" and after the character "e" in the string "the the the the". I am using the regular expression below, but it returns only one "the" instead of two.
s="the the the the"
s.scan(/\sthe\s/)
output - [" the "]
I was expecting the expression to return the two middle "the" words. Why is this happening?
Upvotes: 1
Views: 63
Reputation: 627507
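The problem here is that the \s patterns consume the whitespace. The scan method only returns non-overlapping matches, and your expected matches overlap: the first match, " the ", already includes the space that the next match would need as its leading whitespace, so the search resumes after that space and the third "the" is skipped.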
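To make the consumption visible, here is a minimal sketch that records the offsets of each match found by scan. The variable name offsets is just an illustrative choice, not something from the question:

```ruby
s = "the the the the"

# Collect the [start, end) character offsets of every match scan finds.
offsets = []
s.scan(/\sthe\s/) { offsets << Regexp.last_match.offset(0) }

p offsets  # => [[3, 8]]
```

The single match spans indices 3 through 7, so the space at index 7 is already used up. The search resumes at index 8, where the third "the" has no whitespace left in front of it to match.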
You need to use lookarounds, which check for the whitespace without consuming it:
/(?<=\s)the(?=\s)/
See the regex demo and a Ruby demo where puts s.scan(/(?<=\s)the(?=\s)/) prints two "the" instances.
Pattern details:
(?<=\s) - a positive lookbehind that requires a whitespace character immediately before "the"
the - the literal text "the"
(?=\s) - a positive lookahead that requires a whitespace character right after "the"

Note that if you use \bthe\b (i.e. word boundaries), you will get all four "the" instances from your string, because \b only asserts a position between a word character (letter, digit or underscore) and a non-word character or the start/end of the string.
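As a quick side-by-side check, the three patterns discussed here can be run against the same string. This is only a sketch; the variable names consuming, lookarounds and boundaries are made up for the comparison:

```ruby
s = "the the the the"

consuming   = s.scan(/\sthe\s/)           # whitespace is part of each match
lookarounds = s.scan(/(?<=\s)the(?=\s)/)  # whitespace is only asserted
boundaries  = s.scan(/\bthe\b/)           # start/end of string also qualify

p consuming    # => [" the "]
p lookarounds  # => ["the", "the"]
p boundaries   # => ["the", "the", "the", "the"]
```

So the lookaround version gives the two middle words, while \b also accepts the first and last "the" because the string edges count as boundaries.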
Upvotes: 1