Reputation: 13
I am trying to match "sentence two foo" and "sentence four foo" in the following string:
sentence one foo  sentence two foo  sentence three foo  sentence four foo  sentence five
Note that each sentence can contain more than one space, but never consecutive spaces, and that each sentence is separated from the preceding and following one by at least two consecutive spaces.
I am using the following pattern for matching:
.*(sentence two.*  ).*(sentence four.*  )
Note the double space after each of the two sentences.
The problem, as you well know, is that due to the greediness of the matching engine, it will match up to the double space at the end of sentence four. So my first match, group(1), will be more than I want and my second match, group(2), will be empty. What I need is "sentence two foo" in group(1) and "sentence four foo" in group(2).
I have read the posts about the non-greedy operator "?", but I'm having problems applying it to the double spaces (which, incidentally, don't necessarily have to be double; there can also be three, four, etc.).
I tried:
.*(sentence two.*)(  )?.*(sentence four.*)(  )?
and taking group(1) and group(3), but it doesn't seem to make any difference...
Any help is greatly appreciated.
Thanks
/Andrea
Upvotes: 1
Views: 107
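The non-greedy operator should be applied to the part that grabs the sentences, not to the double spaces:
/(sentence two.*?)  .*(sentence four.*?)  /
(You want each group to match the shortest possible amount of text before a double space is encountered. Note the double space after each group: without the trailing one, the lazy second group would stop right after "sentence four".)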
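For reference, here is a minimal Python sketch of the same idea. It is a variation rather than the exact pattern above: it uses ` {2,}` instead of a literal double space so that runs of three or more spaces are handled too, and it lets the second sentence end at the end of the string. The names `SEP` and `extract_sentences` are made up for illustration.

```python
import re

# Separator between sentences: two spaces (built explicitly so it stays visible).
SEP = " " * 2

# Lazy groups (.*?) stop at the first run of two or more spaces.
# The second group may also end at the end of the string.
PATTERN = re.compile(r"(sentence two.*?) {2,}.*(sentence four.*?)(?: {2,}|$)")


def extract_sentences(text):
    """Return (sentence two, sentence four) or None if the pattern does not match."""
    match = PATTERN.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    sample = SEP.join([
        "sentence one foo",
        "sentence two foo",
        "sentence three foo",
        "sentence four foo",
        "sentence five",
    ])
    print(extract_sentences(sample))
```

Because the quantifier on the spaces is greedy while the groups are lazy, the captured sentences should not carry any trailing spaces, however wide the separator is.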
Upvotes: 1