Reputation: 13
Here is my regex to find several word beginnings, separated by a bounded number of other words (here 0 or 1):
\b(Word1.*\b(?:\w{0,1})\bWord2.*\b(?:\w{0,1})\bWord3.*)\b
Here is the test text, together with the result of the match operation on it. What I want is between [], and what the C# regex engine finds is between {}:
Wanted and Found:
1) aaa {[Word1 Word2 Word3] bbb}
2) aaa {[Word1 xxx Word2 xxx Word3] bbb}
3) aaa {[Word1nn xxx Word2nn xxx Word3nn] bbb}
Not wanted and not found:
4) aaa mmWord1nn xxx mmWord2nn xxx mmWord3nn bbb
Not wanted but found:
5) aaa {Word1nn xxx xxx Word2nn xxx Word3nn bbb}
6) aaa {Word1nn xxx xx Word2nn xxx xxx Word3nn bbb}
7) aaa {Word1 xxx Word2 xxx xxx Word3 bbb}
8) aaa {Word1 xxx xx Word2 xxx xxx Word3 bbb}
So, my problems are:
Any solutions?
Upvotes: 0
Views: 34
Reputation: 71588
You could use a regex like this, which should work in most languages:
\b(Word1\S* (?:\S+ )?Word2\S* (?:\S+ )?Word3\S*)
Notes:
\w matches a single word character (roughly the character class [A-Za-z0-9_]), not a whole word. Use \S+ to mean a word (a run of non-space characters).
Use \S* instead of .* because . will match spaces too.
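As a quick sanity check, the pattern above can be run against the eight sample lines from the question. This is only a sketch: it uses Python's re module for illustration rather than the C# engine the question mentions, and the names SAMPLES and find_block are made up for this example. The pattern text itself is unchanged, and it relies on single literal spaces between words, so tabs or repeated spaces would not match.

```python
import re

# Pattern from the answer: every target word must start right after a space,
# with at most one extra space-delimited token allowed in between.
PATTERN = re.compile(r"\b(Word1\S* (?:\S+ )?Word2\S* (?:\S+ )?Word3\S*)")

# Sample lines from the question, with the [] and {} markers removed.
SAMPLES = [
    "aaa Word1 Word2 Word3 bbb",
    "aaa Word1 xxx Word2 xxx Word3 bbb",
    "aaa Word1nn xxx Word2nn xxx Word3nn bbb",
    "aaa mmWord1nn xxx mmWord2nn xxx mmWord3nn bbb",
    "aaa Word1nn xxx xxx Word2nn xxx Word3nn bbb",
    "aaa Word1nn xxx xx Word2nn xxx xxx Word3nn bbb",
    "aaa Word1 xxx Word2 xxx xxx Word3 bbb",
    "aaa Word1 xxx xx Word2 xxx xxx Word3 bbb",
]


def find_block(text):
    """Return the matched block, or None if the line contains no match."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


# Print each sample next to what the pattern captures from it.
for number, line in enumerate(SAMPLES, start=1):
    print(f"{number}) {line!r} -> {find_block(line)!r}")
```

With this pattern, lines 1 to 3 should yield only the wanted block (without the trailing bbb), and lines 4 to 8 should yield no match, because there . can no longer run across spaces and swallow extra words.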
Upvotes: 2