francois.piondrancy

Reputation: 13

Find several word beginnings separated by a maximum number of other words

Here is my regexp to find several word beginnings, separated by a limited number of other words (here 0 or 1):

\b(Word1.*\b(?:\w{0,1})\bWord2.*\b(?:\w{0,1})\bWord3.*)\b

Here is the test text, together with the result of the match operation on it. What I want is between [], and what the C# regex engine finds is between {}:

Wanted and Found:

1) aaa {[Word1 Word2 Word3] bbb}
2) aaa {[Word1 xxx Word2 xxx Word3] bbb}
3) aaa {[Word1nn xxx Word2nn xxx Word3nn] bbb}

Not wanted and not found:

4) aaa mmWord1nn xxx mmWord2nn xxx mmWord3nn bbb

Not wanted but found:

5) aaa {Word1nn xxx xxx Word2nn xxx Word3nn bbb}
6) aaa {Word1nn xxx xx Word2nn xxx xxx Word3nn bbb}
7) aaa {Word1 xxx Word2 xxx xxx Word3 bbb}
8) aaa {Word1 xxx xx Word2 xxx xxx Word3 bbb}

So, my problems are:

Any solutions?

Upvotes: 0

Views: 34

Answers (1)

Jerry

Reputation: 71588

You could use a regex like this, which should work in most regex flavors:

\b(Word1\S* (?:\S+ )?Word2\S* (?:\S+ )?Word3\S*)

regex101 demo
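As a quick check, here is a minimal sketch that runs the pattern above against the eight sample lines from the question. It uses Python's re module rather than the C# engine, on the assumption that \b, \S and (?:...) behave the same way in both for this input. The names PATTERN, SAMPLES and find_block are made up for the illustration.

```python
import re

# The pattern from the answer, unchanged.
PATTERN = re.compile(r"\b(Word1\S* (?:\S+ )?Word2\S* (?:\S+ )?Word3\S*)")

# Sample lines 1 to 8 from the question, without the [] and {} markers.
SAMPLES = [
    "aaa Word1 Word2 Word3 bbb",
    "aaa Word1 xxx Word2 xxx Word3 bbb",
    "aaa Word1nn xxx Word2nn xxx Word3nn bbb",
    "aaa mmWord1nn xxx mmWord2nn xxx mmWord3nn bbb",
    "aaa Word1nn xxx xxx Word2nn xxx Word3nn bbb",
    "aaa Word1nn xxx xx Word2nn xxx xxx Word3nn bbb",
    "aaa Word1 xxx Word2 xxx xxx Word3 bbb",
    "aaa Word1 xxx xx Word2 xxx xxx Word3 bbb",
]


def find_block(text):
    """Return the matched block, or None when the line does not match."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


for number, text in enumerate(SAMPLES, start=1):
    print(number, find_block(text))
```

Lines 1 to 3 should print the block that sits between [] in the question, and lines 4 to 8 should print None.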

Notes:

\w matches a single word character, roughly the character class [A-Za-z0-9_], and not a whole word, so \w{0,1} means "zero or one character". Use \S+ to mean a word (a series of non-space characters).

Use \S* instead of .* because . matches spaces too, so a greedy .* can run across any number of words.
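If the maximum number of words allowed in between needs to change, the optional group (?:\S+ )? can be written with a counted quantifier, (?:\S+ ){0,N}. The sketch below builds such a pattern in Python. build_pattern and max_gap are hypothetical names, not part of any library, and the same pattern string should carry over to C# under the assumption that counted quantifiers behave alike there.

```python
import re


def build_pattern(prefixes, max_gap=1):
    """Build a regex matching words that start with the given prefixes,
    in order, with at most max_gap other words between neighbours.
    Hypothetical helper, written for illustration only."""
    # Up to max_gap filler words, each followed by a single space.
    gap = r"(?:\S+ ){0,%d}" % max_gap
    # Each prefix may be followed by more non-space characters.
    parts = [re.escape(prefix) + r"\S*" for prefix in prefixes]
    # A space separates each word from the (possibly empty) gap after it.
    return re.compile(r"\b(" + (" " + gap).join(parts) + ")")


text = "aaa Word1nn xxx xxx Word2nn xxx Word3nn bbb"
words = ["Word1", "Word2", "Word3"]

one_gap = build_pattern(words, max_gap=1)
two_gap = build_pattern(words, max_gap=2)

print(one_gap.search(text))           # two filler words exceed the limit
print(two_gap.search(text).group(1))  # the limit of two now allows a match
```

With max_gap=1 this produces the same pattern as the one given above, since {0,1} is equivalent to ?.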

Upvotes: 2
