Dror

Reputation: 2588

C# Regex.Matches - Previous matched characters can not be matched in the next matches

I have the following regex call:

MatchCollection matches = Regex.Matches(text, @"( And )|( Or )|( Not )");

I have a problem with a string like this - " And Or Not "

Only "And" will be matched but "Or" and "Not" will not be, just because they are not the first word.

As far as I understand, the reason is that the first match is " And ", including the trailing white space. Because that space was already part of the first match, the Regex does not consider it as a potential leading white space for the next match.

So if, for example, my string had two spaces between the words instead - " And  Or  Not " - every word would have been matched.

Is there a way to somehow instruct the Regex to share the matched white-spaces between the matches?

Thanks!

Upvotes: 2

Views: 199

Answers (3)

Damien_The_Unbeliever

Reputation: 239646

Instead of looking explicitly for whitespace, you should look for word boundaries.

Just had to go look it up, but apparently, \b would be what you're looking for, e.g.:

@"(\bAnd\b)|(\bOr\b)|(\bNot\b)"

(Or, as @stema points out):

@"\b(And|Or|Not)\b"
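A minimal sketch of the word-boundary approach, run against the single-spaced string from the question. The class name `WordBoundaryDemo` and the helper `FindKeywords` are made up for illustration; only `Regex.Matches` and the pattern come from the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Hypothetical helper: returns every keyword that appears as a whole word.
    public static List<string> FindKeywords(string text)
    {
        var found = new List<string>();
        // \b is a zero-width assertion, so no whitespace is consumed by a match.
        foreach (Match m in Regex.Matches(text, @"\b(And|Or|Not)\b"))
        {
            found.Add(m.Value);
        }
        return found;
    }

    public static void Main()
    {
        // Single spaces between the words, as in the question.
        Console.WriteLine(string.Join(",", FindKeywords(" And Or Not ")));
        // Prints: And,Or,Not
    }
}
```

Because `\b` only asserts a position, the matched values contain no surrounding spaces. Note that it also accepts keywords next to punctuation or at the very start or end of the string, which may or may not be what you want.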

Upvotes: 4

Paolo Tedesco

Reputation: 57192

I would simplify the expression a little and use a look-ahead assertion (check that something follows without making it part of the match):

string text = " And Or Not ";
foreach (Match m in Regex.Matches(text, @"\s(And|Or|Not)(?=\s)")) {
    Console.WriteLine(m.Value);
}

(note: I'm using \s instead of spaces)
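As a rough sketch of what `\s` buys you: the pattern also accepts tabs and newlines as separators. Since the leading `\s` is still part of `m.Value`, the bare keyword can be read from capture group 1 instead. The class name `WhitespaceLookaheadDemo`, the helper `FindKeywords`, and the sample input are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WhitespaceLookaheadDemo
{
    // Hypothetical helper: returns the keywords without surrounding whitespace.
    public static List<string> FindKeywords(string text)
    {
        var found = new List<string>();
        foreach (Match m in Regex.Matches(text, @"\s(And|Or|Not)(?=\s)"))
        {
            // Group 1 holds only the keyword; m.Value would include the leading whitespace.
            found.Add(m.Groups[1].Value);
        }
        return found;
    }

    public static void Main()
    {
        // Tab, newline and space all count as \s.
        Console.WriteLine(string.Join(",", FindKeywords("\tAnd\nOr Not ")));
        // Prints: And,Or,Not
    }
}
```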

Upvotes: 1

stema

Reputation: 92976

The problem is that once a whitespace character has been matched, the regex continues searching after the end of the last match. The whitespace is, so to speak, "gone", because it has already been matched.
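To see the consumed space in action, here is a small sketch that prints the index and value of each match of the original pattern. The class name `ConsumedSpaceDemo` and the helper `Describe` are hypothetical, and the input is the single-spaced sample from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ConsumedSpaceDemo
{
    // Hypothetical helper: lists "index:[value]" for each match of the question's pattern.
    public static List<string> Describe(string text)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(text, @"( And )|( Or )|( Not )"))
        {
            result.Add(m.Index + ":[" + m.Value + "]");
        }
        return result;
    }

    public static void Main()
    {
        foreach (string line in Describe(" And Or Not "))
        {
            Console.WriteLine(line);
        }
        // Prints:
        // 0:[ And ]
        // 7:[ Not ]
    }
}
```

The first match covers indices 0 to 4, so the search resumes at index 5, right on the "O". " Or " can no longer match because its leading space (index 4) already belongs to " And ". On this particular input " Not " is still found, since the space at index 7 was not part of any earlier match.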

What you can do is to use a lookahead, like this:

MatchCollection matches = Regex.Matches(s, @" (?:And|Or|Not)(?= )");

The lookahead does not match the space; it only checks whether a space follows. The expression will not match if there is no space.
But the results in your MatchCollection will not include this space at the end!
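A short sketch of the lookahead version on the same single-spaced input, printing index and value so the missing trailing space is visible. The names `LookaheadDemo` and `Describe` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Hypothetical helper: lists "index:[value]" for each match of the lookahead pattern.
    public static List<string> Describe(string text)
    {
        var result = new List<string>();
        // (?= ) asserts that a space follows but leaves it available for the next match.
        foreach (Match m in Regex.Matches(text, @" (?:And|Or|Not)(?= )"))
        {
            result.Add(m.Index + ":[" + m.Value + "]");
        }
        return result;
    }

    public static void Main()
    {
        foreach (string line in Describe(" And Or Not "))
        {
            Console.WriteLine(line);
        }
        // Prints:
        // 0:[ And]
        // 4:[ Or]
        // 7:[ Not]
    }
}
```

Each value still starts with the leading space that the pattern matches, so call `m.Value.Trim()` (or wrap the alternation in a capturing group) if you only need the keyword.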

Upvotes: 2
