Reputation: 3649
I am trying to extract some alphanumeric expressions from a longer word in C# using regular expressions. For example, I have the word "FooNo12Bee". I use the following regular expression code, which gives me two results, "No12" and "No":
string alfaNumericWord = "FooNo12Bee";
Match m = Regex.Match(alfaNumericWord, @"(No|Num)\d{1,3}");
If I use the following expression, without parentheses and without any alternative for "No", it works the way I expect and returns only "No12":
string alfaNumericWord = "FooNo12Bee";
Match m = Regex.Match(alfaNumericWord, @"No\d{1,3}");
What is the difference between these two expressions? Why does using parentheses produce a redundant result for "No"?
Upvotes: 1
Views: 123
Reputation: 71538
Parentheses in a regex create capture groups, meaning that whatever is matched between the parentheses is captured and stored as a group in addition to the whole match.
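To make that concrete, here is a minimal sketch that lists every group of the single match produced by the pattern from the question. The class and method names (CaptureGroupDemo, GroupValues) are made up for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureGroupDemo
{
    // Returns the value of every group of the first match.
    // Index 0 is always the whole match; index 1 and up are capture groups.
    public static string[] GroupValues(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        string[] values = new string[m.Groups.Count];
        for (int i = 0; i < m.Groups.Count; i++)
        {
            values[i] = m.Groups[i].Value;
        }
        return values;
    }

    public static void Main()
    {
        string[] groups = GroupValues("FooNo12Bee", @"(No|Num)\d{1,3}");
        for (int i = 0; i < groups.Length; i++)
        {
            Console.WriteLine($"Groups[{i}] = {groups[i]}");
        }
        // Groups[0] = No12
        // Groups[1] = No
    }
}
```

So there is only one match; the extra "No" is group 1 of that same match, not a second match.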
If you don't want a capture group but still need a group for the alternation, use a non-capturing group instead by putting ?: right after the opening parenthesis:
Match m = Regex.Match(alfaNumericWord, @"(?:No|Num)\d{1,3}");
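As a quick check, the sketch below compares how many groups each variant reports. The helper name GroupCount is hypothetical; it only wraps Match.Groups.Count.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NonCapturingDemo
{
    // Number of groups in the first match, including group 0 (the whole match).
    public static int GroupCount(string input, string pattern)
    {
        return Regex.Match(input, pattern).Groups.Count;
    }

    public static void Main()
    {
        string word = "FooNo12Bee";

        // Capturing group: whole match plus one capture group.
        Console.WriteLine(GroupCount(word, @"(No|Num)\d{1,3}"));   // 2

        // Non-capturing group: only the whole match remains.
        Console.WriteLine(GroupCount(word, @"(?:No|Num)\d{1,3}")); // 1
    }
}
```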
Alternatively, if you don't want to change the regex for some reason, you can simply retrieve group 0 from the match to get only the whole match (and thus ignore any capture groups); in your case, using m.Groups[0].Value.
Lastly, you can improve the efficiency of the regex by a notch by factoring out the common N:
Match m = Regex.Match(alfaNumericWord, @"N(?:o|um)\d{1,3}");
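The factored pattern should match exactly the same text as the original alternation. The sketch below only checks that equivalence on a few sample inputs I picked for illustration; it does not measure speed, and the names (FactoredPatternDemo, FirstMatch) are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FactoredPatternDemo
{
    // Returns the text of the first match, or an empty string if there is none.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        string original = @"(?:No|Num)\d{1,3}";
        string factored = @"N(?:o|um)\d{1,3}";

        // Sample inputs chosen for illustration only.
        string[] samples = { "FooNo12Bee", "Num7", "Nu5", "No", "xNum1234" };

        foreach (string s in samples)
        {
            string a = FirstMatch(s, original);
            string b = FirstMatch(s, factored);
            Console.WriteLine($"{s}: '{a}' vs '{b}' (same: {a == b})");
        }
    }
}
```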
Upvotes: 6
Reputation: 14499
It is because the parentheses create a group. You can make the group non-capturing with ?: like so:
Regex.Match(alfaNumericWord, @"(?:No|Num)\d{1,3}");
Upvotes: 1
Reputation: 4812
I don't know what it is called, but it happens because putting parentheses around it creates a new group. It is well explained here:
Besides grouping part of a regular expression together, parentheses also create a numbered capturing group. It stores the part of the string matched by the part of the regular expression inside the parentheses.
The regex Set(Value)? matches Set or SetValue. In the first case, the first (and only) capturing group remains empty. In the second case, the first capturing group matches Value.
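The Set(Value)? example from the quote can be tried in C# as well. This is a small sketch, assuming the .NET Regex class; the names OptionalGroupDemo and Describe are made up. In .NET, a group that did not take part in the match reports Success as false and an empty Value.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionalGroupDemo
{
    // Describes the whole match and the state of capture group 1.
    public static string Describe(string input)
    {
        Match m = Regex.Match(input, @"Set(Value)?");
        Group g = m.Groups[1];
        return $"match='{m.Value}', group1='{g.Value}', participated={g.Success}";
    }

    public static void Main()
    {
        Console.WriteLine(Describe("Set"));
        // match='Set', group1='', participated=False

        Console.WriteLine(Describe("SetValue"));
        // match='SetValue', group1='Value', participated=True
    }
}
```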
Upvotes: 1