Reputation: 7976
I'm using rubular.com to build my regex, and their documentation describes the following:
(...) Capture everything enclosed
(a|b) a or b
How can I use an OR expression without capturing what's in it? For example, say I want to capture either "ac" or "bc". I can't use the regex
(a|b)(c)
right? Since then I capture either "a" or "b" in one group and "c" in another, not the same. I know I can filter through the captured results, but that seems like more work...
Am I missing something obvious? I'm using this in Java, if that is pertinent.
Upvotes: 165
Views: 147121
Reputation: 555
If your OR alternatives are all single characters, you can use a character class instead:
([ab]c)
It matches only ac or bc, and it's more readable.
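Since the question mentions Java, here is a minimal sketch of the character-class approach using java.util.regex. The class name CharClassDemo and the helper findAll are made up for illustration; only Pattern and Matcher come from the standard library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical helper: collects group 1 of every match of ([ab]c) in the input.
    static List<String> findAll(String input) {
        Matcher m = Pattern.compile("([ab]c)").matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group(1)); // the only group, holding the whole "ac" or "bc"
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(findAll("ac bc cc")); // prints [ac, bc]
    }
}
```

Note that this only works while every alternative is exactly one character long; for longer alternatives a group is still needed.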
Upvotes: 9
Reputation: 43497
Even rubular doesn't make you use parentheses around an alternation, but keep in mind that the precedence of | is low: a|bc means "a" or "bc", not "ac" or "bc". For example, a|bc does not match ccc.
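To see the precedence of | in Java, here is a small sketch that tests whole-string matches with Pattern.matches. The class name AlternationPrecedenceDemo and the helper fullMatch are illustrative names, not part of any library.

```java
import java.util.regex.Pattern;

class AlternationPrecedenceDemo {
    // Hypothetical helper: true only if the regex matches the entire input.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Without grouping, the alternatives are "a" and "bc".
        System.out.println(fullMatch("a|bc", "a"));   // true
        System.out.println(fullMatch("a|bc", "bc"));  // true
        System.out.println(fullMatch("a|bc", "ac"));  // false
        System.out.println(fullMatch("a|bc", "ccc")); // false

        // Grouping is what restricts the alternation to the first character.
        System.out.println(fullMatch("(?:a|b)c", "ac")); // true
    }
}
```

So for the "ac" or "bc" case in the question, some form of grouping (or a character class) is still required.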
Upvotes: 3
Reputation: 25293
If your implementation has it, then you can use non-capturing parentheses:
(?:a|b)
Upvotes: 38
Reputation: 655349
Depending on the regular expression implementation, you can use so-called non-capturing groups with the syntax (?:…):
((?:a|b)c)
Here (?:a|b) is a group, but you cannot reference its match. So you can only reference the match of ((?:a|b)c), which is either ac or bc.
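Java's java.util.regex supports this syntax. The following is a rough sketch of how it behaves there; the class name NonCapturingDemo and its helper methods are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonCapturingDemo {
    static final Pattern NON_CAPTURING = Pattern.compile("((?:a|b)c)");
    static final Pattern CAPTURING = Pattern.compile("(a|b)(c)");

    // Hypothetical helper: returns group 1 of the first match, or null if none.
    static String firstMatch(String input) {
        Matcher m = NON_CAPTURING.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // Hypothetical helper: number of capturing groups the pattern defines.
    static int groupCount(Pattern p) {
        return p.matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("xxbcxx"));      // bc
        System.out.println(groupCount(NON_CAPTURING)); // 1, (?:a|b) is not counted
        System.out.println(groupCount(CAPTURING));     // 2, "a"/"b" and "c" separately
    }
}
```

If the outer parentheses are dropped as well, m.group() (group 0) still returns the whole match, so (?:a|b)c alone may be enough when no other groups are involved.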
Upvotes: 266