agentkirb87

Reputation: 3

How to use non-capturing groups with the "or" character in regular expressions

I have a large regular expression, and somewhere in the middle of it is the expression (?:\s(\d\d\d)|(\d\d\d\d)). At this point in the parse I want to capture either 3 digits that follow a space, or 4 digits, but I don't want the extra capture that would come from the parentheses around the whole thing (doesn't ?: make a group non-capturing?). I need those parentheses so that the "or" logic works (I think).

So potential example inputs would be something like...

I tried (?:\s(\d\d\d)|(\d\d\d\d)), and it gave an extra capture, at least in the case where the input has 4 digits. Am I doing this right, or have I gone wrong somewhere?
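The behavior described above can be reproduced with a small sketch. It is written in Python rather than the language of the original pattern, on the assumption that (?:...), | and capture-group numbering work the same way for this expression; the input strings are hypothetical, since the question's own examples are not shown. The outer (?:...) adds no group, but each alternative contains its own capturing group, so the match always reports two groups and the one on the branch that did not match is left unset.

```python
import re

# The pattern from the question: a non-capturing group wrapping two
# alternatives, each of which contains its own capturing group.
PATTERN = re.compile(r"(?:\s(\d\d\d)|(\d\d\d\d))")


def show_groups(text):
    """Return the capture groups of the first match, or None if no match."""
    match = PATTERN.search(text)
    return match.groups() if match else None


# Hypothetical inputs, chosen only to exercise each branch.
print(show_groups("ab 123"))   # ('123', None): group 1 set, group 2 unset
print(show_groups("ab1234"))   # (None, '1234'): group 1 unset, group 2 set
```

Python reports the unset group as None. In .NET the equivalent group would most likely appear as a group whose Success property is false and whose Value is an empty string, which would explain the "extra" capture in the 4-digit case.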

Edit:

To go into more detail... here's the current regular expression I'm working with.

pattern = @".?(\d{1,2})\s*(\w{2}).?.?.?(?:\s(\d\d\d)|(\d\d\d\d)).*"

There's a bit of parsing I have to do at the beginning. I think Sean Johnson's answer would still work, because I wouldn't need to use "or". But is there a way to do it that DOES use "or"? I think I'll eventually need that capability.

Upvotes: 0

Views: 272

Answers (1)

Sean Johnson

Reputation: 5607

This should work:

(?:\s(\d{3,4}))

If you aren't doing any logic on that subpattern and all you want is to capture the digits, you don't even need the surrounding parentheses. The following pattern:

\s(\d{3,4})

will capture three or four digits directly following a space character.
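As a rough illustration, the sketch below runs the pattern above and, for the follow-up question about keeping "or", one possible alternation-based variant. It is written in Python on the assumption that these constructs behave the same as in the asker's regex engine. The WITH_OR pattern, the first_number helper, and the sample inputs are hypothetical and not from the thread. WITH_OR puts a single capturing group around both alternatives and uses a lookbehind, (?<=\s), to require a space before the 3-digit branch without consuming it, so there is only ever one capture to read.

```python
import re

# Pattern from the answer: 3 or 4 digits directly after a whitespace character.
# Note that it requires the whitespace in both the 3- and 4-digit cases.
SIMPLE = re.compile(r"\s(\d{3,4})")

# Hypothetical variant that keeps the "or": one capturing group wraps both
# alternatives. The 4-digit branch is tried first; the 3-digit branch only
# matches when the preceding character is whitespace (checked by lookbehind).
WITH_OR = re.compile(r"(\d{4}|(?<=\s)\d{3})")


def first_number(pattern, text):
    """Return group 1 of the first match, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


# Hypothetical inputs.
print(first_number(SIMPLE, "ab 123"))    # 123
print(first_number(SIMPLE, "ab1234"))    # None (no whitespace before digits)
print(first_number(WITH_OR, "ab 123"))   # 123
print(first_number(WITH_OR, "ab1234"))   # 1234
print(first_number(WITH_OR, "ab123"))    # None (3 digits without a space)
```

Whether the 4-digit case should also require a leading space depends on the real input, so treat the choice between the two patterns as something to check against actual data.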

Upvotes: 1
