Reputation: 80
I have the following regular expression:
(?P<a>\d+)\s(?P<b>.+)\s\((?P<c>.+)\)\s(?P<d>\d+)
The strings I'm trying to match are:
"1 Something (code) 123"
Should match as a=1, b=Something, c=code, d=123
"1 Something"
Should match as a=1, b=Something (no match for c or d)
My expression doesn't match the second string. How can I make the "(code) 123" part optional?
Reputation: 163287
You can use a nested group and make the part for group d optional. As Something is a single word, you can use \S+ instead of .+
(?P<a>\d+)\s(?P<b>\S+)(?:\s\((?P<c>[^()\n]+)\)(?:\s(?P<d>\d+))?)?
The pattern matches:
(?P<a>\d+)\s(?P<b>\S+)
Match group a, a whitespace char and group b, where group b matches 1+ non-whitespace chars instead of .+
(?:
Non capture group
\s\((?P<c>[^()\n]+)\)
Match a whitespace char, then group c between literal parentheses, where group c matches 1+ chars other than parentheses or a newline
(?:\s(?P<d>\d+))?
Optionally match group d
)?
Close the non-capturing group and make it optional
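As a quick check, here is a minimal sketch that runs the pattern against both sample strings. It assumes Python's re module (the (?P<name>...) syntax suggests it, but the question does not say so), and it uses fullmatch so that the whole string has to match; the parse helper is a made-up name for illustration.

```python
import re

# Pattern from the answer: the "(code) 123" part sits in an optional
# non-capturing group, and the trailing number is optional inside it.
PATTERN = re.compile(
    r"(?P<a>\d+)\s(?P<b>\S+)(?:\s\((?P<c>[^()\n]+)\)(?:\s(?P<d>\d+))?)?"
)


def parse(text):
    """Return the named groups as a dict, or None if the text does not match."""
    m = PATTERN.fullmatch(text)  # assumption: the entire string must match
    return m.groupdict() if m else None


for s in ("1 Something (code) 123", "1 Something"):
    print(s, "->", parse(s))
```

Groups that did not take part in the match come back as None, so for "1 Something" both c and d are None. Note that with this nesting a string like "1 Something (code)" also matches, with d set to None.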