Reputation: 709
Disclosure: I'm very much a regex newbie, and I'm trying to tweak some example code I found that parses web server log data into named groups. Here is the part of my modified regex that deals with the URL and query string groups:
(?P<url>.+)(?P<querystr>\?.*)
This works just fine when the string against which it's applied actually does have a query string on the URL (each group gets the expected bit of the string) but fails to match if there is none. So I tried adding a '?' after the "querystr" group to indicate that it was optional, i.e. (?P<querystr>\?.*)?
With that change, if there's no query string it works as expected (nothing is extracted into querystr), but when there is one, it is still captured as part of url rather than separately into querystr.
What's the best way to identify optional groups (assuming that's even the right approach in this case)? Thanks in advance.
Upvotes: 1
Views: 752
Reputation: 626936
You can use
^(?P<url>[^?]+)(?P<querystr>\?.*)?$
Details
^ - start of string
(?P<url>[^?]+) - Group "url": one or more chars other than ?
(?P<querystr>\?.*)? - an optional Group "querystr": a ? char and then zero or more chars other than line break chars, as many as possible
$ - end of string.
See the regex demo.
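To see the difference in practice, here is a minimal Python sketch that runs the pattern above next to the question's second attempt. The sample URLs and the helper name split_url are made up for illustration, and the input is assumed to be just the URL portion rather than a whole log line.

```python
import re

# Pattern from this answer: "url" stops at the first '?', "querystr" is optional.
PATTERN = re.compile(r"^(?P<url>[^?]+)(?P<querystr>\?.*)?$")

# The question's second attempt, kept for comparison: the greedy .+ consumes
# the whole string, so the optional group is left with nothing to match.
GREEDY = re.compile(r"(?P<url>.+)(?P<querystr>\?.*)?")


def split_url(text):
    """Return (url, querystr), or None if the pattern does not match.

    querystr is None when the input has no query string.
    (Hypothetical helper, not part of the original answer.)
    """
    m = PATTERN.match(text)
    if m is None:
        return None
    return m.group("url"), m.group("querystr")


if __name__ == "__main__":
    # Hypothetical sample inputs.
    for sample in ["/index.html?a=1&b=2", "/index.html"]:
        print("answer pattern:", split_url(sample))
        print("greedy pattern:", GREEDY.match(sample).groupdict())
```

With the greedy version, url ends up holding the whole string including the query string, because .+ matches as much as it can and the optional group is then allowed to match nothing. Excluding ? from the url group removes that ambiguity. If the pattern is embedded in a larger log-line regex, the ^ and $ anchors would probably have to go, and the character classes would likely need to exclude whitespace too (for example [^?\s]+), depending on the log format.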
Upvotes: 1