Reputation: 147
I need your help validating a link with a regular expression in Python. Here's what the link should look like:
http://www.example.com?utm_source=something&utm_medium=somethingelse
And I've tried something like:
r'^\?utm_source\=(?P<utm_source>[-\w]+)&utm_medium\=(?P<utm_medium>[-\w]+)/$'
But this doesn't work. Can you please help me? Which (other) characters should be escaped?
Upvotes: 0
Views: 717
Reputation: 141770
This is a classic XY problem.
Tim's answer gives you the solution you asked for.
I'd suggest that you do not need regular expressions here at all if all you want to do is validate a query string.
Take a look at urlparse (in Python 3 it lives in urllib.parse):
...
>>> from urllib import parse as urlparse  # Python 3; on Python 2 use: import urlparse
>>> a_url = 'http://www.example.com?utm_source=something&utm_medium=somethingelse'
>>> parser = urlparse.urlparse(a_url)
>>> qs = urlparse.parse_qs(parser.query)
>>> 'utm_medium' in qs
True
>>> len(qs['utm_medium']) == 1
True
>>> qs['utm_medium'][0].isalpha()
True
>>> 'utm_source' in qs
True
>>> len(qs['utm_source']) == 1
True
>>> qs['utm_source'][0].isalpha()
True
>>> 'utm_zone' in qs
False
Upvotes: 4
Reputation: 336078
You don't need all those escapes:
r'^\?utm_source=(?P<utm_source>[-\w]+)&utm_medium=(?P<utm_medium>[-\w]+)/$'
Next, because of the anchors your regex only matches when the whole string is the query string; it won't find the query inside a full URL, so perhaps you need to remove the anchors?
r'\?utm_source=(?P<utm_source>[-\w]+)&utm_medium=(?P<utm_medium>[-\w]+)/'
Finally, the slash at the end is required in the regex but missing from your example string. So how about making it optional:
r'\?utm_source=(?P<utm_source>[-\w]+)&utm_medium=(?P<utm_medium>[-\w]+)/?'
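For illustration, here is a small sketch of how that last pattern could be used against the example URL from the question. The name UTM_RE is just mine. Note that it needs re.search rather than re.match, since the query string does not start at the beginning of the URL.

```python
import re

# Final pattern from above: no anchors, optional trailing slash.
UTM_RE = re.compile(
    r'\?utm_source=(?P<utm_source>[-\w]+)&utm_medium=(?P<utm_medium>[-\w]+)/?'
)

url = 'http://www.example.com?utm_source=something&utm_medium=somethingelse'

# search() scans the whole string; match() would only try position 0.
match = UTM_RE.search(url)
if match:
    print(match.group('utm_source'), match.group('utm_medium'))
```

Keep in mind that the pattern still expects utm_source directly after the ? and utm_medium right behind it, so a URL with the parameters in a different order will not match.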
Upvotes: 1