user823148

Reputation: 147

Django URLs regex

I need your help validating a link with a regular expression in Python. Here's what the link should look like:

http://www.example.com?utm_source=something&utm_medium=somethingelse

And I've tried something like:

r'^\?utm_source\=(?P<utm_source>[-\w]+)&utm_medium\=(?P<utm_medium>[-\w]+)/$'

But this doesn't work. Can you please help me? Which (other) characters should be escaped?

Upvotes: 0

Views: 717

Answers (2)

johnsyweb

Reputation: 141770

This is a classic XY problem.

Tim's answer gives you the solution you asked for.

I'd suggest that you do not need regular expressions here at all if all you want to do is validate a query string.

Take a look at urlparse...

>>> import urlparse  # Python 2 module; on Python 3 use: from urllib import parse as urlparse
>>> a_url = 'http://www.example.com?utm_source=something&utm_medium=somethingelse'
>>> parser = urlparse.urlparse(a_url)
>>> qs = urlparse.parse_qs(parser.query)
>>> 'utm_medium' in qs
True
>>> len(qs['utm_medium']) == 1
True
>>> qs['utm_medium'][0].isalpha()
True
>>> 'utm_source' in qs
True
>>> len(qs['utm_source']) == 1
True
>>> qs['utm_source'][0].isalpha()
True
>>> 'utm_zone' in qs
False
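The checks from the session above can be wrapped in a single function. This is a minimal sketch for Python 3, where the same functions live in urllib.parse. The names has_valid_utm_params and REQUIRED_PARAMS are made up for illustration. Note that isalpha() is stricter than the [-\w]+ class in the question: it rejects digits, hyphens and underscores.

```python
from urllib.parse import urlparse, parse_qs

# Parameter names taken from the example URL in the question.
REQUIRED_PARAMS = ("utm_source", "utm_medium")


def has_valid_utm_params(url):
    """Return True if every required parameter appears exactly once
    and its value consists only of alphabetic characters."""
    query = parse_qs(urlparse(url).query)
    return all(
        # The length check short-circuits, so query[name] is only
        # indexed when the parameter is actually present.
        len(query.get(name, [])) == 1 and query[name][0].isalpha()
        for name in REQUIRED_PARAMS
    )
```

Calling has_valid_utm_params() on the example URL from the question should return True, while a URL that omits utm_medium or repeats utm_source should return False.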

Upvotes: 4

Tim Pietzcker

Reputation: 336078

You don't need all those escapes:

r'^\?utm_source=(?P<utm_source>[-\w]+)&utm_medium=(?P<utm_medium>[-\w]+)/$'

Then, your regex only matches a complete string; it won't find a sub-match, so perhaps you need to remove the anchors?

r'\?utm_source=(?P<utm_source>[-\w]+)&utm_medium=(?P<utm_medium>[-\w]+)/'

Finally, the slash at the end is required in the regex but missing from your example string. So how about making it optional:

r'\?utm_source=(?P<utm_source>[-\w]+)&utm_medium=(?P<utm_medium>[-\w]+)/?'
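To see the difference the anchors make, here is a short sketch that runs the last pattern, and an anchored variant of it, against the example URL from the question using the standard re module. The variable names are illustrative only.

```python
import re

url = 'http://www.example.com?utm_source=something&utm_medium=somethingelse'

# Unanchored pattern with an optional trailing slash.
unanchored = re.compile(
    r'\?utm_source=(?P<utm_source>[-\w]+)&utm_medium=(?P<utm_medium>[-\w]+)/?'
)
# The same pattern with ^ and $ added back, for comparison.
anchored = re.compile(
    r'^\?utm_source=(?P<utm_source>[-\w]+)&utm_medium=(?P<utm_medium>[-\w]+)/?$'
)

# search() scans the whole string, so the unanchored pattern finds the
# query string even though the URL starts with "http://".
match = unanchored.search(url)
params = match.groupdict() if match else None
print(params)

# The anchored pattern requires the string to begin with "?", so it
# finds nothing in a full URL.
print(anchored.search(url))
```

The anchored pattern only succeeds if it is given the query portion alone, starting at the question mark.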

Upvotes: 1
