wuno

Reputation: 9885

Updating Query In URL With Urllib In Python

I have a url that is being parsed out of an XML file.

product_url = urlparse(item.find('product_url').text)

When I use urllib to break the URL up, I get this:

ParseResult(scheme='http', netloc='example.com', path='/dynamic', params='', query='t=MD5-YOUR-OAUTH-TOKEN&p=11111111', fragment='')

I need to replace the MD5-YOUR-OAUTH-TOKEN part of the query with an MD5-hashed OAuth key, which I have here:

tokenHashed = encryptMd5Hash(token)

My goal is, once the URL is parsed and the hash has been inserted in place of MD5-YOUR-OAUTH-TOKEN, to have the whole URL in a string I can use somewhere else. Originally I was trying to do this with a regex, but then I found urllib. I cannot find where the documentation explains how to do something like this.

Am I right to be using urllib for this? How do I update the URL with the hashed token and store the whole URL in a string?

The string should look like this:

newString = 'http://example.com/dynamic?t='+tokenHashed+'&p=11112311312'

Upvotes: 0

Views: 776

Answers (1)

larsks

Reputation: 312322

You'll first want to use the parse_qs function to parse the query string into a dictionary:

>>> import urlparse
>>> import urllib
>>> url = 'http://example.com/dynamic?t=MD5-YOUR-OAUTH-TOKEN&p=11111111'
>>> parsed = urlparse.urlparse(url)
>>> parsed
ParseResult(scheme='http', netloc='example.com', path='/dynamic', params='', query='t=MD5-YOUR-OAUTH-TOKEN&p=11111111', fragment='')
>>> qs = urlparse.parse_qs(parsed.query)
>>> qs
{'p': ['11111111'], 't': ['MD5-YOUR-OAUTH-TOKEN']}

Now you can modify the dictionary as desired:

>>> qs['t'] = ['tokenHashed']

Note that because parse_qs returns a list for each query parameter, we need to replace the value with a list as well: we'll be calling urlencode next with doseq=1, which expands those lists back into individual key=value pairs.
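To see why doseq matters, here is a small sketch. It assumes Python 3, where urlencode lives in urllib.parse (the session in this answer uses the Python 2 module names), and the value abc123 is a made-up stand-in for the hashed token.

```python
from urllib.parse import urlencode

# Hypothetical token value, wrapped in a list the way parse_qs returns it.
qs = {'t': ['abc123']}

# Without doseq, the list itself is stringified and percent-encoded.
without_doseq = urlencode(qs)

# With doseq, each list element becomes its own key=value pair.
with_doseq = urlencode(qs, doseq=True)

print(without_doseq)  # t=%5B%27abc123%27%5D
print(with_doseq)     # t=abc123
```

Without doseq the brackets and quotes of the list end up in the query string, which is almost certainly not what you want.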

Next, rebuild the query string:

>>> newqs = urllib.urlencode(qs, doseq=1)
>>> newqs
'p=11111111&t=tokenHashed'

And then reassemble the URL:

>>> newurl = urlparse.urlunparse(
... [newqs if i == 4 else x for i,x in enumerate(parsed)])
>>> newurl
'http://example.com/dynamic?p=11111111&t=tokenHashed'

That list comprehension there is just using all the values from parsed except for item 4, which we are replacing with our new query string.
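For reference, the same steps might look like this on Python 3, where urlparse, parse_qs and urlencode all live in urllib.parse. This is only a sketch: the helper name replace_query_param and the token value abc123 are made up for illustration, and it uses the namedtuple method _replace instead of the list comprehension to swap in the new query string.

```python
from urllib.parse import urlparse, parse_qs, urlencode


def replace_query_param(url, name, value):
    """Return url with query parameter `name` set to `value` (hypothetical helper)."""
    parsed = urlparse(url)
    qs = parse_qs(parsed.query)   # values are lists, e.g. {'t': ['...']}
    qs[name] = [value]            # keep the list shape for doseq=True
    new_query = urlencode(qs, doseq=True)
    # ParseResult is a namedtuple, so _replace swaps one field by name.
    return parsed._replace(query=new_query).geturl()


token_hashed = 'abc123'  # placeholder standing in for encryptMd5Hash(token)
new_url = replace_query_param(
    'http://example.com/dynamic?t=MD5-YOUR-OAUTH-TOKEN&p=11111111',
    't',
    token_hashed,
)
print(new_url)
```

One thing to watch for: parse_qs drops parameters with blank values by default, so pass keep_blank_values=True if the URL may contain something like p= that has to survive the round trip.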

Upvotes: 3
