Reputation: 31471
Python has a terrific urlencode() function which encodes a dict
via RFC 1738 (Plus encoding):
>>> urllib.parse.urlencode({'site':'Stack Overflow','Coder':'Jeff Atwood'})
'Coder=Jeff+Atwood&site=Stack+Overflow'
I cannot find a replacement that uses RFC 3986 (Percent encoding), even though the fine manual states the following:
RFC 3986 - Uniform Resource Identifiers
This is the current standard (STD66). Any changes to urllib.parse module should conform to this.
This would be the expected output:
>>> urllib.parse.urlencode({'site':'Stack Overflow','Coder':'Jeff Atwood'})
'Coder=Jeff%20Atwood&site=Stack%20Overflow'
Of course I could roll my own, but I find it surprising that I can find no such Python function built in. Is there such a Python function that I'm just not finding?
Upvotes: 4
Views: 3848
Reputation: 51
For strings you can use this:
def percent_encoding(string):
result = ''
accepted = [c for c in 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-._~'.encode('utf-8')]
for char in string.encode('utf-8'):
result += chr(char) if char in accepted else '%{}'.format(hex(char)[2:]).upper()
return result
>>> percent_encoding('http://www.google.com')
'http%3A%2F%2Fwww.google.com'
>>> percent_encoding('ñapa')
'%C3%B1apa'
And now, for a dictionary, you need to encode the values, so you only need a function that translate this dictionary to url key/value pairs, encoding only its values.
def percent_urlencode(dictionary):
return '&'.join(["{}={}".format(k, percent_encoding(str(v))) for k, v in dictionary.items()])
>>> percent_urlencode({'token': '$%&/', 'username': 'me'})
'username=me&token=%24%25%26%2F'
>>> percent_urlencode({'site':'Stack Overflow','Coder':'Jeff Atwood'})
'site=Stack%20Overflow&Coder=Jeff%20Atwood'
Upvotes: 1
Reputation: 249123
It seems there is no such thing built in, but there is a bug requesting one, and it even has a patch attached: http://bugs.python.org/issue13866
Upvotes: 4