Reputation: 25356
I've got a string from an HTTP header, but it's been escaped. What function can I use to unescape it?
myemail%40gmail.com -> myemail@gmail.com
Would urllib.unquote() be the way to go?
Upvotes: 20
Views: 21205
Reputation: 195
Small correction to the previous answers (tested with Python 3.11):
from urllib.parse import unquote
unquote('myemail%40gmail.com')
'myemail@gmail.com'
Upvotes: 0
Reputation: 134038
In Python 3, these functions are urllib.parse.unquote and urllib.parse.unquote_plus.
The latter is used, for example, for query strings in HTTP URLs, where the space character ( ) is traditionally encoded as a plus character (+), and a literal + is percent-encoded as %2B.
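To make the difference concrete, here is a minimal sketch that runs both functions on the same input. The sample string is made up for illustration.

```python
from urllib.parse import unquote, unquote_plus

# Hypothetical sample containing a literal "+", an encoded "+", and an encoded space.
encoded = "a+b%2Bc%20d"

plain = unquote(encoded)       # leaves "+" alone; %2B -> "+", %20 -> " "
form = unquote_plus(encoded)   # turns "+" into " " first, then percent-decodes

print(plain)  # a+b+c d
print(form)   # a b+c d
```

Note that the literal + survives unquote but becomes a space under unquote_plus, so the right choice depends on how the string was encoded.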
In addition to these, there is unquote_to_bytes, which converts the given encoded string to bytes and can be used when the encoding is not known or the encoded data is binary. However, there is no unquote_plus_to_bytes; if you need it, you can do:
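from urllib.parse import unquote_to_bytes

def unquote_plus_to_bytes(s):
    # Replace "+" with a space first, matching the input type (bytes or str).
    if isinstance(s, bytes):
        s = s.replace(b'+', b' ')
    else:
        s = s.replace('+', ' ')
    # Then percent-decode to raw bytes.
    return unquote_to_bytes(s)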
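As a rough illustration of why decoding to bytes helps when the encoding is unknown, the sketch below decodes a made-up value and only then guesses the charset. The sample value and the UTF-8-then-Latin-1 fallback order are assumptions for this example, not a general recommendation.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical form value: "%E9" is "é" in Latin-1 but not valid UTF-8 on its own.
encoded = "caf%E9+au+lait"

# Same idea as the helper above: "+" -> " " first, then percent-decode to bytes.
raw = unquote_to_bytes(encoded.replace("+", " "))

# Try UTF-8 first; fall back to Latin-1 (an assumed fallback for this sketch).
try:
    text = raw.decode("utf-8")
except UnicodeDecodeError:
    text = raw.decode("latin-1")

print(raw)   # b'caf\xe9 au lait'
print(text)  # café au lait
```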
More information on whether to use unquote or unquote_plus is available at URL encoding the space character: + or %20.
Upvotes: 4
Reputation: 8724
Yes, it appears that urllib.unquote() accomplishes that task. (I tested it against your example on codepad.)
Upvotes: 2
Reputation: 488664
I am pretty sure that urllib's unquote is the common way of doing this.
>>> import urllib
>>> urllib.unquote("myemail%40gmail.com")
'myemail@gmail.com'
There's also unquote_plus:
Like unquote(), but also replaces plus signs by spaces, as required for unquoting HTML form values.
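The session above uses the Python 2 location of these functions. In Python 3 they live in urllib.parse, so code that may run under either version could import them along these lines. This is a sketch, and the form value is made up for illustration.

```python
# Try the Python 3 location first, then fall back to the Python 2 one.
try:
    from urllib.parse import unquote, unquote_plus
except ImportError:
    from urllib import unquote, unquote_plus

email = unquote("myemail%40gmail.com")
# Hypothetical HTML form value where spaces were sent as "+".
name = unquote_plus("John+Smith%21")

print(email)  # myemail@gmail.com
print(name)   # John Smith!
```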
Upvotes: 38