chmike

Reputation: 22174

Convert url encoded string into python unicode string

I have strings encoded in the following form: La+Cit%C3%A9+De+la+West, which I stored in a SQLite VARCHAR field from Python.

These appear to be UTF-8 byte strings that were URL-encoded. How can I convert one back to a unicode string?

s = 'La+Cit%C3%A9+De+la+West'

I used the urllib.unquote_plus(s) function, but it doesn't convert the %C3%A9 into a single unicode character. I see 'La CitÃ© De la West' instead of the expected 'La Cité De la West'.

I'm running my code on Ubuntu, not Windows, and the encoding is UTF-8.

Upvotes: 2

Views: 5662

Answers (1)

Dave

Reputation: 11899

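As we discussed, it looks like the problem was that you were starting with a unicode object, not a byte string (str). Given a unicode object, unquote_plus turns %C3 and %A9 into two separate characters rather than the two UTF-8 bytes that together encode é. You want a str: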

>>> import urllib
>>> s1 = u'La+Cit%C3%A9+De+la+West'
>>> type(s1)
<type 'unicode'>
>>> print urllib.unquote_plus(s1)
La CitÃ© De la West

>>> s2 = str(s1)
>>> type(s2)
<type 'str'>
>>> print urllib.unquote_plus(s2)
La Cité De la West

>>> import sys
>>> sys.stdout.encoding
'UTF-8'
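
Note that the second result above is still a UTF-8 byte string; it only displays correctly because the terminal is UTF-8. To end up with an actual unicode object, the bytes need a final decode. Here is a minimal sketch of that, assuming the stored values are always UTF-8 that was URL-encoded. The helper name decode_urlencoded is hypothetical, not part of urllib. On Python 3 the function lives in urllib.parse and decodes percent-escapes as UTF-8 by default, so the extra steps are only needed on Python 2.

```python
import sys

if sys.version_info[0] >= 3:
    from urllib.parse import unquote_plus

    def decode_urlencoded(value):
        # Hypothetical helper. Python 3: percent-escapes are decoded
        # as UTF-8 by default, so the result is already text.
        return unquote_plus(value)
else:
    from urllib import unquote_plus

    def decode_urlencoded(value):
        # Hypothetical helper. Python 2: force a byte string first
        # (the URL-encoded form is pure ASCII, so str() is safe here),
        # unquote it, then decode the resulting UTF-8 bytes.
        return unquote_plus(str(value)).decode('utf-8')


text = decode_urlencoded('La+Cit%C3%A9+De+la+West')
print(text)  # La Cité De la West (on a UTF-8 terminal)
```

The str() call assumes the value contains only ASCII characters, which holds for anything that is fully URL-encoded; a value with raw non-ASCII characters would raise UnicodeEncodeError on Python 2.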

Upvotes: 6
