toyotajon93

Reputation: 37

Python Converting Characters from Unicode to HTML

Hey guys, I am trying to convert this in Python 2.7.3:

the+c\xf8\xf8n

to the HTML string:

the+c%C3%B8%C3%B8n

It was originally the c\xf8\xf8n, but I used a replace to put a + in place of the space.

I'm not entirely sure what convention the latter is. I would use string replace, but the replacement differs from character to character.

Thoughts? Thanks guys

Upvotes: 0

Views: 377

Answers (1)

Martijn Pieters

Reputation: 1124668

You are URL encoding (percent-encoding), not HTML encoding. Use urllib.quote:

from urllib import quote

but make sure you encode to UTF-8 first:

quote(inputstring.encode('utf8'))
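To see why the encode step matters, here is a minimal sketch (the names `text`, `encoded`, and `hex_bytes` are illustrative, not from the question). The character ø is U+00F8, and UTF-8 represents it as the two bytes C3 B8, which is where each %C3%B8 in the target string comes from:

```python
text = u'c\xf8\xf8n'            # the word from the question, as a unicode string
encoded = text.encode('utf8')   # UTF-8 byte string

# bytearray yields integers on both Python 2 and Python 3
hex_bytes = ' '.join('%02X' % b for b in bytearray(encoded))
print(hex_bytes)                # 63 C3 B8 C3 B8 6E
```

Each non-ASCII byte is then written out by quote as a % followed by its two hex digits.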

By default this also percent-encodes the + (as %2B); if the + stands for a space and should be kept as is, pass it as the safe argument:

quote(inputstring.encode('utf8'), '+')

The latter form gives:

>>> quote(inputstring.encode('utf8'), '+')
'the+c%C3%B8%C3%B8n'
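If you still have the original string with the space in it, urllib.quote_plus may save you the manual replace, since it turns spaces into + itself. A rough sketch comparing the two routes; the variable names are made up for illustration, and the try/except import is only there so the snippet also runs on Python 3, which is an assumption beyond the question's Python 2.7.3:

```python
try:
    from urllib import quote, quote_plus        # Python 2, as in the answer
except ImportError:
    from urllib.parse import quote, quote_plus  # Python 3 location (assumption)

original = u'the c\xf8\xf8n'   # the string before the manual space-to-plus replace

# Route 1: replace the space by hand, then keep '+' unescaped via the safe argument
manual = quote(original.replace(u' ', u'+').encode('utf8'), '+')

# Route 2: let quote_plus convert the space to '+' on its own
automatic = quote_plus(original.encode('utf8'))

print(manual)
print(automatic)
```

Both routes give the same result for this input. They differ when the text itself contains a literal plus sign: quote_plus escapes it as %2B, while the safe='+' route leaves it untouched.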

Upvotes: 1
