Reputation: 861
So I'm using this line of Python to replace some weird characters in a string:
title = title.replace('\xc3', 'e').replace('\xa9', 's')
The weird string is:
"B\xc3\xa9same Mucho"
It has some Spanish-style accents, and I figured it would be simpler to try to get rid of them instead of trying to implement the accents.
But it doesn't replace the affected parts.
What's wrong with the line?
Thanks!
evamvid
Upvotes: 0
Views: 338
Reputation: 77127
Assuming you're using Python 2.7, you're just having a classic bad encoding day. Python 2 is a little notorious for its Unicode(De|En)codeError. If you really want to replace those characters, observe that:
>>> utitle = title.decode('utf-8')
>>> utitle
u'B\xe9same Mucho'
so
>>> utitle.replace(u'\xe9', 'e')
u'Besame Mucho'
But you really want to be dealing with unicode the whole time, and the characters there are really fine, so just do the decode.
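If you do still want plain ASCII after decoding, here is a minimal sketch of one way to do it without listing every accented character by hand. Note that `strip_accents` and `raw_title` are hypothetical names for illustration, and the NFKD approach is a different technique from the single `replace` above: it only handles characters that decompose into a base letter plus a combining mark.

```python
import unicodedata

# The raw bytes from the question: "é" in UTF-8 is the two bytes C3 A9.
raw_title = b"B\xc3\xa9same Mucho"

# Decode once, at the boundary, so the rest of the program sees real text.
title = raw_title.decode("utf-8")


def strip_accents(text):
    """Hypothetical helper: drop combining marks after NFKD decomposition."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents(title))
```

Decoding first means the `é` is a single character by the time anything tries to replace it, rather than two unrelated bytes.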
This is one area in which Python 3 is much better than Python 2.
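As a rough illustration of that difference (Python 3 only, with `raw_title`, `mixed_ok` and `cleaned` as made-up names), bytes and text are separate types there, so mixing them fails immediately instead of quietly doing the wrong thing:

```python
raw_title = b"B\xc3\xa9same Mucho"

# In Python 3, calling a bytes method with str arguments raises TypeError.
try:
    raw_title.replace("\xc3", "e")
    mixed_ok = True
except TypeError:
    mixed_ok = False

# After decoding, title is a str and the replacement works on whole characters.
title = raw_title.decode("utf-8")
cleaned = title.replace("\xe9", "e")

print(mixed_ok, cleaned)
```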
Upvotes: 1