user1292883

Reputation: 29

Python - character encoding and decoding problems

I got this error:

Traceback (most recent call last):
  File "C:\Users\Rendszergazda\workspace\achievements\hiba.py", line 9, in <module>
    s = str(urlopen("http://eu.battle.net/wow/en/character/arathor/"+str(names[0])+"/achievement").read(), encoding='utf-8')
  File "C:\Python27\lib\encodings\cp1250.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode character u'\ufeff' in position 0: character maps to <undefined>

What do you think? What is my problem?

from urllib import urlopen
import codecs

result = codecs.open("C:\Users\Desktop\Achievements\Result.txt", "a", "utf-8")
fh = codecs.open("C:\Users\Desktop\Achievements\FriendsNames.txt", "r", "utf-8")
line = fh.readline()
names = line.split(" ")
fh.close()

s = str(urlopen("http://eu.battle.net/wow/en/character/arathor/"+str(names[0])+"/achievement").read(), encoding='utf-8')
result.write(str(s))
result.close()

Upvotes: 2

Views: 4604

Answers (1)

Thomas Wouters

Reputation: 133503

The problem is that you're calling str(names[0]), where names[0] is a unicode string. This means it gets encoded with the default encoding, which in your case seems to be cp1250 for some reason. (Did you mess with sys.setdefaultencoding()? Don't do that.)

To get bytestrings out of unicode, encode the unicode explicitly; don't just call str() on it. Use the encoding the result should have (for URLs this is somewhat difficult to guess, but here it is probably UTF-8). So, use `names[0].encode('utf-8')`. You may also need to quote the non-ASCII characters in your URL, although that depends on what the remote end expects.
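
A minimal sketch of the explicit-encoding approach, written to run on both Python 2 and 3. The helper names `build_achievement_url` and `fails_in_cp1250` are made up for illustration. Stripping a leading u'\ufeff' is an assumption based on the traceback: it looks like a byte order mark read from the start of FriendsNames.txt. Whether the server really expects percent-quoted UTF-8 is also an assumption.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

try:
    from urllib import quote        # Python 2, as used in the question
except ImportError:
    from urllib.parse import quote  # Python 3

BASE_URL = "http://eu.battle.net/wow/en/character/arathor/"
BOM = "\ufeff"  # byte order mark; assumed to come from the start of the names file


def build_achievement_url(name):
    """Hypothetical helper: build the character URL from a unicode name."""
    # Drop a leading BOM and surrounding whitespace (assumption about the input).
    name = name.lstrip(BOM).strip()
    # Encode explicitly instead of calling str(), then percent-quote the bytes.
    quoted = quote(name.encode("utf-8"))
    return BASE_URL + quoted + "/achievement"


def fails_in_cp1250(text):
    """Return True if text cannot be encoded with cp1250, as in the traceback."""
    try:
        text.encode("cp1250")
    except UnicodeEncodeError:
        return True
    return False
```

With this, `build_achievement_url(names[0])` would replace the `str(names[0])` concatenation. If the BOM guess is right, opening the names file with the "utf-8-sig" codec instead of "utf-8" should remove it at read time.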

Upvotes: 2
