jxn

Reputation: 8045

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-10: ordinal not in range(128) chinese characters

I'm trying to write Chinese characters to a text file from a SQL query output called result. result looks like this: [('你好吗', 345re4, '2015-07-20'), ('我很好',45dde2, '2015-07-20').....]

This is my code:

# result is a list of tuples
file = open("my.txt", "w")
for row in result:
    print >> file, row[0].encode('utf-8')
file.close()

row[0] contains Chinese text like this: 你好吗

I also tried:

print >> file, str(row[0]).encode('utf-8')

and

print >> file, 'u'+str(row[0]).encode('utf-8')

but both gave the same error.

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-10: ordinal not in range(128)

Upvotes: 2

Views: 2561

Answers (2)

Dalen

Reputation: 4236

Don't forget to add the UTF-8 BOM at the beginning of the file if you want a text editor to display it correctly:

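file = open(...)
# UTF-8 byte order mark, written once at the start of the file
file.write("\xef\xbb\xbf")
for row in result:
    # No u""+ prefix here: concatenating unicode with the UTF-8 bytes
    # would force an implicit ASCII decode and fail on Chinese text.
    print >> file, row[0].decode("mbcs").encode("utf-8")
file.close()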

I think you'll have to decode from your machine's default encoding to unicode(), then encode it as UTF-8.

mbcs is (or at least was, ages ago) the default encoding on Windows.

But do not rely on that.

Did you try the codecs module?
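
On the BOM point: rather than writing the three bytes by hand, the standard utf-8-sig codec emits the BOM itself on the first write. A minimal sketch, assuming row[0] is already a unicode object; the helper name write_with_bom and the sample rows are made up for illustration:

```python
# -*- coding: utf-8 -*-
import codecs

def write_with_bom(rows, path):
    # "utf-8-sig" is the stdlib codec that prepends the UTF-8 BOM
    # to the output on the first write.
    with codecs.open(path, "w", "utf-8-sig") as out:
        for row in rows:
            # Assumes row[0] is unicode, not a byte string
            out.write(row[0] + u"\n")

# Hypothetical rows standing in for the SQL result
sample_rows = [(u"你好吗", "345re4", "2015-07-20")]
```

Calling write_with_bom(sample_rows, "my.txt") should produce a file that starts with the bytes EF BB BF, followed by the UTF-8 text.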

Upvotes: -1

jxn

Reputation: 8045

Found a simple solution: instead of encoding and decoding by hand, open the file as "utf-8" from the beginning using codecs.

import codecs
file = codecs.open("my.txt", "w", "utf-8")
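
For completeness, the whole loop with this approach might look like the sketch below. It assumes row[0] is a unicode object; write_first_column and the sample data are illustrative names, not part of the original code.

```python
# -*- coding: utf-8 -*-
import codecs

def write_first_column(rows, path):
    # codecs.open returns a wrapper that encodes unicode to UTF-8 on write,
    # so no explicit .encode('utf-8') call is needed per row.
    with codecs.open(path, "w", "utf-8") as out:
        for row in rows:
            out.write(row[0] + u"\n")

# Hypothetical sample standing in for the SQL output called result
result = [
    (u"你好吗", "345re4", "2015-07-20"),
    (u"我很好", "45dde2", "2015-07-20"),
]
```

With this, write_first_column(result, "my.txt") writes one Chinese string per line. Passing byte strings that contain non-ASCII data instead of unicode would still raise an error, so this only helps when the driver returns unicode.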

Upvotes: 3
