Lance

Reputation: 123

Encoding/Decoding Unicode and Writing CSV

I am trying to write words from non-Latin-based languages to a CSV file, but I cannot get the words to be written in their proper form.

foreign='а также'
with open('C:\\Users\\Lance\\Desktop\\Programs\\Database Builder\\Russian Test.csv', 'wb') as outfile:
    outfile.write((foreign).encode('utf-8'))

The output of this code is:

Ð° Ñ‚Ð°ÐºÐ¶Ðµ

Thanks!

Upvotes: 0

Views: 2500

Answers (3)

Mel

Reputation: 1

First, install unicodecsv:

pip install unicodecsv

Then import it in your script in place of the standard csv module:

import unicodecsv as csv

Worked for me.

Upvotes: 0

nicky_s

Reputation: 39

First, writing data to a CSV file should go through the csv library. The correct script would be:

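import csv

# Placeholder rows; replace with your own data.
rows = [['a', 'b'], ['c', 'd']]

# Python 2: open in binary mode so the csv module controls line endings.
with open('path/to/test.csv', 'wb') as f:
    writer = csv.writer(f)
    for line in rows:
        writer.writerow(line)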

Second, because the csv library does not support Unicode in Python 2.x, you need an alternative that handles Unicode well, such as https://github.com/jdunck/python-unicodecsv. All you have to do is install this Unicode-aware version of the csv library and add a short import statement as the first line:

pip install unicodecsv
import unicodecsv as csv
...

Remember to make all your string literals Unicode by adding a u prefix in front of each one, for example u'а также'.

Upvotes: -1

Mark Tolonen

Reputation: 178179

It writes the file correctly but you are probably displaying the file using an editor or console that is using Windows-1252 encoding.

Example from US Windows cmd.exe console:

C:\>type "Russian Test.csv"
╨░ ╤é╨░╨║╨╢╨╡
C:\>chcp 1252
Active code page: 1252

C:\>type "Russian Test.csv"
Ð° Ñ‚Ð°ÐºÐ¶Ðµ
C:\>chcp 65001
Active code page: 65001

C:\>type "Russian Test.csv"
а также

Note: code page 65001 is UTF-8 encoding on Windows.
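
The mismatch described in this answer can be reproduced in Python without a console. This is only a minimal sketch: it assumes the viewer is decoding with Windows-1252 (`cp1252`) or the US OEM code page (`cp437`), and the variable names are illustrative.

```python
# Sketch: the file bytes are valid UTF-8; only the decoding used for display differs.
text = 'а также'
raw = text.encode('utf-8')  # the bytes that end up in the file

as_utf8 = raw.decode('utf-8')     # correct reading
as_cp1252 = raw.decode('cp1252')  # what a Windows-1252 viewer would show (assumption)
as_cp437 = raw.decode('cp437')    # what a US OEM console would show (assumption)

# ascii() is used so that printing works on any console encoding.
print(ascii(as_utf8))
print(ascii(as_cp1252))
print(ascii(as_cp437))

# The garbled form still holds the original bytes, so nothing was lost in the file.
print(as_cp1252.encode('cp1252') == raw)
```

Because re-encoding the garbled text with the same wrong code page gives back the original bytes, the file itself is intact and only the viewer's encoding needs to change.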

Since you seem to be using Python 3, you should do this instead and write Unicode strings directly:

foreign='а также'
with open('Russian Test.csv', 'w', encoding='utf8') as outfile:
    outfile.write(foreign)
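
Since the question is about CSV output, the same idea can be combined with the standard csv module in Python 3. The following is a rough sketch rather than the asker's actual program: the sample rows, the temporary output path, and the choice of `utf-8-sig` (which adds a byte order mark that some spreadsheet programs use to detect UTF-8) are all assumptions.

```python
import csv
import os
import tempfile

# Hypothetical sample data for illustration.
rows = [['word', 'meaning'], ['а также', 'and also']]

# Hypothetical output location; a temporary directory keeps the sketch self-contained.
path = os.path.join(tempfile.mkdtemp(), 'russian_test.csv')

# newline='' lets the csv module manage line endings itself.
# 'utf-8-sig' prepends a byte order mark; plain 'utf-8' also works if no BOM is wanted.
with open(path, 'w', encoding='utf-8-sig', newline='') as outfile:
    csv.writer(outfile).writerows(rows)

# Read the file back with the same encoding to confirm the text round-trips.
with open(path, 'r', encoding='utf-8-sig', newline='') as infile:
    read_back = list(csv.reader(infile))

print(read_back == rows)
```

In Python 3 the text is passed to the writer as ordinary `str` objects, and the encoding is chosen once in `open`, so no manual `.encode()` call is needed.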

Upvotes: 3
