Jürgen K.

Reputation: 3487

Convert csv text from utf-16 to ascii or read in correctly

I have problems reading text from a CSV file. An example line from the file looks like this:

"1477-7819-4-45-2 Angiolymphatic Invasion (H & E 400 Ã)."

I guess that the problem is the encoding of the text, so I decided to convert it to ASCII.

This is my python code so far:

text_path = '/some_path/filename.csv'
text_path_ascii = '/some_path/filename_ASCII.csv'

input_codec = 'UTF-16'
output_codec = 'ASCII'

for line in unicode_file:
    unicode_data = unicode_file.read().decode(input_codec)
    #here is another problem => AttributeError: 'str' object has no attribute 'decode'
    unicode_data = unicode_file.read()

ascii_file = open(text_path_ascii, 'w')
ascii_file.write(unicode_data.write(unicode_data.encode(output_codec)))
# same problem=> AttributeError: 'str' object has no attribute 'encode'
ascii_file.write(unicode_data.encode(output_codec))

So my problem is that I don't know how to encode/decode the text.

I am not even sure whether this is the right way to handle the incorrectly written text (yes, the text looks like the given line if you open it in any editor).

Or is there an easier way to read in the CSV text directly, without the "broken" characters?

Thanks for your ideas.

Upvotes: 3

Views: 1216

Answers (1)

Shreyash S Sarnayak

Reputation: 2335

In Python 3, str has no decode method; decode is a method of bytes.
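A minimal sketch of that difference, assuming Python 3. The sample string is made up for illustration and is not taken from the asker's file:

```python
# Hypothetical sample text; the multiplication sign is a non-ASCII character.
sample = "H & E 400 \u00d7"

# Encoding a str produces bytes, which is what a file opened in 'rb' mode returns.
raw = sample.encode("utf-16")

# bytes has decode(); calling it gives back a str.
text = raw.decode("utf-16")

# str has no decode() in Python 3, which is where the AttributeError comes from.
str_has_decode = hasattr(text, "decode")
bytes_has_decode = hasattr(raw, "decode")
print(str_has_decode, bytes_has_decode)
```

So a file opened in text mode already hands you str, and there is nothing left to decode.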

If you want to decode the file, you can let open do it by passing the encoding:

file = open(filename, mode, encoding='utf-8')
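Applied to the question, a rough sketch could look like the following. It assumes the source file really is UTF-16, as the question states (if it is actually UTF-8, pass that as input_codec instead). The function name convert_encoding and the temporary sample file are hypothetical and only there to make the example self-contained:

```python
import os
import tempfile


def convert_encoding(src_path, dst_path, input_codec="utf-16", output_codec="ascii"):
    """Read src_path as input_codec and rewrite it to dst_path as output_codec."""
    # open() decodes while reading, so no manual .decode() call is needed.
    with open(src_path, "r", encoding=input_codec, newline="") as src:
        text = src.read()
    # errors="replace" writes "?" for characters the output codec cannot represent.
    with open(dst_path, "w", encoding=output_codec, errors="replace", newline="") as dst:
        dst.write(text)
    return text


# Build a small UTF-16 sample file (made-up content) so the sketch can run anywhere.
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "sample_utf16.csv")
dst_path = os.path.join(tmp_dir, "sample_ascii.csv")
with open(src_path, "w", encoding="utf-16", newline="") as f:
    f.write("1477-7819-4-45-2\tAngiolymphatic Invasion (H & E 400 \u00d7).\n")

original = convert_encoding(src_path, dst_path)

with open(dst_path, "rb") as f:
    converted = f.read()
print(converted)
```

Note that converting to ASCII loses every non-ASCII character. If you only need to read the text correctly, opening the file with the right encoding is enough and the conversion step can be skipped.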

Upvotes: 1
