Convert csv text from utf-16 to ascii or read in correctly

Question

I have problems reading text from a CSV file. An example line from the file looks like this:

"1477-7819-4-45-2 Angiolymphatic Invasion (H & E 400 Ã)."

I guess that the problem is the encoding of the text, so I decided to convert it to ASCII.

This is my Python code so far:

text_path = '/some_path/filename.csv'
text_path_ascii = '/some_path/filename_ASCII.csv'

input_codec = 'UTF-16'
output_codec = 'ASCII'

for line in unicode_file:
    unicode_data = unicode_file.read().decode(input_codec)
    #here is another problem => AttributeError: 'str' object has no attribute 'decode'
    unicode_data = unicode_file.read()

ascii_file = open(text_path_ascii, 'w')
ascii_file.write(unicode_data.write(unicode_data.encode(output_codec)))
# same problem=> AttributeError: 'str' object has no attribute 'encode'
ascii_file.write(unicode_data.encode(output_codec))

So my problem is that I don't know how to encode/decode the text.
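
For reference, a minimal sketch of the encode/decode direction, assuming Python 3: there `bytes` objects have `decode()` and `str` objects have `encode()`, and a file opened in text mode already returns `str`, which would explain the `'str' object has no attribute 'decode'` error. The sample string, including the `×` character, is only an illustrative assumption about what the line might have contained.

```python
# Assumed sample text; the "×" is a guess used purely for illustration.
sample = "H & E 400 ×"

# str -> bytes: encode() turns text into raw bytes in a given encoding.
raw = sample.encode("utf-16")

# bytes -> str: decode() turns raw bytes back into text.
text = raw.decode("utf-16")

# ASCII cannot represent "×"; errors="replace" substitutes "?" instead of raising.
ascii_bytes = text.encode("ascii", errors="replace")
print(ascii_bytes)  # b'H & E 400 ?'
```

With `errors="ignore"` the unsupported character would be dropped instead of replaced.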

I am not even sure whether this is the right way to handle the garbled text (yes, the text looks like the example line above when you open the file in any editor).

Or is there perhaps an easier way to read the CSV text directly, without "broken" characters?
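
One possible direct approach, sketched under the assumptions that this is Python 3 and that the file really is UTF-16 (if the text already looks garbled in every editor, the true encoding may be something else, and the `encoding` argument would need to change). The helper names `read_csv_rows` and `convert_to_ascii` are hypothetical, chosen only for this example.

```python
import csv


def read_csv_rows(path, encoding="utf-16", delimiter=","):
    """Read a CSV file into a list of rows, decoding with an explicit encoding."""
    # newline="" is what the csv module documentation recommends for file objects.
    with open(path, "r", encoding=encoding, newline="") as f:
        return list(csv.reader(f, delimiter=delimiter))


def convert_to_ascii(src_path, dst_path, src_encoding="utf-16"):
    """Copy a text file to ASCII, replacing unsupported characters with '?'."""
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="ascii", errors="replace", newline="") as dst:
        for line in src:
            dst.write(line)
```

Passing `encoding=` to `open()` lets Python decode the text while reading, so no manual `decode()` call is needed. Converting to ASCII loses every non-ASCII character, so reading with the correct encoding is usually the less destructive option.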

Thanks for your ideas.
