Reputation: 3487
I have problems reading text from a csv file. An example line from the csv file looks like this:
"1477-7819-4-45-2 Angiolymphatic Invasion (H & E 400 Ã)."
I guess that the problem is the encoding of the text, so I decided to convert it to ASCII.
This is my Python code so far:
text_path = '/some_path/filename.csv'
text_path_ascii = '/some_path/filename_ASCII.csv'
input_codec = 'UTF-16'
output_codec = 'ASCII'
for line in unicode_file:
    unicode_data = unicode_file.read().decode(input_codec)
    # here is another problem => AttributeError: 'str' object has no attribute 'decode'
unicode_data = unicode_file.read()
ascii_file = open(text_path_ascii, 'w')
ascii_file.write(unicode_data.write(unicode_data.encode(output_codec)))
# same problem=> AttributeError: 'str' object has no attribute 'encode'
ascii_file.write(unicode_data.encode(output_codec))
So my problem is that I don't know how to encode/decode the text.
I am not even sure whether this is the right way to handle the incorrectly written text (yes, the text looks like the given line if you open it in any editor).
Or is there maybe an easier way to read in the csv text directly, without the "broken" characters?
Thanks for your ideas.
Upvotes: 3
Views: 1216
Reputation: 2335
In Python 3 there is no decode method on str, but there is one on bytes.
If you want to decode the file contents, you can let open do it by passing the encoding:
file = open(filename, mode, encoding='utf-8')
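To make that concrete, here is a minimal sketch of the whole round trip. It assumes the file is really UTF-8 (the stray "Ã" often points to UTF-8 bytes being read with a one-byte codec, but that is a guess about your data) and that the broken character was a "×". The helper name transcode_text_file and the sample line are made up for illustration.

```python
import os
import tempfile


def transcode_text_file(src_path, dst_path, input_codec="utf-8", output_codec="ascii"):
    """Copy src_path to dst_path, decoding with input_codec and encoding with output_codec."""
    # open() decodes on read and encodes on write, so no manual decode()/encode() is needed.
    # errors="replace" writes "?" for characters the output codec cannot represent.
    # newline="" leaves the line endings untouched.
    with open(src_path, "r", encoding=input_codec, newline="") as src, \
         open(dst_path, "w", encoding=output_codec, errors="replace", newline="") as dst:
        for line in src:
            dst.write(line)


# bytes -> str uses decode(); str -> bytes uses encode().
raw = "H & E 400 \u00d7".encode("utf-8")  # bytes
text = raw.decode("utf-8")                # str

# Demo on a temporary file (hypothetical sample data, assumed to be UTF-8).
with tempfile.TemporaryDirectory() as tmp_dir:
    src_path = os.path.join(tmp_dir, "sample.csv")
    dst_path = os.path.join(tmp_dir, "sample_ascii.csv")
    sample = "1477-7819-4-45-2 Angiolymphatic Invasion (H & E 400 \u00d7).\n"
    with open(src_path, "wb") as f:
        f.write(sample.encode("utf-8"))
    transcode_text_file(src_path, dst_path)
    with open(dst_path, "rb") as f:
        result = f.read()

print(result)
```

If the text reads correctly once the right encoding is passed to open, you probably do not need the ASCII conversion at all, since it throws away every non-ASCII character.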
Upvotes: 1