user2390206

Reputation: 113

Ignoring certain characters while looping through CSV rows

I'm using this code to try to print each row of a CSV file:

import csv

f = open('export.csv')
csv_f = csv.reader(f)

for row in csv_f:
    print(row)

Unfortunately, the CSV file contains the character ® on multiple lines, which results in the following error:

UnicodeEncodeError: 'charmap' codec can't encode character '\xae' in position 27: character maps to <undefined>

I have searched through other answers to similar problems and tried different encodings, but I can't quite wrap my head around it enough to make it work. The CSV file seems to be UTF-8 encoded, or at least that's what OpenOffice Calc reports when I open the file on Windows.

Is there any way for me to print the rows while "ignoring" the ® character so that no error is returned? Any alternative solutions would be greatly appreciated, too.

Upvotes: 2

Views: 162

Answers (1)

Jean-François Fabre

Reputation: 140266

If you want to filter out "unprintable/weird" characters (here, anything outside the ASCII range, i.e. with a code point of 128 or above), you can do this:

row = ["aaaaa \xae bbbbb","foo"]

filtered_row = ["".join(c if ord(c)<128 else "." for c in s) for s in row]
print(filtered_row)

Result (every non-ASCII character has been replaced by a dot):

['aaaaa . bbbbb', 'foo']
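
To tie this back to the loop in the question, here is a minimal sketch that applies the same filter to every row coming out of `csv.reader`. The helper name `filter_row` and the in-memory sample data are assumptions made for illustration; with the real file you would presumably pass something like `open('export.csv', encoding='utf-8', newline='')` to the reader instead, assuming the file really is UTF-8 as OpenOffice Calc suggests.

```python
import csv
import io


def filter_row(row, placeholder="."):
    """Replace every non-ASCII character in each field with a placeholder."""
    return [
        "".join(c if ord(c) < 128 else placeholder for c in field)
        for field in row
    ]


# Hypothetical in-memory data standing in for export.csv
sample = io.StringIO("name,price\nWidget\xae,10\n")

# Filter each row as it is read, then print the ASCII-only version
filtered_rows = [filter_row(row) for row in csv.reader(sample)]
for row in filtered_rows:
    print(row)
```

Because only ASCII characters reach `print`, the console's code page no longer matters. The trade-off is that the ® is lost from the output rather than displayed.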

Upvotes: 1
