How to import csv files in Python containing badly formatted quote marks?

Question

I'm trying to load the following test.csv file:

R1C1    R1C2    R1C3
R2C1    R2C2    R2C3
R3C1    "R3C2   R3C3
R4C1    R4C2    R4C3

using this Python script:

import csv

with open("test.csv") as f:
    for row in csv.reader(f, delimiter='\t'):
        print(row)

The result I got was the following:

['R1C1', 'R1C2', 'R1C3']
['R2C1', 'R2C2', 'R2C3']
['R3C1', 'R3C2\tR3C3\nR4C1\tR4C2\tR4C3\n']

It turns out that when csv.reader finds a field whose first character is a quotation mark and there is no closing quotation mark, it treats everything that follows, including tabs and line breaks, as part of the same field.

My question: what is the best approach to make sure every row in the file is read properly? I'm using Python 3.8.5, and the script must be able to read huge files (2 GB or more), so memory usage and performance also need to be considered.

Answers (1)
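
One possible approach, sketched under the assumption that quotation marks in this file never carry any quoting meaning (that is, no field legitimately contains a tab or line break wrapped in quotes): pass quoting=csv.QUOTE_NONE so the reader treats the quote character as ordinary text. The helper name read_tsv_ignoring_quotes and the in-memory sample are only illustrative; with a real file you would pass the object returned by open("test.csv", newline="") instead. The reader still yields one row at a time, so memory use should not grow with file size.

```python
import csv
import io


def read_tsv_ignoring_quotes(lines):
    """Yield rows from tab-separated text, treating '"' as a plain character.

    `lines` is any iterable of strings, for example an open file object.
    (Hypothetical helper name, used here for illustration only.)
    """
    # QUOTE_NONE tells the reader to do no special processing of quote marks.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    yield from reader


# Small in-memory stand-in for test.csv (assumed to be tab-separated).
sample = (
    "R1C1\tR1C2\tR1C3\n"
    "R2C1\tR2C2\tR2C3\n"
    'R3C1\t"R3C2\tR3C3\n'
    "R4C1\tR4C2\tR4C3\n"
)

rows = list(read_tsv_ignoring_quotes(io.StringIO(sample)))
for row in rows:
    print(row)
```

With this setting the stray quotation mark stays in the data (the third row comes back as ['R3C1', '"R3C2', 'R3C3']), so it can be stripped afterwards if it is not wanted. If some fields in the real file are properly quoted and do contain tabs or line breaks, this approach would split them incorrectly and a different strategy would be needed.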
