Reputation: 13
My csv file looks like this:
"City","Name","Comment"
"A","Jay","Like it"
"B","Rosy","Well, good"
...
"K","Anna","Works "fine""
The expected output (dataframe):
City,Name,Comment
A,Jay,'Like it'
B,Rosy,'Well, good'
...
K,Anna,'Works "fine"'
I am trying to read it like this:
df=pd.read_csv("test.csv", sep=',', engine='python',encoding='utf8', quoting=csv.QUOTE_ALL)
It raises this error:
ParserError: unexpected end of data
I need all of the data, so I cannot skip the bad lines with error_bad_lines=False.
How can I fix this issue?
UPDATE: It turns out my original CSV file is missing a quote at the end of the file. I solved the problem by identifying the errors in the file and modifying them.
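For anyone hitting the same error, here is a rough sketch of one way to locate a line with a missing quote: count the double quotes on each line and report the lines where the count is odd. The helper name `find_unbalanced_quote_lines` and the sample text are made up for illustration, and the check assumes that no quoted field spans more than one line. It will not flag inner quote pairs such as "fine", because those keep the count even.

```python
def find_unbalanced_quote_lines(text):
    """Return (line_number, line) pairs whose double-quote count is odd."""
    suspects = []
    for number, line in enumerate(text.splitlines(), start=1):
        # A fully quoted line has an even number of quotes; an odd count
        # suggests an opening or closing quote is missing.
        if line.count('"') % 2 == 1:
            suspects.append((number, line))
    return suspects


# Hypothetical sample: the last line is missing its closing quote.
sample = '"City","Name","Comment"\n"A","Jay","Like it"\n"K","Anna","Works fine\n'

for number, line in find_unbalanced_quote_lines(sample):
    print(number, line)
```

With a real file, the same function could be called on the result of `open('test.csv').read()`.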
Upvotes: 0
Views: 3097
Reputation: 84
I believe the trick is to preprocess the text so that the quotes inside a field are escaped with a backslash, and then read the data:
import re
from io import StringIO
import pandas as pd
data = """
"City","Name","Comment"
"A","Jay","Like it"
"B","Rosy","Well, good"
"K","Anna","Works "fine""
"""
data = re.sub('(?<!^)"(?!,")(?<!,")(?!$)', '\\"', data, flags=re.M)
x = pd.read_csv(StringIO(data), escapechar='\\')
print(x)
Output:
City Name Comment
0 A Jay Like it
1 B Rosy Well, good
2 K Anna Works "fine"
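To make the substitution easier to follow, here is the same pattern written in verbose mode and applied to a single line. This is only an illustrative sketch: the name `INNER_QUOTE` is made up, and the pattern assumes every field is quoted and that the sequence `","` never appears inside a comment.

```python
import re

# The pattern from the answer, spelled out with re.VERBOSE so that each
# assertion can carry a comment.
INNER_QUOTE = re.compile(r'''
    (?<!^)    # not the first character of the line
    "         # a double quote ...
    (?!,")    # ... that is not followed by ,"  (so it does not close a field)
    (?<!,")   # ... and is not preceded by ,    (so it does not open a field)
    (?!$)     # ... and is not the last character of the line
''', flags=re.M | re.X)

line = '"K","Anna","Works "fine""'

# In the replacement string, r'\\"' stands for a backslash followed by a quote.
escaped = INNER_QUOTE.sub(r'\\"', line)
print(escaped)  # "K","Anna","Works \"fine\""
```

Only the two quotes around fine are escaped. The quotes that delimit the fields are left alone, which is why read_csv can then parse the line with escapechar='\\'.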
In theory, the same preprocessing should work when the data comes from the file:
with open('test.csv', 'r') as f:
    # Escape the inner quotes before handing the text to pandas.
    data = re.sub('(?<!^)"(?!,")(?<!,")(?!$)', '\\"', f.read(), flags=re.M)
df = pd.read_csv(StringIO(data), escapechar='\\')
print(df)
Edit: It outputs the following:
City Name Comment
0 A Jay Like it
1 B Rosy Well, good
2 K Anna Works "fine"
From kayoz answer
Edit 2: The last column is easy to wrap in single quotes with a lambda or a similar function.
df['Comment'] = df['Comment'].apply(lambda x: "'" + str(x) + "'")
TL;DR
import re
from io import StringIO
import pandas as pd
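with open('test.csv', 'r') as f:
    # Escape the inner quotes so that read_csv can parse every row.
    data = re.sub('(?<!^)"(?!,")(?<!,")(?!$)', '\\"', f.read(), flags=re.M)
df = pd.read_csv(StringIO(data), escapechar='\\')
# Wrap each comment in single quotes to match the expected output.
df['Comment'] = df['Comment'].apply(lambda x: "'" + str(x) + "'")
print(df)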
City Name Comment
0 A Jay 'Like it'
1 B Rosy 'Well, good'
2 K Anna 'Works "fine"'
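Since the question stresses that no row may be lost, one possible sanity check is to parse the escaped text with the standard library csv module and confirm that every row still has three fields. This is a sketch under the assumption that the file looks like the sample in the question. The inline `raw` string stands in for the contents of test.csv.

```python
import csv
import re
from io import StringIO

# Stand-in for the contents of test.csv (sample rows from the question).
raw = '''"City","Name","Comment"
"A","Jay","Like it"
"B","Rosy","Well, good"
"K","Anna","Works "fine""
'''

# Same substitution as above: put a backslash in front of each inner quote.
escaped = re.sub('(?<!^)"(?!,")(?<!,")(?!$)', r'\\"', raw, flags=re.M)

# csv.reader understands the same escapechar convention as read_csv.
rows = list(csv.reader(StringIO(escaped), escapechar='\\'))

for row in rows:
    print(len(row), row)
```

If any row reports a field count other than three, that line probably needs a manual fix before it goes to pandas.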
Upvotes: 1