user319436

Reputation: 173

How to read csv lines with pandas containing " and ' between quoting character "?

I'm trying to import a CSV file with pandas read_csv and can't get lines like the following to parse:

"","",""BSF" code - Intermittant, see notes",""

I can get past it with the options error_bad_lines=False, low_memory=False, engine='c', but that only skips the offending lines, and it should be possible to parse them correctly. I'm not good with regular expressions, so I haven't tried engine='python' with a regex sep yet. Thanks for any help.

Upvotes: 0

Views: 290

Answers (1)

elzell

Reputation: 2306

Well, that's quite a hard one. Given that all fields are quoted, you could use a regex separator that matches a comma only when it is preceded and followed by a " (a regex separator requires the python engine):

data = pd.read_csv(filename, sep=r'(?<="),(?=")', quotechar='"', engine='python')

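However, you will still end up with quotes around all fields, because the regex split ignores quoting. You can strip them afterwards (on pandas 2.1 and later, DataFrame.map replaces the deprecated applymap):

data = data.applymap(lambda s: s[1:-1])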
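
To see what the separator pattern and the quote-stripping step do, here is a minimal sketch that applies the same regex to the sample line from the question using only Python's re module rather than pandas. The helper name split_quoted_line is made up for illustration, and the approach assumes every field is quoted and that no field contains the sequence "," itself.

```python
import re

# Match a comma only when it sits directly between two double quotes,
# i.e. at a real field boundary in a fully quoted line.
FIELD_SEP = re.compile(r'(?<="),(?=")')


def split_quoted_line(line):
    """Split a fully quoted CSV line and drop the outer quotes of each field.

    Hypothetical helper for illustration; assumes every field is wrapped
    in double quotes and no field contains the sequence '","'.
    """
    raw_fields = FIELD_SEP.split(line.rstrip("\r\n"))
    # Each raw field still carries its surrounding quotes; slice them off.
    return [field[1:-1] for field in raw_fields]


sample = '"","",""BSF" code - Intermittant, see notes",""'
fields = split_quoted_line(sample)
print(fields)
```

The comma inside the third field is left alone because it is preceded by a letter, not a quote, so the inner "BSF" quotes and the embedded comma both survive in a single field.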

Upvotes: 1
