vkaul11

Reputation: 4214

can't read quotes correctly with pandas read_csv

I have a file test.tsv in which some rows contain double quotes. When pandas reaches an opening quote, it stops treating the newline character as a row delimiter. So if I have a file

" m     1
what does comoda mean   1
the poke co     1
dmf     1
"g      1

and I use

test = pd.read_csv("test.tsv", 
                  sep='\t')

I get all the rows as one row

 m\t1\nwhat does comoda mean\t1\nthe poke co\t1\ndmf\t1\ng  1

I want to keep all rows intact and get the output

" m     1
what does comoda mean   1
the poke co     1
dmf     1
"g      1

Is there a way to solve this double quote issue? Wherever a double quote is opened, every following row is merged into a single row until a closing double quote is found. After that, the rows are interpreted correctly.

Upvotes: 0

Views: 735

Answers (1)

ApproachingDarknessFish

Reputation: 14313

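By default, pandas.read_csv treats a double quote as the start of a quoted field, and such a field can span line breaks until the closing quote is found. You can control this behavior with the quoting keyword parameter of pandas.read_csv. In your case, you can disable quote handling entirely like so: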

>>> import pandas as pd
>>> import csv

>>> pd.read_csv("test.tsv", sep='\t', quoting=csv.QUOTE_NONE)                 

                     " m  1
0  what does comoda mean  1
1            the poke co  1
2                    dmf  1
3                     "g  1

Note that the first row is being interpreted as a column header. Pass header=None to prevent that.
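
For completeness, here is a minimal sketch that combines both options. It assumes the file content from the question and reads it from an in-memory string instead of test.tsv so that it is self-contained. The column names "text" and "count" are made up for illustration and are not part of the original data.

```python
import csv
import io

import pandas as pd

# Stand-in for test.tsv from the question: tab-separated, with unbalanced quotes.
data = '" m\t1\nwhat does comoda mean\t1\nthe poke co\t1\ndmf\t1\n"g\t1\n'

df = pd.read_csv(
    io.StringIO(data),
    sep="\t",
    quoting=csv.QUOTE_NONE,   # treat the double quote as an ordinary character
    header=None,              # do not consume the first row as column names
    names=["text", "count"],  # hypothetical column names, chosen for this example
)

print(df)
```

With these settings, all five lines should come back as separate rows, and the double quotes are kept as literal characters in the first column.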

Upvotes: 1
