YTsa

Reputation: 55

CSV Reader Removing Double Quotes from First Field

I have a file that contains a tab-delimited header and a data line, like so:

ID  Field1
test1   "A","B"

Here's my parsing script.

import csv

# dataFile is the path to the tab-delimited file shown above
with open(dataFile) as tsv:
    for line in csv.reader(tsv, delimiter='\t'):
        print(line)

And the output:

['ID', 'Field1']
['test1', 'A,"B"']

I can't figure out why it's stripping the double quotes on the first quoted item of the second field. I've tried different dialects and settings for csv reader with no success.

Upvotes: 2

Views: 1486

Answers (3)

Aysu Sayın

Reputation: 301

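The default quote character for csv.reader is the double quote, so the reader treats the leading " of the second field as the start of a quoted field and strips that pair of quotes. Changing quotechar to a character that does not appear in your data, such as '|', will solve your problem. You can do it like this: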

with open(dataFile) as tsv:
    for line in csv.reader(tsv, delimiter='\t', quotechar='|'):
        print(line)

From https://docs.python.org/3/library/csv.html#csv.Dialect.quotechar:

Dialect.quotechar

A one-character string used to quote fields containing special characters, such as the delimiter or quotechar, or which contain new-line characters. It defaults to '"'.

EDIT:

You can also use the quoting=csv.QUOTE_NONE option to disable quoting.
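
For anyone who wants to reproduce this without a file, here is a minimal sketch that runs the sample line from the question through the three reader configurations discussed here. The SAMPLE string and the parse helper are my own names for illustration, and io.StringIO stands in for the opened file.

```python
import csv
import io

# Sample data copied from the question: tab-delimited, and the second
# field holds double-quoted items separated by a comma.
SAMPLE = 'ID\tField1\ntest1\t"A","B"\n'


def parse(**reader_options):
    """Parse SAMPLE as tab-delimited text, passing extra options to csv.reader."""
    return list(csv.reader(io.StringIO(SAMPLE), delimiter='\t', **reader_options))


default_rows = parse()                    # default quotechar is '"'
pipe_rows = parse(quotechar='|')          # '"' is now an ordinary character
raw_rows = parse(quoting=csv.QUOTE_NONE)  # no quote processing at all

print(default_rows[1])  # ['test1', 'A,"B"']
print(pipe_rows[1])     # ['test1', '"A","B"']
print(raw_rows[1])      # ['test1', '"A","B"']
```

Note that quotechar='|' only moves the special meaning to another character: a field that starts with | would then be treated as quoted. If the file has no quoting at the tab level, QUOTE_NONE is probably the safer of the two.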

Upvotes: 3

martineau

Reputation: 123541

You just need to tell the csv.reader to ignore quoting, via the csv.QUOTE_NONE option:

with open(dataFile) as tsv:
    for line in csv.reader(tsv, delimiter='\t', quoting=csv.QUOTE_NONE):
        print(line)

Output:

['ID', 'Field1']
['test1', '"A","B"']

Upvotes: 2

Nick Juelich

Reputation: 436

It seems you are delimiting on the tab and not actually splitting on the comma. I would change your code to reflect this.
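
One way to do that, as a rough sketch: split each line on tabs with quoting disabled, then run the second column through a second csv.reader that uses the default comma delimiter and double-quote handling. This assumes every row has exactly two columns, as in the sample; SAMPLE and split_field are hypothetical names, and io.StringIO stands in for the opened file.

```python
import csv
import io

# Sample data copied from the question (two tab-separated columns per row).
SAMPLE = 'ID\tField1\ntest1\t"A","B"\n'


def split_field(value):
    """Split one comma-separated, double-quoted field into its items."""
    # csv.reader accepts any iterable of strings, so a one-element list works.
    return next(csv.reader([value]))


rows = []
# First pass: split on tabs only and leave the double quotes untouched.
for record in csv.reader(io.StringIO(SAMPLE), delimiter='\t', quoting=csv.QUOTE_NONE):
    # Second pass: parse the second column as ordinary comma-delimited CSV.
    rows.append([record[0], split_field(record[1])])

print(rows)  # [['ID', ['Field1']], ['test1', ['A', 'B']]]
```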

Upvotes: 0
