John Doe

Reputation: 191

How to eliminate strangely inserted quotes from a csv in Python

I have a csv file with inconsistently placed quotation marks, and I am trying to remove them in Python. Something like the code below works fine if there is only one set of double quotes per line in the csv file, but it gets thrown off if there are multiple sets, as in the fourth line of the file (the third line of data after the header).

I have tried a number of different methods, but I always seem to end up with the elements combined incorrectly.

Sample csv:

First,Nickname,Last,Sport
Bill,Bats,Smith,Baseball
Tom,Kicks,Johnson,Soccer
"John,"Footy",Jacobsen,Football"
Mike,"Mikey",Jones,Basketball

My Code:

import csv
with open('fake.csv', mode='r', encoding='utf-8') as infile:
    reader = csv.reader(infile)
    for line in reader:
        if len(line) < 4:
            for i in range(0, len(line)):
                line[i].strip('"')
                line[i].replace('"', '')
        print(line)
        print(line[0] + line[2])

Desired output:

['First', 'Nickname', 'Last', 'Sport']
FirstLast
['Bill', 'Bats', 'Smith', 'Baseball']
BillSmith
['Tom', 'Kicks', 'Johnson', 'Soccer']
TomJohnson
['John', 'Footy', 'Jacobsen', 'Football']
JohnJacobsen
['Mike', 'Mikey', 'Jones', 'Basketball']
MikeJones

My Output:

['First', 'Nickname', 'Last', 'Sport']
FirstLast
['Bill', 'Bats', 'Smith', 'Baseball']
BillSmith
['Tom', 'Kicks', 'Johnson', 'Soccer']
TomJohnson
['John,Footy"', 'Jacobsen', 'Football"']
John,Footy"Football"
['Mike', 'Mikey', 'Jones', 'Basketball']
MikeJones

Any help would be appreciated.

Upvotes: 0

Views: 558

Answers (1)

Peter DeGlopper

Reputation: 37319

The reader expects quote characters to wrap entries that contain your delimiter, so for this input it's working as expected: the opening quote on the fourth line makes it read John,"Footy" as a single entry. If your input contains unbalanced or inaccurate quoting, as in this example, one option is to tell the reader not to treat quotes specially at all:

reader = csv.reader(infile, quoting=csv.QUOTE_NONE)

You'd then have to process quotes yourself, so this is not the best choice if your input is consistently quoted.
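
One way to do that processing is sketched below, assuming that every double quote in the file is unwanted and that no field legitimately contains a comma. The names SAMPLE and read_rows_without_quotes are made up for illustration, and the sample data is inlined through io.StringIO so the snippet runs without fake.csv. Note that str.strip and str.replace return new strings rather than modifying the original, so the calls in the question's loop have no effect unless their results are stored.

```python
import csv
import io

# Sample data from the question, inlined so the sketch runs without a file.
SAMPLE = '''First,Nickname,Last,Sport
Bill,Bats,Smith,Baseball
Tom,Kicks,Johnson,Soccer
"John,"Footy",Jacobsen,Football"
Mike,"Mikey",Jones,Basketball
'''


def read_rows_without_quotes(infile):
    """Yield each row with every double quote removed from every field.

    Hypothetical helper: it assumes no field needs to keep its quotes.
    """
    # QUOTE_NONE makes the reader split on commas only and keep quote
    # characters as ordinary text inside the fields.
    reader = csv.reader(infile, quoting=csv.QUOTE_NONE)
    for row in reader:
        # str.replace returns a new string, so the result has to be kept.
        yield [field.replace('"', '') for field in row]


rows = list(read_rows_without_quotes(io.StringIO(SAMPLE)))
for row in rows:
    print(row)
    print(row[0] + row[2])
```

To read the real file instead, pass the object returned by open('fake.csv', mode='r', encoding='utf-8', newline='') in place of the StringIO. If some fields do contain commas inside properly balanced quotes, this approach would split them, which is the trade-off mentioned above.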

Upvotes: 2
