Ben Jeffrey
Ben Jeffrey

Reputation: 951

Removing square brackets and quotation marks which are added to rows in a csv file when it is opened in python?

I'm trying to write a script that will open a csv file, find every row in that csv which has a value at index 5 that matches a value in a list called "matches", and append that row to a csv which is named after the value at index 5. So far I have this:

import csv

def write_to_file(x, y):
    f = open(y + ".csv", 'a',) 
    f.write(x + '\n')
    f.close()

with open('/Users/bj5/Desktop/test1.csv', 'rb') as a:
    reader = csv.reader(a)
    for row in reader:
        name = row[5]
        if name in matches:
            row = str(row)
            write_to_file(row, name)

Which sort of does the job, the problem is that each row is encapsulated within square brackets and each cell within speech marks:

['metadata_cleaned_29042013', '23541540', 'ERS075571', '', '36809', 'Mycobacterium_abscessus', '', '2010', '3', '', '', 'UK', 'Papworth Everard', '', '', '5.0M']
['metadata_cleaned_29042013', '23541540', 'ERS075570', '', '36809', 'Mycobacterium_abscessus', '', '2009', '7', '', '', 'UK', 'Papworth Everard', '', '', '5.1M']
['metadata_cleaned_29042013', '23541540', 'ERS075569', '', '36809', 'Mycobacterium_abscessus', '', '2010', '3', '', '', 'UK', 'Papworth Everard', '', '', '5.0M']

I want to get rid of this so each output file will look like this:

metadata_cleaned_29042013   23541540    ERS075571       36809   Mycobacterium_abscessus     2010    3           UK  Papworth Everard            5.0M
metadata_cleaned_29042013   23541540    ERS075570       36809   Mycobacterium_abscessus     2009    7           UK  Papworth Everard            5.1M
metadata_cleaned_29042013   23541540    ERS075569       36809   Mycobacterium_abscessus     2010    3           UK  Papworth Everard            5.0M

I thought this might work but it didn't change anything:

 with open('/Users/bj5/Desktop/test1.csv', 'rb') as a:
        reader = csv.reader(a)
        for row in reader:
            name = row[5]
            if name in matches:
                row = str(row)
                row.replace("[",'')
                row.replace("]",'')
                row.replace("''",'')
                write_to_file(row, name)

Can anyone tell me why this didn't work and what I can do to get rid of the square brackets and quotation marks?

Upvotes: 1

Views: 2978

Answers (2)

Daniel Roseman
Daniel Roseman

Reputation: 599630

You're thinking about this wrong. You use csv.reader to read the CSV file and convert it into a Python data structure. So you should be using csv.writer to convert it back from a Python data structure into a CSV, rather than writing the raw data to a file.

You also shouldn't be reopening the file each time; there's no need for that.

with open('/Users/bj5/Desktop/test1.csv', 'rb') as a:
  with open(name, 'w') as dest:
    reader = csv.reader(a)
    writer = csv.writer(dest)
    for row in reader:
        name = row[5]
        if name in matches:
            writer.writerow(row)

Upvotes: 1

bruno desthuilliers
bruno desthuilliers

Reputation: 77912

What's wrong with the csv.writer actually ?

with open('/path/to/source.csv', 'rb') as source, open('/path/to/dest.csv','wb') as dest:
    reader = csv.reader(source)
    writer = csv.writer(dest)
    for row in reader:
       if row[5] in matches:
           writer.writerow(row)

Upvotes: 1

Related Questions