Actuary

Reputation: 169

Writing csv with quotes around strings (Python)

I have written the following code to take a large csv file, and split it into multiple csv files based on a particular word in a column. The original csv file has some fields that are strings, and they have quotes around them.

For example:

Field1,Field2,Field3,Field4
1,2,"red",3
1,4,"red",4
3,4,"blue",4

etc.

My code splits the file into separate csvs based on Field4.

My output looks like this:

3.csv
Field1,Field2,Field3,Field4
1,2,red,3

4.csv
Field1,Field2,Field3,Field4
1,4,red,4
3,4,blue,4

I want my output to maintain quotes around the strings in field 3. The files are fed into a piece of software that only works if strings have quotes around them, which is quite annoying.

My current code looks like this:

import csv

#Creates empty set - this will be used to store the values that have already been used
newfilelist = set()

#Opens the large csv file in "read" mode
with open('File.csv', 'r') as csvfile:
    
    #Read the first row of the large file and store the whole row as a string (headerstring)
    read_rows = csv.reader(csvfile)
    headerrow = next(read_rows)
    headerstring=','.join(headerrow) 
    for row in read_rows:
        
        #Store the whole row as a string (rowstring)
        rowstring=','.join(row)
        
        #Takes Field 4
        newfilename = (row[3])
        
        
        #The header row was already consumed by next() above, so every row here is data.
        #If the newfilename is not in the newfilelist set, add it and create a new csv file with the header row.
        if newfilename not in newfilelist:
            newfilelist.add(newfilename)
            with open('//output/' +str(newfilename)+'.csv','a') as f:
                f.write(headerstring)
                f.write("\n")

        #Append the current row to the csv file for this value (the with block closes the file).
        with open('//output/' +str(newfilename)+'.csv','a') as f:
            f.write(rowstring)
            f.write("\n")
 

Can anybody advise me how to get the quotes around the strings? Unfortunately the software that uses my files requires them to be in this format!

Upvotes: 8

Views: 9199

Answers (2)

WombatPM

Reputation: 2609

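csv.writer might be overkill for what you are trying to do. If you want the entire line unchanged, quotes included, just write the entire line as it was read. Note that splitting on commas like this assumes no field contains an embedded comma.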

#Creates empty dictionary - maps each Field4 value to its open output file
newfilelist = {}

#Opens the large csv file in "read" mode
with open('File.csv', 'r') as csvfile:

    #Read the first row of the large file and store the whole row as a string (headerstring)
    headerstring = csvfile.readline()
    for row in csvfile:

        #Takes Field 4 (strip the trailing newline first, then any surrounding quotes)
        newfilename = row.strip().split(',')[3].strip('"')

        #If the newfilename is not in the dictionary yet, open a new csv file, store the file object, and write the header row.
        if newfilename not in newfilelist:
            newfilelist[newfilename] = open('//output/' +str(newfilename)+'.csv','w')
            newfilelist[newfilename].write(headerstring)

        newfilelist[newfilename].write(row)  # Write the unchanged row to that file

#Close all open files
for f in newfilelist.values():
    f.close()

Upvotes: 0

Ignacio Vazquez-Abrams

Reputation: 798536

Pass quoting=csv.QUOTE_NONNUMERIC to csv.writer().
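
A minimal sketch of how that could look, with some assumptions of mine: csv.reader returns every field as a string and drops the original quotes, so the writer would quote everything unless the numeric fields are converted back to numbers first. The helpers to_number and split_by_field4 are hypothetical names, the sketch only handles integer fields, and it uses in-memory buffers instead of the files under //output/ so it can be run as-is.

```python
import csv
import io


def to_number(value):
    """Return an int for integer-like text; otherwise return the text unchanged."""
    try:
        return int(value)
    except ValueError:
        return value


def split_by_field4(text):
    """Split CSV text into one CSV string per Field4 value (hypothetical helper)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    buffers = {}  # Field4 value -> in-memory output file
    writers = {}  # Field4 value -> csv.writer for that file
    for row in reader:
        key = row[3]
        if key not in buffers:
            buffers[key] = io.StringIO()
            # Header is written unquoted, as in the question's sample files.
            buffers[key].write(','.join(header) + '\n')
            writers[key] = csv.writer(
                buffers[key],
                quoting=csv.QUOTE_NONNUMERIC,
                lineterminator='\n',
            )
        # Convert integer text to int so that only real strings get quoted.
        writers[key].writerow([to_number(v) for v in row])
    return {key: buf.getvalue() for key, buf in buffers.items()}


sample = 'Field1,Field2,Field3,Field4\n1,2,"red",3\n1,4,"red",4\n3,4,"blue",4\n'
result = split_by_field4(sample)
print(result['4'])
```

One caveat with this approach: a quoted field that happens to contain only digits in the original file would come out unquoted, because the reader does not record which fields were quoted. If that matters, writing the raw lines unchanged is the safer route.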

Upvotes: 17
