ArPy

Reputation: 15

Issue with headers while splitting CSV file [Python 3]

I am new to Stack Overflow, so if I have made any formal mistakes in this post, please correct me; it would be appreciated! Now to the main topic: I have an issue with headers while splitting a big CSV file into smaller ones. The general idea is to split the file according to the column at index 1 (Country in the example below) and create smaller files that each include the column names. For instance:

Fruit       Country       Color
apple       Poland        red
banana      Argentina     yellow
pineapple   Argentina     brown
pear        Poland        green
melon       Turkey        yellow
plum        Poland        violet
peach       Turkey        orange
grenade     Argentina     violet

The code should generate 3 different files (Poland.csv, Turkey.csv, Argentina.csv).

So far I've written the code below, which splits the CSV correctly but does not add the headers properly (they are written on every iteration). Do you have any ideas how I can deal with this?

import csv

opener = open('file.csv', 'r', encoding='utf-8')  
csvreader = csv.reader(opener, delimiter=';')        
header = next(csvreader)

def splitter(u):                                   
    for row in u:
        with open(row[1] + '.csv', 'a', encoding='utf-8', newline='') as myfile:
          writer = csv.writer(myfile, delimiter=';', quotechar='|', quoting=csv.QUOTE_MINIMAL)
          writer.writerow(header)
          writer.writerow(row)

    myfile.close()

splitter(csvreader)

Upvotes: 0

Views: 197

Answers (2)

Mr_Z

Reputation: 539

This fixes the problem:

import csv

opener = open('file.csv', 'r', encoding='utf-8')  
csvreader = csv.reader(opener, delimiter=';')        
header = next(csvreader)

def splitter(u):
    tableNames = set()  # countries whose file already got a header in this run
    for row in u:
        with open(row[1] + '.csv', 'a', encoding='utf-8', newline='') as myfile:
            writer = csv.writer(myfile, delimiter=';', quotechar='|', quoting=csv.QUOTE_MINIMAL)
            if row[1] not in tableNames:
                writer.writerow(header)
                tableNames.add(row[1])
            writer.writerow(row)
    # No myfile.close() needed: the with block closes the file.

splitter(csvreader)
opener.close()
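
One caveat with append mode ('a'): the in-memory record only covers headers written during the current run, so running the script again on top of existing output files adds a second header. Below is a minimal sketch of one way around that, assuming it is acceptable to treat a missing or zero-byte file as "needs a header". The names `append_row_with_header` and `split_by_column`, and the `out_dir` and `key_index` parameters, are made up for illustration and are not from the question.

```python
import csv
import os


def append_row_with_header(path, header, row, delimiter=';'):
    """Append row to path; write header first only if the file is new or empty."""
    # Assumption: a missing or zero-byte file is the only case that needs a header.
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, 'a', encoding='utf-8', newline='') as f:
        writer = csv.writer(f, delimiter=delimiter)
        if needs_header:
            writer.writerow(header)
        writer.writerow(row)


def split_by_column(source_path, out_dir='.', key_index=1, delimiter=';'):
    """Write each row of source_path to <out_dir>/<value of key column>.csv."""
    with open(source_path, 'r', encoding='utf-8', newline='') as src:
        reader = csv.reader(src, delimiter=delimiter)
        header = next(reader)  # first line holds the column names
        for row in reader:
            target = os.path.join(out_dir, row[key_index] + '.csv')
            append_row_with_header(target, header, row, delimiter)
```

Note that a second run still appends the data rows again; this only keeps the header from being repeated.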

Upvotes: 0

Neil

Reputation: 3291

Try something like this (quick and dirty but should work):

def splitter(u):
    filenames_already_opened = set()  # Files that already got a header in this run.
    for row in u:
        filename = row[1] + '.csv'
        with open(filename, 'a', encoding='utf-8', newline='') as myfile:
            writer = csv.writer(myfile, delimiter=';', quotechar='|', quoting=csv.QUOTE_MINIMAL)
            if filename not in filenames_already_opened:  # Add the header only on first use.
                writer.writerow(header)
                filenames_already_opened.add(filename)
            writer.writerow(row)
    # No myfile.close() needed: the with block closes the file.
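
If reopening the output file for every row turns out to be slow on a big input, a possible variation is to open each output file once and keep its writer in a dict. This is only a sketch under assumptions: it opens files in 'w' mode (so old output is overwritten), it assumes the number of distinct values stays below the OS limit on open files, and `split_csv_once`, `out_dir` and `key_index` are hypothetical names, not part of the question.

```python
import csv
import os
from contextlib import ExitStack


def split_csv_once(source_path, out_dir='.', key_index=1, delimiter=';'):
    """Split source_path into one CSV per distinct value in column key_index."""
    writers = {}  # key value -> csv.writer for that value's output file
    # ExitStack closes every output file opened below when the block exits.
    with ExitStack() as stack, \
            open(source_path, 'r', encoding='utf-8', newline='') as src:
        reader = csv.reader(src, delimiter=delimiter)
        header = next(reader)  # first line holds the column names
        for row in reader:
            key = row[key_index]
            if key not in writers:
                # First time this value is seen: create its file, write the header.
                path = os.path.join(out_dir, key + '.csv')
                out = stack.enter_context(
                    open(path, 'w', encoding='utf-8', newline=''))
                writers[key] = csv.writer(out, delimiter=delimiter)
                writers[key].writerow(header)
            writers[key].writerow(row)
    return sorted(writers)  # the distinct key values that got a file
```

Because the header is written at the moment a file is created, no separate "already seen" list is needed.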

Upvotes: 1
