jmk
jmk

Reputation: 486

Writing to csv with 'a' method: headers and new values

I'm opening a csv file with the append method 'a' to write a new row as the result of each calling of a function, using DictWriter (only writing one new row from each call).

keys = ['15000', '15001', '15002', '15003']

values = [18, 20, 3, 9] ##changes each time function run

The values change every time I fun the function. The keys generally stay the same but as I am scraping from an XML file more may be added in future (with correlating new values). I'd therefore like to be able to map the values to the fieldnames consistently.

with open('myfile.csv', 'a') as csvfile:
    w = csv.DictWriter(csvfile, fieldnames = keys, lineterminator='\n')
    w.writerheader()
    w.writerow(dict(zip(fnames, values)))

The problem here is that each time writerheader() writes the header under the previous row. The resultant csv looks like:

15000, 15001, 15002, 15003  
18, 20, 3, 9   
15000, 15001, 15002, 15003  
20, 15, 4, 12

I'd like this not to have the repeated header. I'm aware that I could start off with a csv with only the initial keys as the header, but this doesn't solve my problem if the dictionary keys change on future function calls. Also, if new keys are added is this going to map values to keys consistently? Will I run into problems?

P.S. note that I added lineterminated = '\n' para as the default DictWriter method was creating a blank line on each append.

Upvotes: 2

Views: 1454

Answers (1)

mhawke
mhawke

Reputation: 87084

It sounds like you will need to rewrite the whole CSV file whenever the columns change.

If a new column is appended to the existing keys the current data rows will need to have an empty field appended. Or, instead of an empty field, you might have some default value that should be added.

So, you will need to open the existing CSV file with a DictReader. Get the fieldnames and append the new column(s). Then read each row from the original file, optionally updating the dictionary with a default value for the new column. Then write that to a new file with a DictWriter configured with the new fieldnames. Finally add your new row(s).

import os
import shutil
from csv import DictReader, DictWriter
from tempfile import NamedTemporaryFile

def append_csv(csv_filename, data):
    with open(csv_filename, 'a+') as csv_file:
        reader = DictReader(csv_file)
        new_keys = set(data.keys()).difference(reader.fieldnames)
        if not new_keys:
            csv_file.seek(0, os.SEEK_END)
            writer = DictWriter(csv_file, reader.fieldnames)
            writer.writerow(data)
        else:
            reader.fieldnames.extend(sorted(new_keys))
            with NamedTemporaryFile(dir='.', delete=False) as csv_tmpfile:
                writer = DictWriter(csv_tmpfile, reader.fieldnames, lineterminator='\n')
                writer.writeheader()
                writer.writerows(row for row in reader)
                writer.writerow(data)
            shutil.move(csv_tmpfile.name, csv_filename)

This is going to be very inefficient if the fields change often. Instead you might want to write to a separate CSV file whenever the columns change and then perform a merge at a later time.

Upvotes: 2

Related Questions