Steve
Steve

Reputation: 331

Keep header while appending to Pandas dataframe w/ Python

So I am parsing data contained in many files, looping through them and storing certain elements in a list, and appending each resulting list to a dataframe with Pandas using Python.

It works, except I can't figure out how to keep the header row while appending. It either disappears or is duplicated with each append.

The below code serves as an example of the latest code:

import pandas as pd

for i in range(1,4):
    data = [{'name': 'Company'+str(i), 'city': 'New York'}]

    stuff = []
    for element in data:
        stuff.append(element)

    df = pd.DataFrame(columns=["name",
                               "city"])

    for record in stuff:
        df = df.append(record, ignore_index=True)

    df.to_csv('test.csv', mode='a', header=False, index=False)

With this code the output (csv file) is:

Company1    New York
Company2    New York
Company3    New York

But I am looking for the output to be:

name        city
Company1    New York
Company2    New York
Company3    New York

Thanks!

Upvotes: 3

Views: 9188

Answers (1)

EdChum
EdChum

Reputation: 394003

But you've set header=False:

df.to_csv('test.csv', mode='a', header=False, index=False)

you should do:

df.to_csv('test.csv', mode='a', header=True, index=False)

You just need to set it to True for the first iteration and then False for the subsequent iterations

Basically you just do something like the following:

# add this outside your for loop
writeHeader = True

# use the following where you write to csv
if writeHeader is True:
    df.to_csv('test.csv', mode='a', header=True, index=False)
    writeHeader = False
else:
    df.to_csv('test.csv', mode='a', header=False, index=False)

or similar

So the complete thing looks like:

import pandas as pd
writeHeader = True

for i in range(1,4):
    data = [{'name': 'Company'+str(i), 'city': 'New York'}]

    stuff = []
    for element in data:
        stuff.append(element)

    df = pd.DataFrame(columns=["name",
                               "city"])

    for record in stuff:
        df = df.append(record, ignore_index=True)

    if writeHeader is True:    
        df.to_csv('test.csv', mode='a', header=True, index=False)
        writeHeader = False
    else:
        df.to_csv('test.csv', mode='a', header=False, index=False)

Upvotes: 10

Related Questions