Reputation: 331
So I am parsing data contained in many files, looping through them and storing certain elements in a list, and appending each resulting list to a dataframe with Pandas using Python.
It works, except I can't figure out how to keep the header row while appending. It either disappears or is duplicated with each append.
The below code serves as an example of the latest code:
import pandas as pd
for i in range(1,4):
data = [{'name': 'Company'+str(i), 'city': 'New York'}]
stuff = []
for element in data:
stuff.append(element)
df = pd.DataFrame(columns=["name",
"city"])
for record in stuff:
df = df.append(record, ignore_index=True)
df.to_csv('test.csv', mode='a', header=False, index=False)
With this code the output (csv file) is:
Company1 New York
Company2 New York
Company3 New York
But I am looking for the output to be:
name city
Company1 New York
Company2 New York
Company3 New York
Thanks!
Upvotes: 3
Views: 9188
Reputation: 394003
But you've set header=False
:
df.to_csv('test.csv', mode='a', header=False, index=False)
you should do:
df.to_csv('test.csv', mode='a', header=True, index=False)
You just need to set it to True
for the first iteration and then False
for the subsequent iterations
Basically you just do something like the following:
# add this outside your for loop
writeHeader = True
# use the following where you write to csv
if writeHeader is True:
df.to_csv('test.csv', mode='a', header=True, index=False)
writeHeader = False
else:
df.to_csv('test.csv', mode='a', header=False, index=False)
or similar
So the complete thing looks like:
import pandas as pd
writeHeader = True
for i in range(1,4):
data = [{'name': 'Company'+str(i), 'city': 'New York'}]
stuff = []
for element in data:
stuff.append(element)
df = pd.DataFrame(columns=["name",
"city"])
for record in stuff:
df = df.append(record, ignore_index=True)
if writeHeader is True:
df.to_csv('test.csv', mode='a', header=True, index=False)
writeHeader = False
else:
df.to_csv('test.csv', mode='a', header=False, index=False)
Upvotes: 10