Reputation: 15691
I want to append pandas data frames to the ends of CSV files. The tricky part is when I append rows, some of the times the columns may be different. I want code like this
a = pd.DataFrame([[1, 2]], columns= ["one", "two"])
with open("learn.csv", "w") as f:
a.to_csv(f, header=True)
a = pd.DataFrame([[1, 2]], columns= ["one", "three"])
with open("learn.csv", "a") as f:
a.to_csv(f)
to produce a CSV file that looks like this:
one, two, three
1, 2, None
1, None, 2
Upvotes: 2
Views: 6442
Reputation: 15691
Here is the answer I came up with using alko's post and the comment from above. "a" is the dataframe:
if not os.path.isfile("learn.csv"):
with open("learn.csv", "w") as f:
a.to_csv(f, header=True, index=False)
else:
reader = csv.reader(open("learn.csv"))
csv_col = set(reader.next())
games_col = set(list(a.columns))
if csv_col.issuperset(games_col):
with open("learn.csv", "a") as f:
a.to_csv(f, header=False, index=False)
else:
old_entries = pd.read_csv('learn.csv')
all_entries = pd.concat([old_entries, a])
with open("learn.csv", "w") as f:
all_entries.to_csv(f, header=True, index=False)
Upvotes: 0
Reputation: 48317
You have to concatenate your dataframes prior to saving to csv, as you have to know all the resulting columns to be able to save data properly, which is unknown to each dataframe alone. Following will do:
>>> from StringIO import StringIO
>>> buf = StringIO()
>>> a = pd.DataFrame([[1, 2]], columns= ["one", "two"])
>>> b = pd.DataFrame([[1, 2]], columns= ["one", "three"])
>>> pd.concat([a, b]).to_csv(buf, index=None, na_rep='None')
>>> print buf.getvalue()
one,three,two
1,None,2.0
1,2.0,None
Upvotes: 7