appleLover
appleLover

Reputation: 15691

Pandas DataFrame to CSV

I want to append pandas data frames to the ends of CSV files. The tricky part is when I append rows, some of the times the columns may be different. I want code like this

a = pd.DataFrame([[1, 2]], columns= ["one", "two"])
with open("learn.csv", "w") as f:
    a.to_csv(f, header=True)

a = pd.DataFrame([[1, 2]], columns= ["one", "three"])
with open("learn.csv", "a") as f:
    a.to_csv(f)

to produce a CSV file that looks like this:

one, two, three
1, 2, None
1, None, 2

Upvotes: 2

Views: 6442

Answers (2)

appleLover
appleLover

Reputation: 15691

Here is the answer I came up with using alko's post and the comment from above. "a" is the dataframe:

if not os.path.isfile("learn.csv"):
    with open("learn.csv", "w") as f:
        a.to_csv(f, header=True, index=False)
else:
    reader = csv.reader(open("learn.csv"))
    csv_col = set(reader.next())
    games_col = set(list(a.columns))
    if csv_col.issuperset(games_col):
        with open("learn.csv", "a") as f:
            a.to_csv(f, header=False, index=False)
    else:
        old_entries = pd.read_csv('learn.csv')
        all_entries = pd.concat([old_entries, a])
        with open("learn.csv", "w") as f:
            all_entries.to_csv(f, header=True, index=False)

Upvotes: 0

alko
alko

Reputation: 48317

You have to concatenate your dataframes prior to saving to csv, as you have to know all the resulting columns to be able to save data properly, which is unknown to each dataframe alone. Following will do:

>>> from StringIO import StringIO
>>> buf = StringIO()
>>> a = pd.DataFrame([[1, 2]], columns= ["one", "two"])
>>> b = pd.DataFrame([[1, 2]], columns= ["one", "three"])
>>> pd.concat([a, b]).to_csv(buf, index=None, na_rep='None')
>>> print buf.getvalue()
one,three,two
1,None,2.0
1,2.0,None

Upvotes: 7

Related Questions