oss

Reputation: 11

Combining multiple CSV files into a single file with a DataFrame

I have generated several CSV files through an API. Now I am trying to combine all of them into a single master file so that I can work on it, but it does not work. The code below is what I have attempted. What am I doing wrong?

import glob
import pandas as pd
from pandas import read_csv

master_df = pd.DataFrame()

for file in files:
    df = read_csv(file)
    master_df = pd.concat([master_df, df])
    del df

master_df.to_csv("./master_df.csv", index=False)

Upvotes: 0

Views: 495

Answers (1)

sebbit

Reputation: 121

It is hard to pin down the precise problem without more information (an error message, your pandas version), but I believe it is that, in the first iteration, master_df and df do not have the same columns: master_df is an empty DataFrame, whereas df has whatever columns are in your CSV. If that is indeed the problem, I'd suggest storing all your DataFrames (one per CSV file) in a single list and then concatenating them in one call, like so:

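import pandas as pd

# `files` must be a list of CSV paths; the question's code does not define it
df_list = [pd.read_csv(file) for file in files]

# sort=False: do not sort the column axis when the files' columns differ
pd.concat(df_list, sort=False).to_csv("./master_df.csv", index=False)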

I haven't had time to find or generate a set of CSV files and test this, but I'm fairly sure it should do the job (assuming pandas 0.23 or a compatible version).
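Note that the question's code imports glob but never defines `files`, which may be part of the problem. Here is a rough, self-contained sketch of the list-then-concat approach that builds the file list with `glob.glob`. The function name `combine_csvs`, the `*.csv` pattern, and the paths in the usage line are assumptions for illustration, not something taken from the question.

```python
import glob
import os

import pandas as pd


def combine_csvs(pattern, output_path):
    """Concatenate every CSV matching `pattern` and write it to `output_path`.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    # Sort for a deterministic order, and skip the output file itself in case
    # it matches the pattern (e.g. on a second run in the same directory).
    out_abs = os.path.abspath(output_path)
    files = [f for f in sorted(glob.glob(pattern)) if os.path.abspath(f) != out_abs]
    if not files:
        raise FileNotFoundError(f"no CSV files match {pattern!r}")

    # Read each file once, then concatenate all frames in a single call.
    df_list = [pd.read_csv(f) for f in files]
    master_df = pd.concat(df_list, ignore_index=True, sort=False)

    master_df.to_csv(output_path, index=False)
    return master_df
```

Usage would look something like `combine_csvs("./*.csv", "./master_df.csv")`, assuming the CSVs sit in the current directory. Raising when no file matches makes an empty or wrong pattern fail loudly rather than pass unnoticed.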

Upvotes: 1
