How can I concat and still keep columns?

Question

I have a list of files that I want to combine into one large DataFrame so I can run simple queries on it, such as computing the mean of a column. I have this bit of code:

import glob
import os

import pandas as pd

def read_files():
    path = 'data'
    # collect every CSV file in the data directory
    files = glob.glob(os.path.join(path, "*.csv"))
    df_list = [pd.read_csv(file) for file in files]
    df = pd.concat(df_list)
    print(df.to_string())
    return df

But this seems to put all of my data in one column. When I try to access a column using df['x'], I get a KeyError. How can I keep my CSV structure when concatenating? All files should have the same columns, and any file whose columns differ should not be read at all, if that matters.

above_c_level · Accepted Answer

Change the line

df = pd.concat(df_list)

to

df = df_list[0]
for df_tmp in df_list[1:]:
    df = df_tmp.combine_first(df)
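
One caveat with the loop above: `combine_first` aligns on index labels. Files read with the default 0..n-1 index therefore share row labels, so their rows are merged rather than stacked. If the goal is to append rows and skip files whose columns differ, something along these lines may be closer to what the question asks for. This is only a sketch, not a tested drop-in. The `path` and `sep` parameters and the "use the first file's columns as the reference" rule are my assumptions, not something from the question. If every file parses into a single column, the delimiter is probably not a comma, and passing the right `sep` would be the first thing to try.

```python
import glob
import os

import pandas as pd


def read_files(path="data", sep=","):
    """Concatenate every CSV in `path` whose columns match the first file read.

    `path` and `sep` are illustrative parameters (assumptions); adjust them
    to the actual directory and delimiter of the files.
    """
    # sorted() makes the choice of reference file deterministic
    files = sorted(glob.glob(os.path.join(path, "*.csv")))
    frames = []
    expected = None  # column names of the first file, used as the reference

    for file in files:
        df = pd.read_csv(file, sep=sep)
        if expected is None:
            expected = list(df.columns)
        if list(df.columns) != expected:
            continue  # skip files with a different column layout
        frames.append(df)

    if not frames:
        return pd.DataFrame()  # no files found

    # ignore_index=True renumbers the rows instead of repeating each file's index
    return pd.concat(frames, ignore_index=True)
```

After that, `df = read_files()` followed by `df['x'].mean()` should work as long as `x` is a column in the reference file.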
