Reputation: 536
I have a list of CSV files that I want to load into one large DataFrame so I can run simple queries on it, such as computing the mean of a column. I have this bit of code:
import glob
import os
import pandas as pd

def read_files():
    path = 'data'
    # Collect every CSV file in the data directory
    files = glob.glob(os.path.join(path, "*.csv"))
    df_list = [pd.read_csv(file) for file in files]
    df = pd.concat(df_list)
    print(df.to_string())
    return df
But this seems to put all of my data into a single column. When I try to access a column with df['x'], I get a KeyError. How can I keep my CSV structure when concatenating? All files should have the same columns, and if a file's columns differ I don't want to read it, if that matters.
Upvotes: 0
Views: 65
Reputation: 3939
Change the line
df = pd.concat(df_list)
to
df = df_list[0]
for df_tmp in df_list[1:]:
    df = df_tmp.combine_first(df)
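One caveat: combine_first aligns rows on the index, so files that each carry the default 0..n index will overlap instead of stacking. Also, data that lands in a single column often means the files use a delimiter other than a comma, though that is only a guess since the question does not show the file contents. Below is a minimal sketch under that assumption. It lets pandas sniff the delimiter with sep=None and the Python engine, skips files whose columns differ from the first file read, and stacks the rest with pd.concat. The parameter names path and expected_columns are hypothetical, added for illustration.

```python
import glob
import os

import pandas as pd


def read_files(path="data", expected_columns=None):
    """Read every CSV in `path` and stack the rows into one DataFrame.

    `expected_columns` is a hypothetical parameter: when omitted, the
    columns of the first file read are used as the reference.
    """
    frames = []
    for file in sorted(glob.glob(os.path.join(path, "*.csv"))):
        # sep=None with the python engine asks pandas to detect the
        # delimiter (assumption: the files may not be comma-separated).
        frame = pd.read_csv(file, sep=None, engine="python")
        if expected_columns is None:
            expected_columns = list(frame.columns)
        if list(frame.columns) != list(expected_columns):
            continue  # skip files whose columns do not match
        frames.append(frame)
    if not frames:
        return pd.DataFrame(columns=expected_columns)
    # ignore_index=True renumbers rows so the files stack cleanly
    return pd.concat(frames, ignore_index=True)
```

With this, df = read_files() followed by df['x'].mean() should work as long as 'x' is a column in the reference file. If the delimiter is known, passing it explicitly (for example sep=';') is more predictable than sniffing.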
Upvotes: 1