Reputation: 10043
I have a directory with CSV files:
frames/
    df1.csv
    df2.csv
The frames are structured like this:
df1.csv
   artist              track           plays
1  Pearl Jam           Jeremy          456
2  The Rolling Stones  Heart of Stone  546
df2.csv
   artist              track           likes
3  Pearl Jam           Jeremy          5673
9  The Rolling Stones  Heart of Stone  3456
and I would like to merge all frames into one, ending up with:
   artist              track           plays  likes
0  Pearl Jam           Jeremy          456    5673
1  The Rolling Stones  Heart of Stone  546    3456
I've tried:
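import glob
import pandas as pd

path = 'frames'
all_files = glob.glob(path + "/*.csv")
list_ = []
for file_ in all_files:
    df = pd.read_csv(file_, index_col=None, header=0)
    list_.append(df)
frame = pd.concat(list_)
to no avail: the rows are stacked on top of each other instead of being matched on artist and track. What is the best way to approach this?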
Upvotes: 1
Views: 1727
Reputation: 323396
I would simply use your code to create the list of DataFrames:
import glob
import pandas as pd

path = 'frames'
all_files = glob.glob(path + "/*.csv")
l = []
for file_ in all_files:
    df = pd.read_csv(file_, index_col=None, header=0)
    l.append(df)
Then, using functools.reduce, merge the list of DataFrames into one:
import functools
import pandas as pd

# l is the list of DataFrames built above, e.g. [df1, df2, df3....]
merged_df = functools.reduce(lambda left, right: pd.merge(left, right, on=['artist', 'track']), l)
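To make the reduce step easier to try without the CSV files, here is a minimal sketch that rebuilds the two sample frames from the question in memory (the variable name frames is just for illustration) and merges them pairwise on artist and track:

```python
import functools

import pandas as pd

# In-memory stand-ins for df1.csv and df2.csv from the question.
df1 = pd.DataFrame(
    {
        "artist": ["Pearl Jam", "The Rolling Stones"],
        "track": ["Jeremy", "Heart of Stone"],
        "plays": [456, 546],
    },
    index=[1, 2],
)
df2 = pd.DataFrame(
    {
        "artist": ["Pearl Jam", "The Rolling Stones"],
        "track": ["Jeremy", "Heart of Stone"],
        "likes": [5673, 3456],
    },
    index=[3, 9],
)

frames = [df1, df2]

# reduce applies pd.merge to the running result and the next frame,
# so any number of frames sharing the key columns can be combined.
merged_df = functools.reduce(
    lambda left, right: pd.merge(left, right, on=["artist", "track"]),
    frames,
)
print(merged_df)
```

pd.merge defaults to an inner join, so a track that is missing from any one file would be dropped from the result; passing how='outer' to pd.merge should keep such rows if that is what you want.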
Upvotes: 2
Reputation: 2246
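DataFrame.join is useful here. It's analogous to a SQL join. Note that join matches the "on" columns of the calling frame against the index of the other frame, so the other frame needs the key columns as its index. Something like:
df1.join(df2.set_index(['artist', 'track']), on=['artist', 'track'])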
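As a rough sketch of the join approach on the question's sample data: DataFrame.join aligns on the index, so one option is to index both frames by the key columns first and reset the index afterwards. The frames below are in-memory stand-ins for the CSV files, and the names keys and joined are only illustrative.

```python
import pandas as pd

# In-memory stand-ins for df1.csv and df2.csv from the question.
df1 = pd.DataFrame(
    {
        "artist": ["Pearl Jam", "The Rolling Stones"],
        "track": ["Jeremy", "Heart of Stone"],
        "plays": [456, 546],
    }
)
df2 = pd.DataFrame(
    {
        "artist": ["Pearl Jam", "The Rolling Stones"],
        "track": ["Jeremy", "Heart of Stone"],
        "likes": [5673, 3456],
    }
)

keys = ["artist", "track"]

# Move the key columns into the index on both sides, join on that index
# (a left join by default), then turn the keys back into ordinary columns.
joined = df1.set_index(keys).join(df2.set_index(keys)).reset_index()
print(joined)
```

Because join is a left join by default, rows of df1 with no match in df2 are kept with NaN in the likes column.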
Upvotes: 0