Reputation: 97
I have the following data frames with df1 been my master
df1 = pd.DataFrame([['a', '10'], ['b', '20']],
columns=['letter', 'Jan'])
df2 = pd.DataFrame([['a', '1'], ['b', '2'], ['c', '2']],
columns=['letter', 'feb'])
df3 = pd.DataFrame([['a', '10'], ['b', '20'], ['c', '20'], ['d', '20']],
columns=['letter', 'march'])
how can I add df2 then df3 to df1 so it looks like the following.
Letter jan feb march
0 a 10 1 10
1 b 20 2 20
2 c 2 20
3 d 20
res = pd.concat([df1, df2,df3])
I tired this and the letter keeps getting added in which is not what i want
letter Jan feb march
0 a 10 NaN NaN
1 b 20 NaN NaN
0 a NaN 1 NaN
1 b NaN 2 NaN
2 c NaN 2 NaN
0 a NaN NaN 10
1 b NaN NaN 20
2 c NaN NaN 20
3 d NaN NaN 20
Any idea how i can solve this problem ? Thanks
Upvotes: 1
Views: 512
Reputation: 71689
Use DataFrame.merge
to outer merge the dataframes df1
, df2
and df3
on letter
then use DataFrame.fillna
to fill the NaN
values:
df = (
df1.merge(df2, on='letter', how='outer')
.merge(df3, on='letter', how='outer').fillna('')
)
OR, If having large number of dataframes which requires merging we can use more concise solution using reduce
+ pd.merge
and pass the list of dataframes which require merging to the reduce
function:
from functools import reduce
df = reduce(lambda x, y:
pd.merge(x, y, on='letter', how='outer'), [df1, df2, df3]).fillna('')
Result:
# print(df)
letter Jan feb march
0 a 10 1 10
1 b 20 2 20
2 c 2 20
3 d 20
Upvotes: 1
Reputation: 178
Here's the solution of your problem.
df_outer = pd.merge(df1, df2, on='letter', how='outer')
df_outer = pd.merge(df_outer, df3, on='letter', how='outer')
df_outer = df_outer.fillna('')
df_outer
Upvotes: 2