Reputation: 13
I am trying to merge multiple DataFrames on same DocID then sum up the weights but when I do merge it creates Weight_x,Weight_y. This would be fine for only two DataFrames but the amount of Dataframes to merge changes based on user input so merging creates Weight_x, Weight_y multiple times. So how can I merge more than 2 DataFrames such that they are merging on DocID and Weight is Summed?
Example:
df1= DocID Weight
1 4
2 7
3 8
df2= DocID Weight
1 5
2 9
8 1
finalDf=
DocID Weight
1 9
2 16
Upvotes: 1
Views: 330
Reputation: 19545
You can merge, set the 'DocID' column as the index, then sum the remaining columns together. Then you can reindex and rename the columns in the resulting final_df as needed:
df_final = pd.merge(df1, df2, on=['DocID']).set_index(['DocID']).sum(axis=1)
df_final = pd.DataFrame({"DocID": df_final.index, "Weight":df_final}).reset_index(drop=True)
Output:
>>> df_final
DocID Weight
0 1 9
1 2 16
Upvotes: 1
Reputation: 28649
df1.set_index('DocID').add(df2.set_index('DocID')).dropna()
Weight
DocID
1 9.0
2 16.0
Upvotes: 1
Reputation: 1226
Can you try this pd.merge(df1, df2, on=['DocID']).set_index(['DocID']).sum(axis=1)
You can now give any name to the sum column.
Upvotes: 0