Reputation: 553
I need to merge a number of dataframes using pandas with the following properties:
name
, type
, and value
.name
and type
, in order to identify matchesvalue
in my example above, should result in one column after the merge. If both datasets had a row with the same name
and type
, then the result of value
should be the sum of the values of each dataset.For example, If I had the DataFrame df1
:
df1:
name type value
0 foo A 11
1 bar B 12
and the DataFrame df2
:
df2:
name type value
0 foo A 21
1 bar C 22
2 baz C 23
Then I want the result to be
result_want:
name type value
0 foo A 32
1 bar B 12
2 bar C 22
3 baz C 23
the best I found so far is how to do the merge, like this:
import pandas as pd
df1 = pd.DataFrame({
"name": ["foo", "bar"],
"type": ["A", "B"],
"value": [11, 12]
})
df2 = pd.DataFrame({
"name": ["foo", "bar", "baz"],
"type": ["A", "C", "C"],
"value": [21, 22, 23]
})
result = pd.merge(df1, df2, how='outer', on=['name', 'type'])
print(result)
gives
name type value_x value_y
0 foo A 11.0 21.0
1 bar B 12.0 NaN
2 bar C NaN 22.0
3 baz C NaN 23.0
Upvotes: 1
Views: 1472
Reputation: 20689
You can use pd.concat
then GroupBy.sum
here
out = pd.concat([df1, df2])
out.groupby(['name', 'type'], as_index=False).sum()
name type value
0 bar B 12
1 bar C 22
2 baz C 23
3 foo A 32
Upvotes: 1