Reputation: 4174
I have two dataframes that I want to sum along the y
axis, conditionally.
For example:
df_1
a b value 1 1 1011 1 2 1012 2 1 1021 2 2 1022
df_2
a b value 9 9 99 1 2 12 2 1 21
I want to make df_1['value'] -= df_2['value'] if df_1[a] == df_2[a] & df_1[b] == df_2[b]
, so the output would be:
OUTPUT
a b value 1 1 1011 1 2 1000 2 1 1000 2 2 1022
Is there a way to achieve that instead of iterating the whole dataframe? (It's pretty big)
Upvotes: 0
Views: 2559
Reputation: 858
You could also perform a left join and subtract matching values. Here is how to do that:
(pd.merge(df_1, df_2, how='left', on=['a', 'b'], suffixes=('_1', '_2'))
.fillna(0)
.assign(value=lambda x: x.value_1 - x.value_2)
)[['a', 'b', 'value']]
Upvotes: 3
Reputation: 18201
You could let
merged = pd.merge(df_1, df_2, on=['a', 'b'], left_index=True)
df_1.value[merged.index] = merged.value_x - merged.value_y
Result:
In [37]: df_1
Out[37]:
a b value
0 1 1 1011
1 1 2 1000
2 2 1 1000
3 2 2 1022
Upvotes: 1
Reputation: 51165
Make use of index alignment that pandas
provides here, by setting a
and b
as your index before subtracting.
for df in [df1, df2]:
df.set_index(['a', 'b'], inplace=True)
df1.sub(df2, fill_value=0).reindex(df1.index)
value
a b
1 1 1011.0
2 1000.0
2 1 1000.0
2 1022.0
Upvotes: 5