Reputation: 25
This is somewhat basic, but I couldn't find a simple answer. In Python have a dataframe A like this:
ItemId Price
------- -------
0 a1 10.0
1 a1 15.0
2 a2 8.0
3 a3 7.0
And a second one, B, like this, where item ids appear only once, they are index:
ItemId Discount
------ ---------
a1 0.2
a2 0.5
a4 0.3
I want to subtract 'Discount' values (from B) from 'Price' of dataframe A, by matching Item Ids, like this:
ItemId Price
------- -----
a1 9.8
a1 14.8
a2 7.5
a3 7.0
How can I do this in a efficient way, taking into account that actual dataframes have thousands of rows and many other columns?
Upvotes: 0
Views: 1397
Reputation: 41337
reindex()
the discounts using the price df with fill_value=0
:
A.set_index('ItemId').Price - B.Discount.reindex(A.ItemId, fill_value=0)
# ItemId
# a1 9.8
# a1 14.8
# a2 7.5
# a3 7.0
# dtype: float64
Timings of the current answers:
map_ = lambda A, B: A.Price - A.ItemId.map(B.Discount).fillna(0)
reindex_ = lambda A, B: A.set_index('ItemId').Price - B.Discount.reindex(A.ItemId, fill_value=0)
merge_ = lambda A, B: A.merge(B, on='ItemId', how='left').eval('Price - Discount.fillna(0)')
Upvotes: 2
Reputation: 13417
You can use a merge
to align the frames on the "ItemId" column and eval
to operate on those aligned columns:
>>> df1.merge(df2, on="ItemId", how="left").eval("Price - Discount.fillna(0)")
0 9.8
1 14.8
2 7.5
dtype: float64
Upvotes: 0
Reputation: 23146
You can just use map
:
>>> dfA["Price"]-dfA["ItemId"].map(dfB["Discount"])
0 9.8
1 14.8
2 7.5
dtype: float64
Upvotes: 1