Reputation: 7644
i have a table in pandas df.
id prod1 prod2 count
1 10 30 100
2 10 20 200
3 20 10 200
4 30 10 100
5 30 40 300
also i am having another table in df2
product price master_product
1000 1 10
5000 2 10
2000 2 20
9000 5 20
8000 1 20
30 3 0
4000 4 50
Check if prod1 and prod2 belongs to the values in master_product,
if yes i want to replace prod1 and prod2 in my first df with the cheapest product in my master_product.
if the prod1 and prod2 values donot match with the values in master_product, leave the values as it is.
i am looking for the final table as.
id prod1 prod2 count
1 1000 4000 100
2 1000 8000 200
3 8000 1000 200
4 30 1000 100 #since 30 is not in master_product,leave as it is
5 30 40 300
i was trying to use .map function to achieve this but i could only reach to this.
df['prod1'] = df['prod1'].map(df2.set_index('master_product')['product'])
df['prod2'] = df['prod2'].map(df2.set_index('master_product')['product'])
but it will try to replace every values in prod1 and prod2 with matching values in master_product from df2.
Any ideas how to achieve this?
Upvotes: 2
Views: 330
Reputation: 863301
You can first modify df1
for minimal price
by master_product
by groupby
with idxmin
- get all indices with minimal price
:
df1 = df1.loc[df1.groupby('master_product')['price'].idxmin()]
print (df1)
product price master_product
5 30 3 0
0 1000 1 10
4 8000 1 20
6 4000 4 50
Create dict
for mapping:
d = df1.set_index('master_product')['product'].to_dict()
print (d)
{0: 30, 10: 1000, 20: 8000, 50: 4000}
Last map
and if value is missing add it by combine_first
:
df.prod1 = df.prod1.map(d).combine_first(df.prod1)
df.prod2 = df.prod2.map(d).combine_first(df.prod2)
print (df)
id prod1 prod2 count
0 1 1000.0 30.0 100
1 2 1000.0 8000.0 200
2 3 8000.0 1000.0 200
3 4 30.0 1000.0 100
4 5 30.0 40.0 300
Upvotes: 2