Reputation: 1470
I'm new to pandas and I'm trying to figure this scenario out: I have a sample DataFrame with two products. df =
Product_Num Date Description Price
10 1-1-18 Fruit Snacks 2.99
10 1-2-18 Fruit Snacks 2.99
10 1-5-18 Fruit Snacks 1.99
10 1-8-18 Fruit Snacks 1.99
10 1-10-18 Fruit Snacks 2.99
45 1-1-18 Apples 2.99
45 1-3-18 Apples 2.99
45 1-5-18 Apples 2.99
45 1-9-18 Apples 1.49
45 1-10-18 Apples 1.49
45 1-13-18 Apples 1.49
45 1-15-18 Apples 2.99
I also have another small DataFrame that looks like this (which shows promotional prices of the same products): df2=
Product_Num Price
10 1.99
45 1.49
Notice that df2 does not contain columns 'Date' nor 'Description.' What I want to do is to remove all promo prices from df1 (for all dates that are on promo), using the data from df1. What is the best way to do this?
So, I want to see this:
Product_Num Date Description Price
10 1-1-18 Fruit Snacks 2.99
10 1-2-18 Fruit Snacks 2.99
10 1-10-18 Fruit Snacks 2.99
45 1-1-18 Apples 2.99
45 1-3-18 Apples 2.99
45 1-5-18 Apples 2.99
45 1-15-18 Apples 2.99
I was thinking of doing a merge on columns Price and Product_Num, then seeing what I can do from there. But I was getting confused because of the multiple dates.
Upvotes: 6
Views: 18775
Reputation: 1
cute and readable
promo_prices = df2['Price']
promo_prods = df2['Product_Num']
no_pro = df
for price, prod in zip(promo_prices, promo_prods):
no_pro = no_pro.where(df != (price or prod)).dropna()
Upvotes: 0
Reputation: 9264
You could turn df2
into a dictionary and then filter out the values in df1
df[df[df2.columns].isin(df2.to_dict('list')).sum(1) <= 1]
Yeilds
Date Description Price Product_Num
0 1-1-18 Fruit Snacks 2.99 10
1 1-2-18 Fruit Snacks 2.99 10
4 1-10-18 Fruit Snacks 2.99 10
5 1-1-18 Apples 2.99 45
6 1-3-18 Apples 2.99 45
7 1-5-18 Apples 2.99 45
11 1-15-18 Apples 2.99 45
Upvotes: 1
Reputation: 323226
isin
with &
df.loc[~((df.Product_Num.isin(df2['Product_Num']))&(df.Price.isin(df2['Price']))),:]
Out[246]:
Product_Num Date Description Price
0 10 1-1-18 FruitSnacks 2.99
1 10 1-2-18 FruitSnacks 2.99
4 10 1-10-18 FruitSnacks 2.99
5 45 1-1-18 Apples 2.99
6 45 1-3-18 Apples 2.99
7 45 1-5-18 Apples 2.99
11 45 1-15-18 Apples 2.99
Update
df.loc[~df.index.isin(df.merge(df2.assign(a='key'),how='left').dropna().index)]
Out[260]:
Product_Num Date Description Price
0 10 1-1-18 FruitSnacks 2.99
1 10 1-2-18 FruitSnacks 2.99
4 10 1-10-18 FruitSnacks 2.99
5 45 1-1-18 Apples 2.99
6 45 1-3-18 Apples 2.99
7 45 1-5-18 Apples 2.99
11 45 1-15-18 Apples 2.99
Upvotes: 9
Reputation: 1368
With Product_Num
as index for both Dataframe, you can drop index from df1
for df2
, then concatenate the dataframes :
import pandas as pd
df1 = pd.DataFrame({'Product_Num':[1,2,3,4], 'Date': ['01/01/2012','01/02/2013','02/03/2013','04/02/2013'], 'Price': [10,10,10,10]})
df1 = df1.set_index('Product_Num')
df2 = pd.DataFrame({'Product_Num':[2], 'Date':['03/3/2012'], 'Price': [5]})
df2 = df2.set_index('Product_Num')
Drop and concatenate:
df_new = df1.drop(df2.index)
df_new = pd.concat([df_new, df2])
Result:
Date Price
Product_Num
1 01/01/2012 10
3 02/03/2013 10
4 04/02/2013 10
2 03/3/2012 5
Upvotes: 2