Reputation: 493
I have two data sets:
df1 = pd.DataFrame ({"skuid" :("A","B","C","D"), "price": (0,0,0,0)})
df2 = pd.DataFrame ({"skuid" :("A","B","C","D"),"salesprice" :(10,0,0,30),"regularprice" : (9,10,0,2)})
I want to fill price in df1 from salesprice and regularprice in df2, with these conditions: if the skuid in df1 matches the skuid in df2 and df2's salesprice is not zero, use salesprice as the price. If the skuids match and df2's salesprice is zero, use regularprice. Otherwise, use zero as the price.
def pric(df1,df2):
    if (df1['skuid'] == df2['skuid'] and salesprice !=0):
        price = salesprice
    elif (df1['skuid'] == df2['skuid'] and regularprice !=0):
        price = regularprice
    else:
        price = 0
I wrote a function with these conditions, but it's not working. The result in df1 should look like this:
skuid price
A 10
B 10
C 0
D 30
Thanks.
Upvotes: 1
Views: 92
Reputation: 364
So there are a number of issues with the function given above. Here are a few in no particular order:
Here is a version of your function that was more or less minimally changed to fix the specific issues above:
import pandas as pd
df1 = pd.DataFrame({"skuid" :("A","B","C","D"), "price": (0,0,0,0)})
df2 = pd.DataFrame({"skuid" :("A","B","C","D"),"salesprice" :(10,0,0,30),"regularprice" : (9,10,0,2)})
def pric(df1, df2, id_colname, df1_price_colname, df2_salesprice_colname, df2_regularprice_colname):
    for i in range(df1.shape[0]):
        for j in range(df2.shape[0]):
            if (df1.loc[df1.index[i],id_colname] == df2.loc[df2.index[j],id_colname] and df2.loc[df2.index[j],df2_salesprice_colname] != 0):
                df1.loc[df1.index[i],df1_price_colname] = df2.loc[df2.index[j],df2_salesprice_colname]
                break
            elif (df1.loc[df1.index[i],id_colname] == df2.loc[df2.index[j],id_colname] and df2.loc[df2.index[j],df2_regularprice_colname] != 0):
                df1.loc[df1.index[i],df1_price_colname] = df2.loc[df2.index[j],df2_regularprice_colname]
                break
    return df1
for which entering
df1_imputed=pric(df1,df2,'skuid','price','salesprice','regularprice')
print(df1_imputed['price'])
gives
0 10
1 10
2 0
3 30
Name: price, dtype: int64
Notice how the function loops through row indices before checking equality conditions on specific elements given by a row-index / column pair.
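The nested loops compare every row of df1 with every row of df2, which gets slow as the frames grow. As a rough sketch of a loop-free alternative (not part of the function above), the same rule can be written with a left merge and `numpy.where`. The name `pric_vectorized` is made up for illustration, and the sketch assumes each skuid appears at most once in df2; it also returns a new frame instead of modifying df1 in place.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"skuid": ("A", "B", "C", "D"), "price": (0, 0, 0, 0)})
df2 = pd.DataFrame({"skuid": ("A", "B", "C", "D"),
                    "salesprice": (10, 0, 0, 30),
                    "regularprice": (9, 10, 0, 2)})

def pric_vectorized(df1, df2):
    # Hypothetical helper; assumes skuid values are unique in df2.
    # A left merge keeps every row of df1 in its original order;
    # skuids missing from df2 come back with NaN prices.
    merged = df1[["skuid"]].merge(df2, on="skuid", how="left")
    sales = merged["salesprice"].fillna(0)
    regular = merged["regularprice"].fillna(0)

    out = df1.copy()
    # Take salesprice where it is non-zero, otherwise regularprice.
    # regularprice is already 0 when neither price applies.
    out["price"] = np.where(sales != 0, sales, regular)
    return out

result = pric_vectorized(df1, df2)
print(result)
```

With the sample data this should give the same prices as the loop version (10, 10, 0, 30). If df2 could contain duplicate skuids, the merge would produce extra rows, so df2 would need to be deduplicated first.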
A few things to consider:
Upvotes: 1