Xiaowan Wen
Xiaowan Wen

Reputation: 145

How to use IF statements to categorize with multiple conditions with Pandas

I have a categorization problem. The categorizing rule is:

If

  1. Storage Condition == 'refrigerate' and
  2. 100 < profit Per Unit < 150 and
  3. Inventory Qty <20

is given, restock Action = 'Hold Current stock level'

else restock Action = 'On Sale'.

Here's the dataset I need to run the rules on:

ID,      Fruit, Stroage Condition, Profit Per Unit, In Season or Not, Inventory Qty, Restock Action
1,       Apple,  room temperature,              20,              Yes,           200,
2,      Banana,  room temperature,              65,              Yes,            30,
3,        Pear,       refrigerate,              60,              Yes,           180,
4,  Strawberry,       refrigerate,             185,               No,            70,
5,  Watermelon,  room temperature,               8,               No,            90,
6,       Mango,             Other,              20,               No,           100,
7, DragonFruit,             Other,              65,               No,           105,

Code I have tried:

for i in range(len(df['ID'])):
    if df['Storage Condition'][i] == 'refrigerate' and df['Profit Per Unit'][i] >100 and df['Profit Per Unit'][i] <150 and df['Inventory Qty'][i]  <20:
        df['restock action'] = 'Hold Current stock level'

But I got this error message:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Can any one please help?

Upvotes: 0

Views: 1375

Answers (2)

Michael Gardner
Michael Gardner

Reputation: 1813

If you are not concerned with performance there is a convenience method called apply that can be used. It can take your function and apply it to either rows or columns in your dataframe.

It's good to know how it works and the downsides of using it if you plan on learning more about the pandas library.

When should I ever want to use pandas apply() in my code?

def func(df):
    if df['Stroage Condition'] == 'refrigerate' and 100 < df['Profit Per Unit'] < 150 and df['Inventory Qty'] < 20:
        return 'Hold Current stock level'
    else:
        return 'On Sale'

df['Restock Action'] = df.apply(func, axis='columns')

Upvotes: 0

ansev
ansev

Reputation: 30930

Use np.where:

c1=df['Stroage Condition'].eq('refrigerate')
c2=df['Profit Per Unit'].between(100,150)
c3=df['Inventory Qty']<20
df['Restock Action']=np.where(c1&c2&c3,'Hold Current stock level','On Sale')
print(df)

   ID        Fruit Stroage Condition  Profit Per Unit   In Season or Not  Inventory Qty  \
0   1        Apple  room temperature               20                Yes           200   
1   2       Banana  room temperature               65                Yes            30   
2   3         Pear       refrigerate               60                Yes           180   
3   4   Strawberry       refrigerate              185                 No            70   
4   5   Watermelon  room temperature                8                 No            90   
5   6        Mango             Other               20                 No           100   
6   7  DragonFruit             Other               65                 No           105   

  Restock Action  
0        On Sale  
1        On Sale  
2        On Sale  
3        On Sale  
4        On Sale  
5        On Sale  
6        On Sale  

In this case, no row verifies the 3 conditions, so for all rows the result is On Sale

Upvotes: 2

Related Questions