Newbie123
Newbie123

Reputation: 21

Applying IF Condition on a panda dataframe

I have a dataframe with three columns : Subchannel, Campaign and ID

Subchannel     Campaign       ID
PROMO          FLASHSALES     X123
PROMO          PERCENTOFF     Y123

I want to write a python code if the Subchannel starts with "PRO" then a new column should be created in the dataframe EML_LOOKUP with same values as Column ID. Below is the code that I was attempting but it doesnt work. Kindly help.

if  EML_LOOKUP['Subchannel'].str.startswith("PRO"):
   EML_LOOKUP['New_Column'] = EML_LOOKUP['ID']

Upvotes: 1

Views: 108

Answers (3)

Celius Stingher
Celius Stingher

Reputation: 18367

You can use np.where() which works in a vectorized way and is faster than apply():

import numpy as np 
EML_LOOKUP['New_Column'] = np.where(EML_LOOKUP['Subchannel'].str.startswith("PRO"),EML_LOOKUP['ID'],np.nan)

In this example, we're filling with NaN the new column if the Subchannel doesn't start with PRO, otherwise, we're setting the ID value.

Here's a full working example:

EML_LOOKUP = pd.DataFrame({'Subchannel':['PROMO','PROMO','NOT PROMO'],
                   'Campaign':['FALSHSALES','PERCENTOFF','REGULAR'],
                   'ID':['X123','Y123','Z123']})

  Subchannel    Campaign    ID
0      PROMO  FALSHSALES  X123
1      PROMO  PERCENTOFF  Y123
2  NOT PROMO     REGULAR  Z123

After applying the proposed solution:

EML_LOOKUP['New_Column'] = np.where(EML_LOOKUP['Subchannel'].str.startswith("PRO"),EML_LOOKUP['ID'],np.nan)

We get this output:

  Subchannel    Campaign    ID New_Column
0      PROMO  FALSHSALES  X123       X123
1      PROMO  PERCENTOFF  Y123       Y123
2  NOT PROMO     REGULAR  Z123        NaN

In case there are multiple conditions we can use add them separated by parenthesis and joined by the boolean operator & or | depending on the logic we desire to use:

EML_LOOKUP = pd.DataFrame({'Subchannel':['PROMO','PROMO','NOT PROMO'],
                       'Campaign':['FALSHSALES','PERCENTOFF','REGULAR'],
                       'ID':['X123','Y123','Z123'],
                       'Campaign':[10,3,20]})
EML_LOOKUP['New_Column'] = np.where((EML_LOOKUP['Subchannel'].str.startswith("PRO")) & 
                                    (EML_LOOKUP['Campaign'] > 5),
                                    EML_LOOKUP['ID'],np.nan)

Output:

  Subchannel  Campaign    ID New_Column
0      PROMO        10  X123       X123
1      PROMO         3  Y123        NaN
2  NOT PROMO        20  Z123        NaN

Upvotes: 1

Vinod Karantothu
Vinod Karantothu

Reputation: 61

EML_LOOKUP['New_Column'] = EML_LOOKUP.loc[EML_LOOKUP['Subchannel'].str.startswith("PRO")].ID

Upvotes: 0

Tom
Tom

Reputation: 8790

Boolean indexing using your str.startswith:

EML_LOOKUP['New_Column'] = EML_LOOKUP['ID'][EML_LOOKUP['Subchannel'].str.startswith('PRO')]

Upvotes: 0

Related Questions