Reputation: 21
I have a dataframe with three columns : Subchannel
, Campaign
and ID
Subchannel Campaign ID
PROMO FLASHSALES X123
PROMO PERCENTOFF Y123
I want to write a python code if the Subchannel starts with "PRO"
then a new column should be created in the dataframe EML_LOOKUP
with same values as Column ID
. Below is the code that I was attempting but it doesnt work. Kindly help.
if EML_LOOKUP['Subchannel'].str.startswith("PRO"):
EML_LOOKUP['New_Column'] = EML_LOOKUP['ID']
Upvotes: 1
Views: 108
Reputation: 18367
You can use np.where()
which works in a vectorized way and is faster than apply()
:
import numpy as np
EML_LOOKUP['New_Column'] = np.where(EML_LOOKUP['Subchannel'].str.startswith("PRO"),EML_LOOKUP['ID'],np.nan)
In this example, we're filling with NaN
the new column if the Subchannel doesn't start with PRO, otherwise, we're setting the ID value.
Here's a full working example:
EML_LOOKUP = pd.DataFrame({'Subchannel':['PROMO','PROMO','NOT PROMO'],
'Campaign':['FALSHSALES','PERCENTOFF','REGULAR'],
'ID':['X123','Y123','Z123']})
Subchannel Campaign ID
0 PROMO FALSHSALES X123
1 PROMO PERCENTOFF Y123
2 NOT PROMO REGULAR Z123
After applying the proposed solution:
EML_LOOKUP['New_Column'] = np.where(EML_LOOKUP['Subchannel'].str.startswith("PRO"),EML_LOOKUP['ID'],np.nan)
We get this output:
Subchannel Campaign ID New_Column
0 PROMO FALSHSALES X123 X123
1 PROMO PERCENTOFF Y123 Y123
2 NOT PROMO REGULAR Z123 NaN
In case there are multiple conditions we can use add them separated by parenthesis and joined by the boolean operator &
or |
depending on the logic we desire to use:
EML_LOOKUP = pd.DataFrame({'Subchannel':['PROMO','PROMO','NOT PROMO'],
'Campaign':['FALSHSALES','PERCENTOFF','REGULAR'],
'ID':['X123','Y123','Z123'],
'Campaign':[10,3,20]})
EML_LOOKUP['New_Column'] = np.where((EML_LOOKUP['Subchannel'].str.startswith("PRO")) &
(EML_LOOKUP['Campaign'] > 5),
EML_LOOKUP['ID'],np.nan)
Output:
Subchannel Campaign ID New_Column
0 PROMO 10 X123 X123
1 PROMO 3 Y123 NaN
2 NOT PROMO 20 Z123 NaN
Upvotes: 1
Reputation: 61
EML_LOOKUP['New_Column'] = EML_LOOKUP.loc[EML_LOOKUP['Subchannel'].str.startswith("PRO")].ID
Upvotes: 0
Reputation: 8790
Boolean indexing using your str.startswith
:
EML_LOOKUP['New_Column'] = EML_LOOKUP['ID'][EML_LOOKUP['Subchannel'].str.startswith('PRO')]
Upvotes: 0