Reputation: 83
i want a solution where i have to check if values of a dataframe column has specific code and if it is so i will add a predefined prefix to it. to make it easier please look into the folowing example. For say i have a Dataframe like below.
PRODUCT_KEY
EXI-CD_5S-WW5678
EX-PWN-PRO-193
EX-NIS-NS-HZ049
EX-NCI-DSI-A-R
EX-GCA-FAC-F
AU-345654
NL-108793
HK-678903
WW-564312
DE-123243
So for this column what i want to do is ; i want to check if it starts with 'AU' or 'NL' or 'HK' or 'ww' or 'DE' and if so then i will add a prefix 'GST-YIP' and for rest it will be same. the resultant dataframe will look like below.
PRODUCT_KEY
EXI-CD_5S-WW5678
EX-PWN-PRO-193
EX-NIS-NS-HZ049
EX-NCI-DSI-A-R
EX-GCA-FAC-F
GST-YIP-AU-345654
GST-YIP-NL-108793
GST-YIP-HK-678903
GST-YIP-WW-564312
GST-YIP-DE-123243
i am trying to do it within a for loop.
for i in df['PRODUCT_KEY']:
if i.str.contains('AU'|'HK'|'WW'|'DE'):
df['PRODUCT_KEY] = 'INC-AAB' + i
else:
f['PRODUCT_KEY] = i
But this is not giving me correct result.
Upvotes: 0
Views: 387
Reputation: 82785
Using np.where
with .str.startswith
Ex:
import numpy as np
df["New"] = np.where(df["PRODUCT_KEY"].str.startswith(('AU', 'NL','HK','WW')), "GST-YIP-"+df["PRODUCT_KEY"], df["PRODUCT_KEY"])
print(df)
Output:
PRODUCT_KEY New
0 EXI-CD_5S-WW5678 EXI-CD_5S-WW5678
1 EX-PWN-PRO-193 EX-PWN-PRO-193
2 EX-NIS-NS-HZ049 EX-NIS-NS-HZ049
3 EX-NCI-DSI-A-R EX-NCI-DSI-A-R
4 EX-GCA-FAC-F EX-GCA-FAC-F
5 AU-345654 GST-YIP-AU-345654
6 NL-108793 GST-YIP-NL-108793
7 HK-678903 GST-YIP-HK-678903
8 WW-564312 GST-YIP-WW-564312
9 DE-123243 DE-123243
Using Regex
Ex:
import re
ptrn = re.compile(r"^("+ "|".join(('AU', 'NL','HK','ww')) + r")", flags=re.I)
df["New"] = df["PRODUCT_KEY"].str.replace(ptrn, "GST-YIP-"+ r"\1")
Upvotes: 1