sayan nandi
sayan nandi

Reputation: 83

Checking if some particular string exists in a column of a dataframe or not ; if exists then add a prefix to it

i want a solution where i have to check if values of a dataframe column has specific code and if it is so i will add a predefined prefix to it. to make it easier please look into the folowing example. For say i have a Dataframe like below.

PRODUCT_KEY
EXI-CD_5S-WW5678
EX-PWN-PRO-193
EX-NIS-NS-HZ049
EX-NCI-DSI-A-R
EX-GCA-FAC-F
AU-345654
NL-108793
HK-678903
WW-564312
DE-123243

So for this column what i want to do is ; i want to check if it starts with 'AU' or 'NL' or 'HK' or 'ww' or 'DE' and if so then i will add a prefix 'GST-YIP' and for rest it will be same. the resultant dataframe will look like below.

PRODUCT_KEY
EXI-CD_5S-WW5678
EX-PWN-PRO-193
EX-NIS-NS-HZ049
EX-NCI-DSI-A-R
EX-GCA-FAC-F
GST-YIP-AU-345654
GST-YIP-NL-108793
GST-YIP-HK-678903
GST-YIP-WW-564312
GST-YIP-DE-123243

i am trying to do it within a for loop.

for i in df['PRODUCT_KEY']:
       if i.str.contains('AU'|'HK'|'WW'|'DE'):
          df['PRODUCT_KEY] = 'INC-AAB' + i
       else:
          f['PRODUCT_KEY] =  i

But this is not giving me correct result.

Upvotes: 0

Views: 387

Answers (1)

Rakesh
Rakesh

Reputation: 82785

Using np.where with .str.startswith

Ex:

import numpy as np

df["New"] = np.where(df["PRODUCT_KEY"].str.startswith(('AU', 'NL','HK','WW')), "GST-YIP-"+df["PRODUCT_KEY"], df["PRODUCT_KEY"])
print(df)

Output:

        PRODUCT_KEY                 New
0  EXI-CD_5S-WW5678    EXI-CD_5S-WW5678 
1    EX-PWN-PRO-193      EX-PWN-PRO-193 
2   EX-NIS-NS-HZ049     EX-NIS-NS-HZ049 
3    EX-NCI-DSI-A-R      EX-NCI-DSI-A-R 
4      EX-GCA-FAC-F        EX-GCA-FAC-F 
5         AU-345654   GST-YIP-AU-345654 
6         NL-108793   GST-YIP-NL-108793 
7         HK-678903   GST-YIP-HK-678903 
8         WW-564312   GST-YIP-WW-564312 
9         DE-123243           DE-123243

Using Regex

Ex:

import re
ptrn = re.compile(r"^("+ "|".join(('AU', 'NL','HK','ww')) + r")", flags=re.I)
df["New"] = df["PRODUCT_KEY"].str.replace(ptrn, "GST-YIP-"+ r"\1") 

Upvotes: 1

Related Questions