Reputation: 5234
The pandas DataFrame I currently have contains two columns: "EVENT" and "NAME".
EVENT NAME
A DEN
B HAU
C TOT
D ABC
E DEN
I want the EVENT column to end up as follows:
EVENT
A_DEN
B
C
D
E_DEN
The rule is: if the "NAME" column contains the value DEN, append it (joined with an underscore) to the value in the "EVENT" column. Otherwise, leave the "EVENT" value as is.
I have scoured the internet on how to do this but wasn't able to find anything specific to what I'm trying to accomplish.
Upvotes: 2
Views: 5393
Reputation: 403278
Option 1
You could do this with str.contains (substring match) or eq (exact match) to perform the "contains" check, and np.where to conditionally build your result:
df.EVENT = np.where(df.NAME.str.contains('DEN'), df.EVENT + '_' + df.NAME, df.EVENT)
Or,
df.EVENT = np.where(df.NAME.eq('DEN'), df.EVENT + '_' + df.NAME, df.EVENT)
df
EVENT NAME
0 A_DEN DEN
1 B HAU
2 C TOT
3 D ABC
4 E_DEN DEN
Don't forget to import numpy as np.
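For completeness, here is a minimal self-contained sketch of Option 1, with the frame rebuilt from the sample data in the question. The extra value "DENVER" in the second part is made up for illustration only; it shows why the choice between str.contains and eq can matter.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "EVENT": ["A", "B", "C", "D", "E"],
    "NAME": ["DEN", "HAU", "TOT", "ABC", "DEN"],
})

# Exact match: only rows whose NAME is exactly "DEN" are changed.
is_den = df["NAME"].eq("DEN")
df["EVENT"] = np.where(is_den, df["EVENT"] + "_" + df["NAME"], df["EVENT"])
print(df["EVENT"].tolist())  # ['A_DEN', 'B', 'C', 'D', 'E_DEN']

# Hypothetical names (not in the question) to compare the two checks.
names = pd.Series(["DEN", "DENVER", "HAU"])
substring_hits = names.str.contains("DEN").tolist()  # [True, True, False]
exact_hits = names.eq("DEN").tolist()                # [True, False, False]
```

If NAME can only ever be exactly "DEN" or something unrelated, both checks behave the same. If longer values that merely include "DEN" are possible, eq is probably the safer choice.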
Option 2
Another method is using pd.Series.mask/pd.Series.where:
df.EVENT = df.EVENT.mask(df.NAME.str.contains('DEN'), df.EVENT + '_' + df.NAME)
Or,
df.EVENT = df.EVENT.where(~df.NAME.str.contains('DEN'), df.EVENT + '_' + df.NAME)
df
EVENT NAME
0 A_DEN DEN
1 B HAU
2 C TOT
3 D ABC
4 E_DEN DEN
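One thing to watch with the str.contains variants: if NAME can hold missing values, the condition itself will contain missing entries. The question's data has none, so the None below is purely an assumption for illustration. A rough sketch of how the na and regex parameters of str.contains could be used with mask in that case:

```python
import pandas as pd

df = pd.DataFrame({
    "EVENT": ["A", "B", "C"],
    "NAME": ["DEN", None, "HAU"],  # None is a made-up missing value
})

# na=False treats a missing NAME as "no match";
# regex=False matches the text "DEN" literally instead of as a pattern.
m = df["NAME"].str.contains("DEN", na=False, regex=False)

# mask replaces EVENT only where m is True; other rows are kept unchanged.
df["EVENT"] = df["EVENT"].mask(m, df["EVENT"] + "_" + df["NAME"])
print(df["EVENT"].tolist())
```

Rows with a missing NAME keep their original EVENT, because the mask is False there.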
Option 3
Yet another option is using loc-based indexing with a mask:
m = df.NAME.str.contains('DEN')
df.loc[m, 'EVENT'] += ('_' + df.loc[m, 'NAME'])
df
EVENT NAME
0 A_DEN DEN
1 B HAU
2 C TOT
3 D ABC
4 E_DEN DEN
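If you need this in more than one place, the loc approach could be wrapped in a small helper. The function name tag_events and its value parameter are hypothetical, and this sketch assumes an exact match on NAME is what you want:

```python
import pandas as pd

def tag_events(df, value="DEN"):
    """Return a copy of df with '_<NAME>' appended to EVENT where NAME == value."""
    out = df.copy()  # leave the caller's frame untouched
    m = out["NAME"].eq(value)
    # Only the matching rows are read and written.
    out.loc[m, "EVENT"] = out.loc[m, "EVENT"] + "_" + out.loc[m, "NAME"]
    return out

# Sample data copied from the question.
df = pd.DataFrame({
    "EVENT": ["A", "B", "C", "D", "E"],
    "NAME": ["DEN", "HAU", "TOT", "ABC", "DEN"],
})
result = tag_events(df)
print(result["EVENT"].tolist())  # ['A_DEN', 'B', 'C', 'D', 'E_DEN']
```

Returning a copy rather than modifying df in place is a design choice here; drop the copy() call if you prefer the in-place behaviour of the snippet above.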
Upvotes: 11