PineNuts0

Reputation: 5234

Conditional Concatenation Based on String Value in Column

The pandas DataFrame I currently have contains two columns: "EVENT" and "NAME".

EVENT   NAME
A       DEN
B       HAU
C       TOT
D       ABC
E       DEN

I want the EVENT column to end up as follows:

EVENT
A_DEN
B
C
D
E_DEN

In other words: if the "NAME" column contains the value DEN, append it (joined with an underscore) to the value in the "EVENT" column. Otherwise, leave the "EVENT" value as is.

I have scoured the internet on how to do this but wasn't able to find anything specific to what I'm trying to accomplish.

Upvotes: 2

Views: 5393

Answers (1)

cs95

Reputation: 403278

Option 1
You could do this with str.contains (substring match) or eq (exact match) to perform the check, and np.where to conditionally build your result:

df.EVENT = np.where(df.NAME.str.contains('DEN'), df.EVENT + '_' + df.NAME, df.EVENT)

Or, if NAME must equal 'DEN' exactly,

df.EVENT = np.where(df.NAME.eq('DEN'), df.EVENT + '_' + df.NAME, df.EVENT)

df

   EVENT NAME
0  A_DEN  DEN
1      B  HAU
2      C  TOT
3      D  ABC
4  E_DEN  DEN

Don't forget to import numpy as np.
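
For reference, here is a minimal self-contained sketch of Option 1. It rebuilds the sample frame from the question and adds one hypothetical row ("F", "DENVER"), which is not in the original data, purely to show where str.contains and eq disagree.

```python
import numpy as np
import pandas as pd

# Sample frame from the question, plus one hypothetical row ("F", "DENVER")
# added only to show where the two checks give different results.
df = pd.DataFrame({
    "EVENT": ["A", "B", "C", "D", "E", "F"],
    "NAME": ["DEN", "HAU", "TOT", "ABC", "DEN", "DENVER"],
})

# Candidate value for every row; np.where picks it only where the check is True.
combined = df["EVENT"] + "_" + df["NAME"]

# Substring match: "DENVER" contains "DEN", so row F is also changed.
by_contains = np.where(df["NAME"].str.contains("DEN"), combined, df["EVENT"])

# Exact match: only rows whose NAME is exactly "DEN" are changed.
by_eq = np.where(df["NAME"].eq("DEN"), combined, df["EVENT"])

print(list(by_contains))
print(list(by_eq))
```

Which check is appropriate depends on whether partial matches such as "DENVER" should count.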


Option 2
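Another method is pd.Series.mask or pd.Series.where. mask replaces values where the condition is True, while where replaces values where the condition is False, hence the inverted condition (~) in the second version: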

df.EVENT = df.EVENT.mask(df.NAME.str.contains('DEN'), df.EVENT + '_' + df.NAME)

Or,

df.EVENT = df.EVENT.where(~df.NAME.str.contains('DEN'), df.EVENT + '_' + df.NAME)

df

   EVENT NAME
0  A_DEN  DEN
1      B  HAU
2      C  TOT
3      D  ABC
4  E_DEN  DEN
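
As a quick sanity check, the sketch below runs both variants on the sample data from the question and compares them. The variable names (cond, combined, via_mask, via_where) are just illustrative.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "EVENT": ["A", "B", "C", "D", "E"],
    "NAME": ["DEN", "HAU", "TOT", "ABC", "DEN"],
})

cond = df["NAME"].str.contains("DEN")
combined = df["EVENT"] + "_" + df["NAME"]

# mask: use `combined` where cond is True, keep EVENT elsewhere.
via_mask = df["EVENT"].mask(cond, combined)

# where: keep EVENT where ~cond is True, use `combined` elsewhere.
via_where = df["EVENT"].where(~cond, combined)

# Both variants should produce the same Series.
print(via_mask.equals(via_where))
```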

Option 3
Yet another option is loc-based indexing with a boolean mask, which updates only the matching rows:

m = df.NAME.str.contains('DEN')
df.loc[m, 'EVENT'] += ('_' + df.loc[m, 'NAME'])

df

   EVENT NAME
0  A_DEN  DEN
1      B  HAU
2      C  TOT
3      D  ABC
4  E_DEN  DEN
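
One caveat that the question's data does not exercise: if NAME has missing values, str.contains can return NaN for those rows, and depending on the pandas version a mask containing NaN may be rejected by loc. A minimal sketch of one way around this is to pass na=False so missing names count as "no match". The frame below is hypothetical and only illustrates that case.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a missing NAME (not part of the question's data).
df = pd.DataFrame({
    "EVENT": ["A", "B", "C"],
    "NAME": ["DEN", np.nan, "TOT"],
})

# na=False makes rows with a missing NAME evaluate to False in the mask.
m = df["NAME"].str.contains("DEN", na=False)

# Update only the matching rows; all other EVENT values are left untouched.
df.loc[m, "EVENT"] = df.loc[m, "EVENT"] + "_" + df.loc[m, "NAME"]

print(df["EVENT"].tolist())
```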

Upvotes: 11
