This question may look like a duplicate of ones already answered, but it is a bit tricky.
Let us say I have the following data frame.
Id Col_1
1 aaa
1 ccc
2 bbb
3 aa
Based on the values in columns Id and Col_1, I want to create a new column and fill it by checking whether aa occurs in Col_1. The value should then be applied per Id, meaning every row that shares an Id with a matching row gets it too.
The expected result:
Id Col_1 New_Column
1 aaa aa
1 ccc aa
2 bbb
3 aa aa
I tried it with this:
df['New_Column'] = ((df['Id']==1) | df['Col_1'].str.contains('aa')).map({True:'aa', False:''})
and the result is
Id Col_1 New_Column
1 aaa aa
1 ccc
2 bbb
3 aa aa
But as I mentioned above, I also want to assign aa in the new column to every row with the same Id.
Can anyone help with this?
Upvotes: 1
Views: 43
Reputation: 862406
Use GroupBy.transform with GroupBy.any to get a mask for all groups with at least one row containing aa:
mask = df['Col_1'].str.contains('aa').groupby(df['Id']).transform('any')
Alternatively, use Series.isin to keep the Id values of the rows that contain aa:
mask = df['Id'].isin(df.loc[df['Col_1'].str.contains('aa'), 'Id'])
import numpy as np

df['New_Column'] = np.where(mask, 'aa', '')
print(df)
Id Col_1 New_Column
0 1 aaa aa
1 1 ccc aa
2 2 bbb
3 3 aa aa
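For reference, here is a minimal self-contained sketch that puts the two masks above side by side. The frame is rebuilt from the sample data in the question, and the names has_aa, mask_transform and mask_isin are only illustrative:

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the question.
df = pd.DataFrame({
    "Id": [1, 1, 2, 3],
    "Col_1": ["aaa", "ccc", "bbb", "aa"],
})

# Row-level test: does Col_1 contain the substring 'aa'?
has_aa = df["Col_1"].str.contains("aa")

# Option 1: broadcast "any row with this Id matched" back to every row.
mask_transform = has_aa.groupby(df["Id"]).transform("any")

# Option 2: collect the Ids of the matching rows, then test membership.
mask_isin = df["Id"].isin(df.loc[has_aa, "Id"])

# Either mask can be used here; they select the same rows for this data.
df["New_Column"] = np.where(mask_transform, "aa", "")
print(df)
```

Both masks mark every row whose Id has at least one match, so the ccc row with Id 1 is filled as well. If Col_1 can contain missing values, passing na=False to str.contains is probably worth considering so the mask stays boolean.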
EDIT: To handle several patterns, build one mask per pattern and pass them to numpy.select:
mask1 = df['Id'].isin(df.loc[df['Col_1'].str.contains('aa'), 'Id'])
mask2 = df['Id'].isin(df.loc[df['Col_1'].str.contains('bb'), 'Id'])
df['New_Column'] = np.select([mask1, mask2], ['aa','bb'],'')
print (df)
Id Col_1 New_Column
0 1 aaa aa
1 1 ccc aa
2 2 bbb bb
3 3 aa aa
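If the list of patterns grows, the masks can be built in a loop instead of one variable per pattern. This is only a sketch under assumptions: the patterns list is hypothetical, and it uses the GroupBy.transform form of the mask with regex=False so each pattern is treated as a literal substring:

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the question.
df = pd.DataFrame({
    "Id": [1, 1, 2, 3],
    "Col_1": ["aaa", "ccc", "bbb", "aa"],
})

# Hypothetical list of substrings to look for; the order sets the priority.
patterns = ["aa", "bb"]

# One group-level mask per pattern: True for every row whose Id has a match.
masks = [
    df["Col_1"].str.contains(p, regex=False).groupby(df["Id"]).transform("any")
    for p in patterns
]

# np.select takes the first condition that is True for each row,
# and falls back to the default when none of them match.
df["New_Column"] = np.select(masks, patterns, default="")
print(df)
```

Because np.select picks the first matching condition, an Id whose rows match both aa and bb would get aa here; reorder patterns if a different priority is wanted.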
Upvotes: 2