Reputation: 53
I have the following dataframe:
I would like to group by id and add a flag column that contains Y if Y has occurred at any time for that id. The resulting DF would look like the following:
Here is my approach, which is too time-consuming, and I am not sure it is correct:
temp = pd.DataFrame()
j = 'flag'
for i in df['id'].unique():
    test = df[df['id'] == i]
    test[j] = np.where(np.any(test[j] == 'Y'), 'Y', test[j])
    temp = temp.append(test)
Upvotes: 2
Views: 58
Reputation:
Compare flag to Y, group by id, and use any:
new_df = (df['flag'] == 'Y').groupby(df['id']).any().map({True:'Y', False:'N'}).reset_index()
Output:
>>> new_df
id flag
0 1 Y
1 2 Y
2 3 N
3 4 N
4 5 Y
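For anyone who wants to try this end to end, here is a minimal sketch of the same compare, group, and any steps. The sample frame is an assumption made up for illustration (the question's data is not shown), and has_y is just a hypothetical intermediate name.

```python
import pandas as pd

# Hypothetical sample data; the original frame is not shown in the question.
df = pd.DataFrame({
    'id':   [1, 1, 2, 2, 3, 4, 5, 5],
    'flag': ['Y', 'N', 'N', 'Y', 'N', 'N', 'Y', 'Y'],
})

# Boolean mask of flagged rows, reduced per id: True if any row of that id is 'Y'.
has_y = (df['flag'] == 'Y').groupby(df['id']).any()

# Map the booleans back to 'Y'/'N' and turn the id index into a regular column.
new_df = has_y.map({True: 'Y', False: 'N'}).reset_index()
print(new_df)
```

Comparing to 'Y' first means any other value in flag (not only 'N') is treated as "not flagged", which may or may not be what you want.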
Upvotes: 1
Reputation: 71689
You can do groupby + max, since 'Y' > 'N' in string comparison:
df.groupby('id', as_index=False)['flag'].max()
id flag
0 1 Y
1 2 Y
2 3 N
3 4 N
4 5 Y
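If the goal is to keep every original row and only add the flag, as the question's loop seems to do, one possible variant is to broadcast the per-group max back with transform. This is a sketch under assumptions: the sample frame is made up, flag_any is a hypothetical column name, and it relies on flag holding only 'Y' and 'N'.

```python
import pandas as pd

# Hypothetical sample data; the original frame is not shown in the question.
df = pd.DataFrame({
    'id':   [1, 1, 2, 2, 3, 4, 5, 5],
    'flag': ['Y', 'N', 'N', 'Y', 'N', 'N', 'Y', 'Y'],
})

# Aggregated form from the answer above: one row per id.
per_id = df.groupby('id', as_index=False)['flag'].max()

# Row-preserving variant: each row receives the max flag of its id group.
# 'flag_any' is a hypothetical column name chosen for this sketch.
df['flag_any'] = df.groupby('id')['flag'].transform('max')
print(df)
```

Both forms avoid the Python-level loop over unique ids, which is likely where the time in the original approach goes.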
Upvotes: 3