Reputation: 191
I need to change values in my pandas DataFrame: if Positive appears more than 2 times in a row, every Positive after the second one should be changed to Negative. The counting has to be done id by id, so rows with a different id are counted separately. If a Negative breaks the run before it exceeds 2 in a row, or if Negative itself appears more than 2 times in a row, nothing should be changed.
Example:
   id    status
0   3  Positive
1   3  Positive
2   3  Positive
3   2  Positive
4   1  Positive
5   2  Positive
6   2  Positive
7   2  Positive
The resulting df should be:
   id    status
0   3  Positive
1   3  Positive
2   3  Negative
3   2  Positive
4   1  Positive
5   2  Positive
6   2  Negative
7   2  Negative
Upvotes: 0
Views: 95
Reputation: 150735
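We can flag the Positive rows, group the flags by id, and take a rolling sum over a window of 3 within each group. Wherever that sum equals 3, the row is the third (or later) consecutive Positive for its id, so we set it to Negative with df.loc: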
mask = (df['status'].eq('Positive')                 # True where status is Positive
          .groupby(df['id'])                         # group by id
          .transform(lambda x: x.rolling(3).sum())   # number of Positives in the last 3 rows of that id
          .eq(3)                                     # True only when all 3 are Positive
       )
df.loc[mask, 'status'] = 'Negative'
Output:
   id    status
0   3  Positive
1   3  Positive
2   3  Negative
3   2  Positive
4   1  Positive
5   2  Positive
6   2  Negative
7   2  Negative
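For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same rolling-sum idea. It rebuilds the sample frame from the question; the helper name `flag_long_positive_runs` and its `window` parameter are my own additions for illustration, not part of pandas. The mask is computed from the original statuses, which is what the expected output in the question implies.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [3, 3, 3, 2, 1, 2, 2, 2],
    "status": ["Positive"] * 8,
})


def flag_long_positive_runs(frame, window=3):
    """Hypothetical helper: True where a row is at least the `window`-th
    consecutive Positive within its id."""
    # 1 for Positive, 0 otherwise, so the rolling sum counts Positives.
    is_pos = frame["status"].eq("Positive").astype(int)
    # Rolling sum per id; the first window-1 rows of each id are NaN.
    rolling_sum = is_pos.groupby(frame["id"]).transform(
        lambda s: s.rolling(window).sum()
    )
    # NaN compares unequal, so short groups are never flagged.
    return rolling_sum.eq(window)


result = df.copy()
result.loc[flag_long_positive_runs(result), "status"] = "Negative"
print(result)
```

With the sample data this should flag rows 2, 6 and 7, matching the output above. A Negative inside a run resets the count on its own, because any window of 3 that contains it sums to less than 3.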
Upvotes: 1