Reputation: 35
Given a dataframe df:
import pandas as pd
df = pd.DataFrame({"changes": ['increase', 'constant', 'constant', 'constant', 'decline', 'constant', 'constant', 'increase', 'constant', 'constant', 'constant', 'decline', 'constant', 'constant', 'constant']})
output:
changes |
---|
increase |
constant |
constant |
constant |
decline |
constant |
constant |
increase |
constant |
constant |
constant |
decline |
constant |
constant |
constant |
The task is to delete the rows with `decline` and the `constant` rows that come after it.
I do not want to remove `increase` and the `constant` rows coming after it.
The expected output in this case should look like:
changes |
---|
increase |
constant |
constant |
constant |
increase |
constant |
constant |
constant |
Upvotes: 1
Views: 74
Reputation: 2188
import pandas as pd
df = pd.DataFrame({"changes":['increase', 'constant', 'constant', 'constant', 'decline', 'constant', 'constant', 'increase', 'constant', 'constant', 'constant','decline', 'constant', 'constant', 'constant']})
### Build group
df['group'] = df['changes'].ne(df['changes'].shift()).cumsum()
df
###
changes group
0 increase 1
1 constant 2
2 constant 2
3 constant 2
4 decline 3
5 constant 4
6 constant 4
7 increase 5
8 constant 6
9 constant 6
10 constant 6
11 decline 7
12 constant 8
13 constant 8
14 constant 8
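To see why the `ne`/`shift`/`cumsum` chain labels consecutive runs, here is a minimal sketch on a shortened series. The names `s`, `starts` and `group` are only for illustration and are not part of the code above.

```python
import pandas as pd

# Shortened example series (illustrative data, not the full df)
s = pd.Series(['increase', 'constant', 'constant', 'decline', 'constant'])

# True wherever the value differs from the previous row;
# the first row compares against NaN, so it is always True
starts = s.ne(s.shift())

# Running count of run starts: each consecutive run gets its own id
group = starts.cumsum()

print(pd.DataFrame({'changes': s, 'run_start': starts, 'group': group}))
```

Every row of the same uninterrupted run shares one `group` value, which is what lets a whole run of `constant` rows be dropped at once.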
Create masks to filter out unwanted data
mask_1 = df['changes'].eq('decline') & df['changes'].shift(-1).eq('constant')
mask_2 = df['changes'].eq('constant') & df['changes'].shift().eq('decline')
groups = df.loc[mask_1 | mask_2, 'group']
groups
This indicates that groups 3, 4, 7, and 8 should be excluded.
Assign filtered data to result
result = df[~df['group'].isin(groups)].drop(columns=['group']).reset_index(drop=True)
result
###
changes
0 increase
1 constant
2 constant
3 constant
4 increase
5 constant
6 constant
7 constant
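As a possible shorter alternative (a different technique from the group/mask approach, offered only as a sketch): forward-fill the most recent non-`constant` label and drop every row whose current "state" is `decline`. The name `state` is hypothetical. Note one behavioral difference: this version also removes a `decline` row that is not followed by any `constant`, which the masks above would keep.

```python
import pandas as pd

df = pd.DataFrame({"changes": ['increase', 'constant', 'constant', 'constant',
                               'decline', 'constant', 'constant',
                               'increase', 'constant', 'constant', 'constant',
                               'decline', 'constant', 'constant', 'constant']})

# Blank out 'constant' rows, then carry the last 'increase'/'decline' forward
state = df['changes'].where(df['changes'].ne('constant')).ffill()

# Keep only rows whose most recent non-constant label is not 'decline'
result = df[state.ne('decline')].reset_index(drop=True)
print(result)
```

Leading `constant` rows that appear before any `increase` or `decline` have no state and are kept under this sketch.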
Upvotes: 2