DeleLinus

Reputation: 35

A function in pandas or numpy to filter column by list of values that follows it

Given a dataframe df:

import pandas as pd
df = pd.DataFrame({"changes": ['increase', 'constant', 'constant', 'constant', 'decline', 'constant', 'constant', 'increase', 'constant', 'constant', 'constant', 'decline', 'constant', 'constant', 'constant']})

output:

changes
increase
constant
constant
constant
decline
constant
constant
increase
constant
constant
constant
decline
constant
constant
constant

The task is to delete every decline row together with the run of constant rows that immediately follows it. Each increase row and the constant rows that follow it should be kept.

The expected output in this case should look like:

changes
increase
constant
constant
constant
increase
constant
constant
constant

Upvotes: 1

Views: 74

Answers (1)

Baron Legendre

Reputation: 2188

import pandas as pd

df = pd.DataFrame({"changes":['increase', 'constant', 'constant', 'constant', 'decline', 'constant', 'constant', 'increase', 'constant', 'constant', 'constant','decline', 'constant', 'constant', 'constant']})

### Build group
df['group'] = df['changes'].ne(df['changes'].shift()).cumsum()
df
###
     changes  group
0   increase      1
1   constant      2
2   constant      2
3   constant      2
4    decline      3
5   constant      4
6   constant      4
7   increase      5
8   constant      6
9   constant      6
10  constant      6
11   decline      7
12  constant      8
13  constant      8
14  constant      8
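To see why that one-liner labels runs of equal values, it may help to split it into its intermediate steps on a shorter series. This is only a sketch: the toy series `s` and the names `prev`, `starts_run` and `run_id` are illustrative and not part of the original answer.

```python
import pandas as pd

# Toy data, shorter than the question's column (illustrative only)
s = pd.Series(['increase', 'constant', 'constant', 'decline', 'constant'])

prev = s.shift()              # value from the previous row (NaN for the first row)
starts_run = s.ne(prev)       # True wherever a row differs from the row before it
run_id = starts_run.cumsum()  # counting the run starts gives each run its own label

steps = pd.DataFrame({'changes': s, 'starts_run': starts_run, 'run_id': run_id})
print(steps)
```

Consecutive equal values share a `run_id`, so the two adjacent constant rows here end up in the same group, which is what lets a whole run be dropped at once later.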

Create masks to filter out unwanted data

mask_1 = df['changes'].eq('decline') & df['changes'].shift(-1).eq('constant')
mask_2 = df['changes'].eq('constant') & df['changes'].shift().eq('decline')
groups = df.loc[mask_1 | mask_2, 'group']
groups

The masks select rows 4, 5, 11 and 12, which belong to groups 3, 4, 7 and 8, so those groups should be excluded.

Assign filtered data to result

result = df[~df['group'].isin(groups)].drop(columns=['group']).reset_index(drop=True)
result
###
    changes
0  increase
1  constant
2  constant
3  constant
4  increase
5  constant
6  constant
7  constant
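For comparison, a more compact variant is possible if the rule is read as "drop every row whose most recent non-constant value is decline". This is a different technique from the group and mask approach above, not a restatement of it, and the names `last_event` and `result_alt` are made up for this sketch.

```python
import pandas as pd

df = pd.DataFrame({"changes": ['increase', 'constant', 'constant', 'constant',
                               'decline', 'constant', 'constant',
                               'increase', 'constant', 'constant', 'constant',
                               'decline', 'constant', 'constant', 'constant']})

# Blank out the 'constant' rows, then carry the last non-constant label forward
last_event = df['changes'].where(df['changes'].ne('constant')).ffill()

# Keep the rows whose most recent event is anything other than 'decline'
result_alt = df[last_event.ne('decline')].reset_index(drop=True)
print(result_alt)
```

On the sample data this gives the same eight rows as `result`. The two approaches can differ on other inputs: this one also drops a decline that is not followed by any constant, and it keeps constant rows that appear before the first increase or decline, so it is worth checking which behaviour is wanted.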

Upvotes: 2
