Reputation: 167

Conditionbased check based on previous and following row in df

I want to do an condition based comparison on the following df.

     ID     type  state   value
0  123123     1    LOW      5
2  123123     2    LOW      8
5  123123     2    HIGH     2
7  123123     1    LOW      3
1  123412     2    HIGH     21
6  123412     1    LOW      12
3  234234     2    LOW      5
8  234234     1    LOW      1
4  432424     2    HIGH     9
9  432424     1    HIGH     7

So what I want to to do is check for every row where the value in column state is LOW if the following or the value before in the row is HIGH. If so I want to check if the values in row type are the same. If again this condition is fulfilled i want to check if the difference between this values is smaller than 2. If so a new column named validate should be true if not it should be false.

So summarized I am looking for a way to efficiency check for this condition.

I thought about doing it with apply, however I am not sure how I then can check the values before and after the given row.

So far I only thought about doing it by iterating over the data frame, but that should be super slow... so if any of you has an idea how to solve this problem, I would be very grateful.

outcome should be like this:

     ID     type  state   value  valid
0  123123     1    LOW      5      False
2  123123     2    LOW      4      True
5  123123     2    HIGH     2      
7  123123     1    LOW      3      True
1  123412     1    HIGH     21     
6  123412     1    LOW      12     False
3  234234     2    LOW      5      False
8  234234     1    LOW      1      False
4  432424     2    HIGH     9      
9  432424     1    HIGH     7

Here are again the row based conditions summarised for the row to be true

either state before or after row is HIGH
for the row that is high (or if applicable both) then check if type is the same
lastly check if the difference between the value is smaller than 2

Upvotes: 0

Answers (3)

tdy

Reputation: 41487

Use shift() and diff() to create an above | below mask for the state, type, and value conditions.

below = df.state.eq('LOW') & df.state.shift(-1).eq('HIGH') & df.type.shift(-1).eq(df.type) & df.value.diff(-1).abs().le(2)
above = df.state.eq('LOW') & df.state.shift().eq('HIGH') & df.type.shift().eq(df.type) & df.value.diff().abs().le(2)

df.loc[df.state.eq('LOW'), 'valid'] = False
df.loc[above | below, 'valid'] = True

The output is different for index 7, but I don't understand why your expected output shows True since the above type is different (1 vs 2) and below value difference is 18:

       ID  type state  value  valid
0  123123     1   LOW      5  False
2  123123     2   LOW      4   True
5  123123     2  HIGH      2    NaN
7  123123     1   LOW      3  False
1  123412     2  HIGH     21    NaN
6  123412     1   LOW     12  False
3  234234     2   LOW      5  False
8  234234     1   LOW      1  False
4  432424     2  HIGH      9    NaN
9  432424     1  HIGH      7    NaN

Upvotes: 1

Nk03

Reputation: 14949

Try this -

df.loc[df.state == 'LOW', 'valid'] = False
df.loc[(df.state != df.state.shift(-1)) & (df.state == 'LOW'), 'valid'] = True

Upvotes: 1

kemal-nl

Reputation: 1

You can use the df.shift method and compare your df to the shifted one. That way there is no need to loop.

Upvotes: 0

Conditionbased check based on previous and following row in df

Answers (3)

Related Questions