Reputation: 167
I want to do an condition based comparison on the following df.
ID type state value
0 123123 1 LOW 5
2 123123 2 LOW 8
5 123123 2 HIGH 2
7 123123 1 LOW 3
1 123412 2 HIGH 21
6 123412 1 LOW 12
3 234234 2 LOW 5
8 234234 1 LOW 1
4 432424 2 HIGH 9
9 432424 1 HIGH 7
So what I want to to do is check for every row where the value in column state is LOW if the following or the value before in the row is HIGH. If so I want to check if the values in row type are the same. If again this condition is fulfilled i want to check if the difference between this values is smaller than 2. If so a new column named validate should be true if not it should be false.
So summarized I am looking for a way to efficiency check for this condition.
I thought about doing it with apply, however I am not sure how I then can check the values before and after the given row.
So far I only thought about doing it by iterating over the data frame, but that should be super slow... so if any of you has an idea how to solve this problem, I would be very grateful.
outcome should be like this:
ID type state value valid
0 123123 1 LOW 5 False
2 123123 2 LOW 4 True
5 123123 2 HIGH 2
7 123123 1 LOW 3 True
1 123412 1 HIGH 21
6 123412 1 LOW 12 False
3 234234 2 LOW 5 False
8 234234 1 LOW 1 False
4 432424 2 HIGH 9
9 432424 1 HIGH 7
Here are again the row based conditions summarised for the row to be true
Upvotes: 0
Views: 60
Reputation: 41487
Use shift()
and diff()
to create an above | below
mask for the state
, type
, and value
conditions.
below = df.state.eq('LOW') & df.state.shift(-1).eq('HIGH') & df.type.shift(-1).eq(df.type) & df.value.diff(-1).abs().le(2)
above = df.state.eq('LOW') & df.state.shift().eq('HIGH') & df.type.shift().eq(df.type) & df.value.diff().abs().le(2)
df.loc[df.state.eq('LOW'), 'valid'] = False
df.loc[above | below, 'valid'] = True
The output is different for index 7, but I don't understand why your expected output shows True
since the above type
is different (1 vs 2) and below value
difference is 18:
ID type state value valid
0 123123 1 LOW 5 False
2 123123 2 LOW 4 True
5 123123 2 HIGH 2 NaN
7 123123 1 LOW 3 False
1 123412 2 HIGH 21 NaN
6 123412 1 LOW 12 False
3 234234 2 LOW 5 False
8 234234 1 LOW 1 False
4 432424 2 HIGH 9 NaN
9 432424 1 HIGH 7 NaN
Upvotes: 1
Reputation: 14949
Try this -
df.loc[df.state == 'LOW', 'valid'] = False
df.loc[(df.state != df.state.shift(-1)) & (df.state == 'LOW'), 'valid'] = True
Upvotes: 1
Reputation: 1
You can use the df.shift method and compare your df to the shifted one. That way there is no need to loop.
Upvotes: 0