Kippi
Kippi

Reputation: 514

finding places in a pandas series where the value changes from one specific value to another

I've been trying for hours to find a solution to this problem and reaching a deadline so I'm hoping I will find salvation here: I have a series that contains several optional strings as values that represent a state of a medicine used during a hospital stay: 'virgin', 'start', 'on', 'off', 'prev' for example:

time                   value

2007-04-10 13:06:00    virgin
2007-04-10 13:07:00    virgin
2007-04-10 13:08:00     start
2007-04-10 13:09:00        on
2007-04-10 13:10:00        on
2007-04-10 13:11:00        on
2007-04-10 13:12:00        on
2007-04-10 13:13:00        on
2007-04-10 13:14:00        on
2007-04-10 13:15:00        on
2007-04-10 13:16:00        on
2007-04-10 13:17:00        on
2007-04-10 13:18:00        on
2007-04-10 13:19:00        on
2007-04-10 13:20:00        on
2007-04-10 13:21:00        on
2007-04-10 13:22:00        on
2007-04-10 13:23:00        on
2007-04-10 13:24:00        on
2007-04-10 13:25:00        on
2007-04-10 13:26:00        on
2007-04-10 13:27:00        on
2007-04-10 13:28:00        on
2007-04-10 13:29:00        on
2007-04-10 13:30:00        on
2007-04-10 13:31:00        on
2007-04-10 13:32:00        on
2007-04-10 13:33:00        off
2007-04-10 13:34:00        off
2007-04-10 13:35:00        off
                        ...  
2007-04-19 15:06:00      prev
2007-04-19 15:07:00      prev
2007-04-19 15:08:00      prev

what I need to do is to locate places where df['value] == 'on' is followed by df['value] == 'off', so that I can change the value of that place to 'stop'.

I've looked everywhere and cannot find a solution, nor have I found a question here that corresponds to this need. If you know of such a question -please - do not downvote - just point me in the right direction - I feel like I've made every effort to solve this problem before posting.

Thank you!

Upvotes: 2

Views: 102

Answers (1)

EdChum
EdChum

Reputation: 393973

IIUC then you can combine your boolean conditions and use shift to test the previous row value:

In [44]:
df['state_stop'] = (df['value']=='off') & (df['value'].shift() =='on')
df

Out[44]:
                  time   value state_stop
0  2007-04-10 13:06:00  virgin      False
1  2007-04-10 13:07:00  virgin      False
2  2007-04-10 13:08:00   start      False
3  2007-04-10 13:09:00      on      False
4  2007-04-10 13:10:00      on      False
5  2007-04-10 13:11:00      on      False
6  2007-04-10 13:12:00      on      False
7  2007-04-10 13:13:00      on      False
8  2007-04-10 13:14:00      on      False
9  2007-04-10 13:15:00      on      False
10 2007-04-10 13:16:00      on      False
11 2007-04-10 13:17:00      on      False
12 2007-04-10 13:18:00      on      False
13 2007-04-10 13:19:00      on      False
14 2007-04-10 13:20:00      on      False
15 2007-04-10 13:21:00      on      False
16 2007-04-10 13:22:00      on      False
17 2007-04-10 13:23:00      on      False
18 2007-04-10 13:24:00      on      False
19 2007-04-10 13:25:00      on      False
20 2007-04-10 13:26:00      on      False
21 2007-04-10 13:27:00      on      False
22 2007-04-10 13:28:00      on      False
23 2007-04-10 13:29:00      on      False
24 2007-04-10 13:30:00      on      False
25 2007-04-10 13:31:00      on      False
26 2007-04-10 13:32:00      on      False
27 2007-04-10 13:33:00     off       True
28 2007-04-10 13:34:00     off      False
29 2007-04-10 13:35:00     off      False
30 2007-04-19 15:06:00    prev      False
31 2007-04-19 15:07:00    prev      False
32 2007-04-19 15:08:00    prev      False

So this combines your 2 boolean conditions using & wrapped in parentheses due to operator precedence and tests the previous row using shift, as you can see entry 27 2007-04-10 13:33:00 off True is set to True as desired, you can modify the statement to this:

In [47]:
df.loc[(df['value']=='off') & (df['value'].shift() =='on'), 'value'] = 'stop'
df

Out[47]:
                  time   value
0  2007-04-10 13:06:00  virgin
1  2007-04-10 13:07:00  virgin
2  2007-04-10 13:08:00   start
3  2007-04-10 13:09:00      on
4  2007-04-10 13:10:00      on
5  2007-04-10 13:11:00      on
6  2007-04-10 13:12:00      on
7  2007-04-10 13:13:00      on
8  2007-04-10 13:14:00      on
9  2007-04-10 13:15:00      on
10 2007-04-10 13:16:00      on
11 2007-04-10 13:17:00      on
12 2007-04-10 13:18:00      on
13 2007-04-10 13:19:00      on
14 2007-04-10 13:20:00      on
15 2007-04-10 13:21:00      on
16 2007-04-10 13:22:00      on
17 2007-04-10 13:23:00      on
18 2007-04-10 13:24:00      on
19 2007-04-10 13:25:00      on
20 2007-04-10 13:26:00      on
21 2007-04-10 13:27:00      on
22 2007-04-10 13:28:00      on
23 2007-04-10 13:29:00      on
24 2007-04-10 13:30:00      on
25 2007-04-10 13:31:00      on
26 2007-04-10 13:32:00      on
27 2007-04-10 13:33:00    stop
28 2007-04-10 13:34:00     off
29 2007-04-10 13:35:00     off
30 2007-04-19 15:06:00    prev
31 2007-04-19 15:07:00    prev
32 2007-04-19 15:08:00    prev

Upvotes: 3

Related Questions