Reputation: 438
I have the following pandas dataframe where the first column is the datetime
index. I am trying to achieve the desired_output
column which increments every time the flag changes from 0 to 1 or 1 to 0. I have been able to achieve this type of thing in SQL however after finding that pandasql sqldf
for some strange reason changes the values of the field undergoing the partition
I am now trying to achieve this using regular python syntax.
Any help would be much appreciated.
+-------------+------+----------------+
| date(index) | flag | desired_output |
+-------------+------+----------------+
| 1/01/2020 | 0 | 1 |
| 2/01/2020 | 0 | 1 |
| 3/01/2020 | 0 | 1 |
| 4/01/2020 | 1 | 2 |
| 5/01/2020 | 1 | 2 |
| 6/01/2020 | 0 | 3 |
| 7/01/2020 | 1 | 4 |
| 8/01/2020 | 1 | 4 |
| 9/01/2020 | 1 | 4 |
| 10/01/2020 | 1 | 4 |
| 11/01/2020 | 1 | 4 |
| 12/01/2020 | 1 | 4 |
| 13/01/2020 | 0 | 5 |
| 14/01/2020 | 0 | 5 |
| 15/01/2020 | 0 | 5 |
| 16/01/2020 | 0 | 5 |
| 17/01/2020 | 1 | 6 |
| 18/01/2020 | 0 | 7 |
| 19/01/2020 | 0 | 7 |
| 20/01/2020 | 0 | 7 |
| 21/01/2020 | 0 | 7 |
| 22/01/2020 | 1 | 8 |
| 23/01/2020 | 1 | 8 |
+-------------+------+----------------+
Upvotes: 0
Views: 176
Reputation: 22503
Use diff
and cumsum
:
print (df["flag"].diff().ne(0).cumsum())
0 1
1 1
2 1
3 2
4 2
5 3
6 4
7 4
8 4
9 4
10 4
11 4
12 5
13 5
14 5
15 5
16 6
17 7
18 7
19 7
20 7
21 8
22 8
Upvotes: 2