Increment rank each time flag changes

Question

I have the following pandas dataframe where the first column is the datetime index. I am trying to achieve the desired_output column which increments every time the flag changes from 0 to 1 or 1 to 0. I have been able to achieve this type of thing in SQL however after finding that pandasql sqldf for some strange reason changes the values of the field undergoing the partition I am now trying to achieve this using regular python syntax.

Any help would be much appreciated.

+-------------+------+----------------+
| date(index) | flag | desired_output |
+-------------+------+----------------+
| 1/01/2020   |    0 |              1 |
| 2/01/2020   |    0 |              1 |
| 3/01/2020   |    0 |              1 |
| 4/01/2020   |    1 |              2 |
| 5/01/2020   |    1 |              2 |
| 6/01/2020   |    0 |              3 |
| 7/01/2020   |    1 |              4 |
| 8/01/2020   |    1 |              4 |
| 9/01/2020   |    1 |              4 |
| 10/01/2020  |    1 |              4 |
| 11/01/2020  |    1 |              4 |
| 12/01/2020  |    1 |              4 |
| 13/01/2020  |    0 |              5 |
| 14/01/2020  |    0 |              5 |
| 15/01/2020  |    0 |              5 |
| 16/01/2020  |    0 |              5 |
| 17/01/2020  |    1 |              6 |
| 18/01/2020  |    0 |              7 |
| 19/01/2020  |    0 |              7 |
| 20/01/2020  |    0 |              7 |
| 21/01/2020  |    0 |              7 |
| 22/01/2020  |    1 |              8 |
| 23/01/2020  |    1 |              8 |
+-------------+------+----------------+

Henry Yik · Accepted Answer

Use diff and cumsum:

print (df["flag"].diff().ne(0).cumsum())


0     1
1     1
2     1
3     2
4     2
5     3
6     4
7     4
8     4
9     4
10    4
11    4
12    5
13    5
14    5
15    5
16    6
17    7
18    7
19    7
20    7
21    8
22    8

Increment rank each time flag changes

Answers (1)

Related Questions