Tetrasodium

Reputation: 35

Grouping the same value based on a column increment using Python pandas

I am trying to group consecutive values of 0 over a period of time in a pandas DataFrame. For example, I have:

| Time (seconds) | Value |
|       1        |   0   |
|       2        |   0   |
|       3        |   0   |
|       4        |   1   |
|       5        |   0   |
|       6        |   1   |
|       7        |   1   |
|       8        |   0   |
|       9        |   0   |
|       10       |   0   |
|       11       |   1   |
|       12       |   0   |
|       13       |   0   |

The output I'm expecting is:

| Time (seconds) | Value | Group |
|       1        |   0   |   1   |
|       2        |   0   |   1   |
|       3        |   0   |   1   |
|       4        |   1   |       |
|       5        |   0   |   2   |
|       6        |   1   |       |
|       7        |   1   |       |
|       8        |   0   |   3   |
|       9        |   0   |   3   |
|       10       |   0   |   3   |
|       11       |   1   |       |
|       12       |   0   |   4   |
|       13       |   0   |   4   |

Upvotes: 2

Views: 322

Answers (2)

mozway

Reputation: 260410

You can detect where a run of zeros starts using shift, number the runs with cumsum, and blank out the non-zero rows with where:

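s = df['value'].eq(0)  # True where the value is 0
# flag the first row of each run of zeros, count the runs, and set non-zero rows to 0
df['group'] = (s & s.ne(s.shift())).cumsum().where(s, 0)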

Output:

    time  value  group
0      1      0      1
1      2      0      1
2      3      0      1
3      4      1      0
4      5      0      2
5      6      1      0
6      7      1      0
7      8      0      3
8      9      0      3
9     10      0      3
10    11      1      0
11    12      0      4
12    13      0      4
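To make the one-liner easier to follow, here is a minimal sketch that builds the sample data and splits the expression into named steps. The frame construction and the intermediate names (`is_zero`, `run_start`, `run_id`) are assumptions added for illustration; the lowercase column names follow the output shown above.

```python
import pandas as pd

# Sample data from the question (lowercase column names, as in the output above)
df = pd.DataFrame({
    "time": range(1, 14),
    "value": [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0],
})

# True on every row whose value is 0
is_zero = df["value"].eq(0)

# True only on the first row of each run of zeros:
# the row is zero and differs from the row before it
run_start = is_zero & is_zero.ne(is_zero.shift())

# Each run start increments the counter, so every later row carries the run number
run_id = run_start.cumsum()

# Keep the run number on zero rows, use 0 everywhere else
df["group"] = run_id.where(is_zero, 0)

print(df)
```

If blanks are preferred over 0 on the non-zero rows, dropping the second argument to `where` should leave those rows as NaN instead.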

Upvotes: 1

BENY

Reputation: 323226

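You can take the cumsum of the non-zero flags, which stays constant within each run of zeros, then pass the values on the zero rows to factorize to number the runs: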

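s = df.Value.ne(0)  # True where the value is not 0

# number the runs of zeros from 1; non-zero rows are left as NaN
df.loc[df.index[~s], 'new'] = s.cumsum()[~s].factorize()[0] + 1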
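As a rough, self-contained sketch of the same idea on the question's data: the frame construction, the intermediate names (`nonzero`, `block`, `codes`), and the final cast to the nullable `Int64` dtype are assumptions added here, and the assignment goes through an index-aligned Series rather than the `loc` call above.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Time": range(1, 14),
    "Value": [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0],
})

# True on rows whose value is not 0
nonzero = df["Value"].ne(0)

# The running count of non-zero rows is constant inside each run of zeros;
# keep it only on the zero rows
block = nonzero.cumsum()[~nonzero]

# factorize maps each distinct block value to 0, 1, 2, ... in order of appearance
codes, _ = pd.factorize(block)

# Align back on the original index; rows missing from `block` become NaN
df["new"] = pd.Series(codes + 1, index=block.index)

# Optional: nullable integers, so groups print as 1, 2, ... and gaps as <NA>
df["new"] = df["new"].astype("Int64")

print(df)
```

Unlike the shift-based answer, this leaves the non-zero rows empty rather than 0, which is closer to the blank cells in the expected output.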

Upvotes: 0
