Chris C

Reputation: 619

Pandas - Row number since last greater than 0 value

Let's say I have a Pandas series like so:

import pandas as pd

s = pd.Series([1, 0, 0, 1, 0, 0, 0], name='series')

How would I add a column that counts the rows since the last value greater than 0, like so:

pd.DataFrame({
    'series': [1, 0, 0, 1, 0, 0, 0],
    'row_num': [0, 1, 2, 0, 1, 2, 3]
})

Upvotes: 6

Views: 645

Answers (2)

piRSquared

Reputation: 294488

Numpy

  • Find the places where the series/array is greater than 0
  • Calculate the differences from one place to the next
  • Subtract those values from a sequence

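import numpy as np

i = np.flatnonzero(s)             # positions of the nonzero entries
n = len(s)
delta = np.diff(np.append(i, n))  # length of each run up to the next nonzero entry
r = np.arange(n)
r - r[i].repeat(delta)            # assumes s[0] is nonzero, so the runs cover every row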

array([0, 1, 2, 0, 1, 2, 3])
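
The snippet above relies on the first element being nonzero; otherwise the repeat lengths do not add up to len(s). As a rough sketch of one way around that (a different technique, using np.maximum.accumulate instead of repeat, and treating "greater than 0" literally as a > 0), something like the following might work. The function name rows_since_positive_np is made up for illustration, and rows before the first positive value are counted from the start of the array, which is an assumption about the desired behaviour.

```python
import numpy as np

def rows_since_positive_np(values):
    """Hypothetical helper: rows elapsed since the last value > 0."""
    a = np.asarray(values)
    r = np.arange(len(a))
    # Keep the position where a > 0, else 0, then carry the latest
    # such position forward with a running maximum.
    last = np.maximum.accumulate(np.where(a > 0, r, 0))
    return r - last

print(rows_since_positive_np([1, 0, 0, 1, 0, 0, 0]).tolist())  # -> [0, 1, 2, 0, 1, 2, 3]
print(rows_since_positive_np([0, 0, 1, 0]).tolist())           # -> [0, 1, 0, 1]
```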

Upvotes: 1

Scott Boston

Reputation: 153500

Try this:

s.groupby(s.cumsum()).cumcount()

Output:

0    0
1    1
2    2
3    0
4    1
5    2
6    3
dtype: int64
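
Grouping on s.cumsum() works as long as the series has no negative values, since the running sum then changes only at positive entries. If negative values can occur, a possible variant is to group on the cumulative sum of the boolean mask s > 0 instead. The sketch below assumes that reading of "greater than 0"; the helper name rows_since_positive is hypothetical, and attaching the result as a row_num column follows the layout asked for in the question.

```python
import pandas as pd

def rows_since_positive(s: pd.Series) -> pd.Series:
    """Hypothetical helper: rows elapsed since the last value > 0."""
    # Every positive value starts a new group; the cumulative sum of
    # the boolean mask gives each group its own label.
    groups = (s > 0).cumsum()
    # cumcount numbers the rows within each group starting at 0.
    return s.groupby(groups).cumcount()

s = pd.Series([1, 0, 0, 1, 0, 0, 0], name='series')
df = s.to_frame().assign(row_num=rows_since_positive(s))
print(df['row_num'].tolist())  # -> [0, 1, 2, 0, 1, 2, 3]
```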

Upvotes: 10
