RunTheGauntlet

Reputation: 302

Count how many initial elements in a Pandas Series are equal to a certain value?

As in the title. I know how to compute it, but is there a better, faster, or more elegant way to do this? cnt is the result.

import numpy as np
import pandas as pd

s = pd.Series(np.random.randint(2, size=10))
cnt = 0  # number of leading zeros
for n in s:
    if n != 0:
        break  # stop at the first non-zero element
    cnt += 1

Upvotes: 1

Views: 830

Answers (3)

mac13k

Reputation: 2663

You can use cumsum() in a mask and then sum() to get the number of initial 0s in the sequence:

s = pd.Series(np.random.randint(2, size=10))
(s.cumsum() == 0).sum()

Note that this only counts leading 0s, and it assumes the values are non-negative so that the running sum never returns to 0. To count the leading run of whatever value the series starts with, you can generalize it, e.g.:

(s.sub(s.iloc[0]).abs().cumsum() == 0).sum()
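
To see why the mask works, here is a minimal sketch that prints the intermediate steps. The two input series are made-up illustration data, not taken from the question. The .abs() call is a guard that keeps the running total from returning to zero when values differ from the first one in both directions.

```python
import pandas as pd

# Made-up example data (an assumption for illustration only).
s = pd.Series([0, 0, 1, 0, 1])

running_total = s.cumsum()   # 0, 0, 1, 1, 2
mask = running_total == 0    # True, True, False, False, False
leading_zeros = int(mask.sum())

# Same idea for the value the series starts with: subtract it first,
# so the leading run becomes a run of zeros.
t = pd.Series([2, 2, 3, 1, 2])
shifted = t.sub(t.iloc[0]).abs()   # 0, 0, 1, 1, 0
leading_first = int((shifted.cumsum() == 0).sum())

print(leading_zeros, leading_first)
```

Once the running total becomes positive it cannot drop back to 0, so only the leading run is counted.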

Upvotes: 2

David Erickson

Reputation: 16683

I did this in a DataFrame because it is easier to show, but the idea is to use the vectorized .cumsum to speed up your code and then select the rows where the running sum == 0 with .loc. The count is the length of that selection, via len:

import pandas as pd, numpy as np
s = pd.DataFrame(pd.Series(np.random.randint(2, size=10)))
s['t'] = s[0].cumsum()
o = len(s.loc[s['t']==0])
o

If you assign o to a column with s['o'] = o, then the output looks like this:

    0   t   o
0   0   0   2
1   0   0   2
2   1   1   2
3   1   2   2
4   0   2   2
5   1   3   2
6   1   4   2
7   1   5   2
8   1   6   2
9   0   6   2
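
Because the snippet above uses random data, the table will differ from run to run. As a rough sketch, the same table can be rebuilt deterministically by hard-coding column 0 from the output shown. The name df is my own choice here, not from the answer.

```python
import pandas as pd

# Values copied from column 0 of the table above.
df = pd.DataFrame({0: [0, 0, 1, 1, 0, 1, 1, 1, 1, 0]})

df['t'] = df[0].cumsum()          # running sum of the values
o = len(df.loc[df['t'] == 0])     # rows before the first non-zero value
df['o'] = o                       # broadcast the count to every row

print(df)
```

The helper DataFrame is only there to display the intermediate columns. The count itself is just o.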

Upvotes: 2

Shubham Sharma

Reputation: 71687

Use Series.eq to create a boolean mask, then use Series.cummin to take a cumulative minimum over that mask, and finally use Series.sum to get the total count:

cnt = s.eq(0).cummin().sum()

Example:

np.random.seed(9)
s = pd.Series(np.random.randint(2, size=10))

# print(s)
0    0
1    0
2    0
3    1
4    0
5    0
6    1
7    0
8    1
9    1
dtype: int64

cnt = s.eq(0).cummin().sum()
# print(cnt)
3
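
Since the question asks about "a certain value" rather than only 0, the one-liner could be wrapped in a small helper. This is a minimal sketch. The function name count_leading is hypothetical, and the input values are copied from the example output above.

```python
import pandas as pd

def count_leading(s: pd.Series, value) -> int:
    """Count how many elements at the start of `s` equal `value`."""
    # eq() marks the matches, cummin() turns everything from the first
    # mismatch onward to False, and sum() counts the remaining True values.
    return int(s.eq(value).cummin().sum())

# Values from the seeded example above.
s = pd.Series([0, 0, 0, 1, 0, 0, 1, 0, 1, 1])

print(count_leading(s, 0))  # 3
print(count_leading(s, 1))  # 0, because the series does not start with 1
```

Unlike the cumsum-based mask, this does not depend on the values being non-negative, because the comparison happens before the cumulative step.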

Upvotes: 3
