Reputation: 302
As in the title. I know how to compute it, but is there a better, faster, or more elegant way to do this? cnt is the result.
import numpy as np
import pandas as pd

s = pd.Series(np.random.randint(2, size=10))
cnt = 0
for n in s:
    if n != 0:
        # stop at the first non-zero value
        break
    else:
        cnt += 1
Upvotes: 1
Views: 830
Reputation: 2663
You can use cumsum() in a mask and then sum() to get the number of initial 0s in the sequence:
s = pd.Series(np.random.randint(2, size=10))
(s.cumsum() == 0).sum()
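Note that this method only works if you want to count leading 0s, and it relies on the values being non-negative (as they are here, 0 or 1). If you want to count the leading run of whatever value the Series starts with, you can generalize it, e.g.: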
(s.sub(s[0]).cumsum() == 0).sum()
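To make the intermediate steps visible, here is a small sketch on fixed input. The values in s and t are made up for illustration, and t.iloc[0] is used in place of s[0] only so the lookup is positional; both assume 0/1 data as in the question.

```python
import pandas as pd

# Hypothetical fixed input so the result is reproducible.
s = pd.Series([0, 0, 0, 1, 0, 1])

running_total = s.cumsum()        # 0, 0, 0, 1, 1, 2
mask = running_total == 0         # True only while no 1 has been seen yet
leading_zeros = int(mask.sum())   # number of True values in the mask

# Generalized form: length of the leading run of the first value.
t = pd.Series([1, 1, 0, 1, 0])
shifted = t.sub(t.iloc[0])        # 0, 0, -1, 0, -1
leading_run = int((shifted.cumsum() == 0).sum())

print(leading_zeros, leading_run)
```

With data that is not limited to 0 and 1, the running sum can return to zero later in the Series, so the mask is only reliable for binary input like this.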
Upvotes: 2
Reputation: 16683
I have done this in a DataFrame as it is easier to show, but you can use the vectorized .cumsum to speed up your code, then use .loc to select the rows where the cumulative sum == 0. Then just find the length with len:
import pandas as pd, numpy as np
s = pd.DataFrame(pd.Series(np.random.randint(2, size=10)))
s['t'] = s[0].cumsum()
o = len(s.loc[s['t']==0])
o
If you assign o to a column with s['o'] = o, then the output looks like this:
   0  t  o
0  0  0  2
1  0  0  2
2  1  1  2
3  1  2  2
4  0  2  2
5  1  3  2
6  1  4  2
7  1  5  2
8  1  6  2
9  0  6  2
Upvotes: 2
Reputation: 71687
Use Series.eq to create a boolean mask, then use Series.cummin to take the cumulative minimum over this mask, and finally use Series.sum to get the total count:
cnt = s.eq(0).cummin().sum()
Example:
np.random.seed(9)
s = pd.Series(np.random.randint(2, size=10))
# print(s)
0 0
1 0
2 0
3 1
4 0
5 0
6 1
7 0
8 1
9 1
dtype: int64
cnt = s.eq(0).cummin().sum()
#print(cnt)
3
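The same idea can be wrapped in a small helper if you need it for values other than 0. This is only a sketch: count_leading and its value parameter are names made up here, not part of pandas.

```python
import pandas as pd


def count_leading(s: pd.Series, value=0) -> int:
    """Count how many elements at the start of s are equal to value."""
    # eq() marks matches; cummin() turns every entry after the first
    # mismatch into False; sum() counts the True entries that remain.
    return int(s.eq(value).cummin().sum())


# Same values as the seeded example above.
s = pd.Series([0, 0, 0, 1, 0, 0, 1, 0, 1, 1])
print(count_leading(s))                              # 3
print(count_leading(pd.Series([2, 2, 3, 2]), 2))     # 2
print(count_leading(pd.Series([1, 0, 0])))           # 0
```

Because the mask is built with an equality test rather than a running sum, this should not depend on the data being limited to 0 and 1.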
Upvotes: 3