Reputation: 7458
I have the following df
,
pct id
0.3 631
0.2 115
0.1 312
0.2 581
0.01 574
0.09 586
I want to first sort the df
by pct
,
df.sort_values(by=['pct'], ascending=False, inplace=True)
then adding up pct
to 0.8 and count how many rows does that, e.g. top 4 rows in this case; I am wondering whats the best way to it. using pd.eval
or pd.query
?
Upvotes: 1
Views: 63
Reputation: 863701
Use cumulative sum by Series.cumsum
like mentioned @Chris A and if need count number of rows use sum
:
out = df['pct'].cumsum().lt(0.8).sum()
print (out)
4
Or filter by boolean indexing
:
df1 = df[df['pct'].cumsum().lt(0.8)]
print (df1)
pct id
0 0.3 631
1 0.2 115
3 0.2 581
2 0.1 312
Upvotes: 3