user12346170

Reputation: 98

Return the first column number that fulfills a condition in pandas

I have a dataset with several columns of cumulative sums. For every row, I want to return the first column number that satisfies a condition.

Toy example:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.array(range(20)).reshape(4,5).T).cumsum(axis=1)

>>> df
   0   1   2   3
0  0   5  15  30
1  1   7  18  34
2  2   9  21  38
3  3  11  24  42
4  4  13  27  46

For instance, I want to return the first column whose value is greater than 20.

Desired output:

3
3
2
2
2

Many thanks as always!

Upvotes: 2

Views: 177

Answers (2)

wwnde

Reputation: 26676

Not as short as @YOBEN_S's answer, but chaining index.get_loc and first_valid_index also works:

df[df>20].apply(lambda x: x.index.get_loc(x.first_valid_index()), axis=1)
0    3
1    3
2    2
3    2
4    2
dtype: int64
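
One caveat with this approach: first_valid_index returns None for a row in which no value passes the condition, and the get_loc lookup would then fail. A minimal sketch of a guard, assuming you want -1 for such rows (the helper name first_match_position and the -1 sentinel are my own choices, not part of pandas):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape(4, 5).T).cumsum(axis=1)

def first_match_position(row):
    # Hypothetical helper: position of the first non-NaN entry, or -1 if the row has none.
    label = row.first_valid_index()
    return -1 if label is None else row.index.get_loc(label)

# With a threshold of 20 every row has a match.
positions_20 = df[df > 20].apply(first_match_position, axis=1)

# With a threshold of 40, rows 0-2 have no match and fall back to -1.
positions_40 = df[df > 40].apply(first_match_position, axis=1)
```

Because get_loc is used, this returns the column position rather than the column label, so it should also work when the columns are named.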

Upvotes: 1

BENY

Reputation: 323376

Try idxmax on the boolean mask; it returns the label of the first True in each row:

df.gt(20).idxmax(1)
Out[66]: 
0    3
1    3
2    2
3    2
4    2
dtype: object
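
Two things to keep in mind: idxmax returns the column label (which only equals the column number here because the columns are 0..3), and a row with no True at all still gets the first column back. A rough sketch that returns positions and marks no-match rows as NaN, assuming that is the behavior you want (first_true_position is a made-up helper name):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape(4, 5).T).cumsum(axis=1)

def first_true_position(mask):
    # Hypothetical helper: integer position of the first True in each row.
    # argmax yields 0 for an all-False row, so those rows are masked out as NaN.
    pos = mask.to_numpy().argmax(axis=1)
    return pd.Series(pos, index=mask.index).where(mask.any(axis=1))

res_20 = first_true_position(df.gt(20))  # every row has a match
res_40 = first_true_position(df.gt(40))  # rows 0-2 have no value above 40
```

For res_40 the first three rows come back as NaN, which means the result is a float Series; use fillna(-1).astype(int) if you would rather have an integer sentinel.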

Upvotes: 3
