Dennis Golomazov

Reputation: 17339

How to get max column with non-zero value in a pandas dataframe

I have a dataframe like this:

           2017      2018      2012  2015  2014  2016
11647  0.044795  0.000000  0.000000   0.0   0.0   0.0
16389  0.089801  0.044900  0.000000   0.0   0.0   0.0
16404  0.014323  0.000000  0.000000   0.0  0.04   0.0
16407  0.052479  0.010442  0.009277   0.0   0.0   0.0
16409  0.000000  0.000000  0.004883   0.0   0.0   5.0

Note that the columns are not sorted. For each row, I need the latest year (the largest column label) that has a non-zero value. The expected result is:

11647    2017
16389    2018
16404    2017
16407    2018
16409    2016

How can I do that?

Upvotes: 0

Views: 1028

Answers (3)

BENY

Reputation: 323316

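Using stack with max. On pandas 2.0 and later, max(level=0) is no longer available, so group by the index level instead (the extra dropna() is harmless and keeps the result the same if stack() leaves NaN rows in place):

df[df.ne(0)].stack().dropna().reset_index(level=1)['level_1'].groupby(level=0).max()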
Out[386]: 
11647    2017
16389    2018
16404    2017
16407    2018
16409    2016
Name: level_1, dtype: int64

Update: a shorter alternative is to multiply the non-zero mask by the column labels and take the row-wise max

df.ne(0).mul(df.columns).max(1)
Out[423]: 
11647    2017.0
16389    2018.0
16404    2017.0
16407    2018.0
16409    2016.0
dtype: float64
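
For anyone who wants to try the masking idea on the sample data, here is a minimal sketch of the same trick done with NumPy broadcasting. The frame is rebuilt from the question, and the variable names (years, mask, latest) are only for illustration. It assumes the column labels are positive integers.

```python
import pandas as pd

# Sample data rebuilt from the question; columns are integer years, unsorted.
df = pd.DataFrame(
    {
        2017: [0.044795, 0.089801, 0.014323, 0.052479, 0.0],
        2018: [0.0, 0.0449, 0.0, 0.010442, 0.0],
        2012: [0.0, 0.0, 0.0, 0.009277, 0.004883],
        2015: [0.0, 0.0, 0.0, 0.0, 0.0],
        2014: [0.0, 0.0, 0.04, 0.0, 0.0],
        2016: [0.0, 0.0, 0.0, 0.0, 5.0],
    },
    index=[11647, 16389, 16404, 16407, 16409],
)

years = df.columns.to_numpy()   # shape (6,)
mask = df.ne(0).to_numpy()      # shape (5, 6), True where the value is non-zero

# True * year keeps the year, False * year gives 0, so the row-wise max
# is the largest year label that has a non-zero value.
latest = pd.Series((mask * years).max(axis=1), index=df.index)
print(latest)
```

Keep in mind that a row containing only zeros comes out as 0 here rather than NaN, so it may need separate handling.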

Upvotes: 1

rafaelc

Reputation: 59274

You can use idxmax after sorting the columns in descending order:

df[sorted(df.columns, reverse=True)].ne(0).idxmax(1)

11647    2017
16389    2018
16404    2017
16407    2018
16409    2016
dtype: object
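
One caveat worth sketching: idxmax returns the first column when a row has no True at all, so an all-zero row would be reported as the largest year. The example below is a hedged variant that blanks such rows out. The extra row 99999 is hypothetical and is not part of the question's data.

```python
import pandas as pd

# Sample data from the question plus a hypothetical all-zero row (99999).
df = pd.DataFrame(
    {
        2017: [0.044795, 0.089801, 0.014323, 0.052479, 0.0, 0.0],
        2018: [0.0, 0.0449, 0.0, 0.010442, 0.0, 0.0],
        2012: [0.0, 0.0, 0.0, 0.009277, 0.004883, 0.0],
        2015: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
        2014: [0.0, 0.0, 0.04, 0.0, 0.0, 0.0],
        2016: [0.0, 0.0, 0.0, 0.0, 5.0, 0.0],
    },
    index=[11647, 16389, 16404, 16407, 16409, 99999],
)

# Sort the columns from latest to earliest, then mark non-zero cells.
mask = df[sorted(df.columns, reverse=True)].ne(0)

# idxmax picks the first True per row, i.e. the latest non-zero year.
# Rows without any True are set to NaN instead of the first column label.
latest = mask.idxmax(axis=1).where(mask.any(axis=1))
print(latest)
```

Because of the NaN, the result is no longer an integer Series; drop or fill the missing rows first if integer years are needed.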

Upvotes: 2

Dennis Golomazov

Reputation: 17339

df.apply(lambda row: row[row > 0].index.max(), axis=1)

gives the expected result.
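
Note that row > 0 and "non-zero" only agree when the data has no negative values. A small sketch of the difference, using a hypothetical row (20000) with a negative entry that is not in the original data:

```python
import pandas as pd

# Hypothetical frame: row 20000 has a negative value in 2018.
df = pd.DataFrame(
    {2017: [0.044795, 0.1], 2018: [0.0, -0.5], 2012: [0.0, 0.0]},
    index=[11647, 20000],
)

# Strictly positive values only (the filter used above).
latest_positive = df.apply(lambda row: row[row > 0].index.max(), axis=1)

# Any non-zero value, including negatives.
latest_nonzero = df.apply(lambda row: row[row != 0].index.max(), axis=1)

print(latest_positive)  # row 20000 -> 2017
print(latest_nonzero)   # row 20000 -> 2018
```

For the data in the question both filters give the same answer. apply with axis=1 runs a Python function per row, so on large frames the vectorized answers are likely to be faster.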

Upvotes: 0
