Get row-index of the last non-NaN value in each column of a pandas data frame

Question

How can I get the row index of the last non-NaN value in each column of a pandas DataFrame, and return those locations as a pandas DataFrame?

EdChum · Accepted Answer

Use notnull together with idxmax to get the index label of the last non-NaN value in each column:

In [22]:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [0, 1, 2, np.nan], 'b': [np.nan, 1, np.nan, 3]})
df
Out[22]:
    a   b
0   0 NaN
1   1   1
2   2 NaN
3 NaN   3
In [29]:

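# Reverse the rows so that idxmax lands on the last True (non-NaN) entry of each column;
# idxmax on the values themselves would return the row of the largest value instead
df.notnull()[::-1].idxmax()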
Out[29]:
a    2
b    3
dtype: int64
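
To make the distinction concrete, here is a minimal sketch on data whose values are not increasing. It assumes the goal is the label of the last non-NaN row, so idxmax is applied to the boolean mask with the rows reversed rather than to the values. The names mask, value_based and last_valid are illustrative only.

```python
import numpy as np
import pandas as pd

# Values are deliberately not increasing, so "largest value" != "last valid value"
df = pd.DataFrame({'a': [3, 1, 2, np.nan], 'b': [np.nan, 1, np.nan, -1]})

# idxmax on the values returns the label of the largest value in each column
value_based = df.idxmax()

# idxmax on the row-reversed boolean mask returns the label of the last non-NaN value
mask = df.notnull()
last_valid = mask[::-1].idxmax()

# A column with no valid values would still report a label, so blank those out
last_valid = last_valid.where(mask.any())
```

On this frame the two results differ, which is why the mask (and the reversal) matters.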

EDIT

As @Caleb correctly pointed out, you can use last_valid_index, which is designed for this:

In [3]:
df = pd.DataFrame({'a': [3, 1, 2, np.nan], 'b': [np.nan, 1, np.nan, -1]})
df

Out[3]:
    a   b
0   3 NaN
1   1   1
2   2 NaN
3 NaN  -1

In [6]:
df.apply(pd.Series.last_valid_index)

Out[6]:
a    2
b    3
dtype: int64
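
Since the question asks for the locations as a DataFrame, the per-column result can be wrapped in one. The following is only a sketch: the column 'c' and the column name 'last_valid_index' are hypothetical additions, included to show that last_valid_index returns None for a column with no valid values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [3, 1, 2, np.nan],
    'b': [np.nan, 1, np.nan, -1],
    'c': [np.nan] * 4,  # hypothetical all-NaN column, not part of the original example
})

# last_valid_index gives the label of the last non-NaN entry, or None if there is none
last_valid = {col: df[col].last_valid_index() for col in df.columns}

# Wrap the labels in a one-column DataFrame indexed by the original column names
result = pd.DataFrame({'last_valid_index': pd.Series(last_valid, dtype='object')})
```

The object dtype keeps the integer labels intact next to the missing entry for 'c'.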
