Cedric Schmith
Cedric Schmith

Reputation: 3

Get Column Names of Values that are not NaN as new column

I have the dataframe below and I want another column that displays the column names of the values that are not NaN. Like this:

New Column
y32
y9
y11
y9

and so on.

          y9        y2       y11       y32
1        NaN       NaN       NaN  0.201229
2   0.053303       NaN       NaN       NaN
4        NaN       NaN  0.442415       NaN
6   0.000529       NaN       NaN       NaN

I tried to do it with for loops but couldn't quite get there. Help is very appreciated.

Upvotes: 0

Views: 1338

Answers (2)

Pablo C
Pablo C

Reputation: 4771

Here are a couple approaches without using pandas.DataFrame.apply:

  1. Using pandas.DataFrame.idxmax with axis = 1 (columns)
>>> df["New Column"] = df.idxmax(1)
  1. Using numpy.where and pandas.DataFrame.notna
>>> df["New Column"] = df.columns[np.where(df.notna())[1]]

In both cases the resulting dataframe is:

         y9  y2       y11       y32 New Column
1       NaN NaN       NaN  0.201229        y32
2  0.053303 NaN       NaN       NaN         y9
4       NaN NaN  0.442415       NaN        y11
6  0.000529 NaN       NaN       NaN         y9

Upvotes: 2

Yevhen Kuzmovych
Yevhen Kuzmovych

Reputation: 12140

You can use idxmax with skipna=True and apply it to each row:

def f(r):
     return r.idxmax(skipna=True)

df['New Column'] = df.apply(f, axis=1)

Or a one-liner:

df['New Column'] = df.apply(lambda r: r.idxmax(skipna=True), axis=1)

UPD:

A better way would be to use Series.first_valid_index():

df['New Column'] = df.apply(pd.Series.first_valid_index, axis=1)

Upvotes: 0

Related Questions