bossangelo

Reputation: 87

How to use values in other columns to fill NaN in one column in pandas

I need to fill the NaN values in one column with values from other columns. For example, I have a df like this:

col1  col2  col3  col4
1     nan   nan   nan
2     3     nan   nan
4     nan   5     nan
6     8     nan   9

I need to turn the df above into

col1  col2  col3  col4
1     nan   nan   1
2     3     nan   2
4     nan   5     4
6     8     nan   9

I want to go through col1 to col3, take the first available value in each row, and use it to replace the NaN in col4. If col4 already has a value in that row, the row should be left unchanged.

I was told that looping over a DataFrame is not ideal. What other options do I have?

Upvotes: 0

Views: 534

Answers (3)

fjsh

Reputation: 13

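You can just use fillna and loop over the column names:

for col_name in df.columns[:-1]:
    # assign back rather than calling fillna(inplace=True) on df['col4'],
    # which newer pandas versions warn about and may not apply to df
    df['col4'] = df['col4'].fillna(df[col_name])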

That will give you:

   col1  col2  col3  col4
0     1   NaN   NaN   1.0
1     2   3.0   NaN   2.0
2     4   NaN   5.0   4.0
3     6   8.0   NaN   9.0
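
For anyone who wants to run this end to end, here is a minimal sketch. The DataFrame construction is my reconstruction of the question's data (the exact dtypes are an assumption), and the column list is spelled out so that col4 is never used to fill itself:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question; exact dtypes are an assumption.
df = pd.DataFrame({
    "col1": [1, 2, 4, 6],
    "col2": [np.nan, 3, np.nan, 8],
    "col3": [np.nan, np.nan, 5, np.nan],
    "col4": [np.nan, np.nan, np.nan, 9],
})

# Fill col4 from col1, then col2, then col3.
# Each pass only touches rows where col4 is still NaN.
for col_name in ["col1", "col2", "col3"]:
    df["col4"] = df["col4"].fillna(df[col_name])

print(df["col4"].tolist())  # [1.0, 2.0, 4.0, 9.0]
```

This loops over three column names, not over rows, so each pass is still a vectorized operation.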

Upvotes: 0

Andy L.

Reputation: 25269

Use bfill and fillna:

df['col4'] = df['col4'].fillna(df.bfill(axis=1)['col1'])

Output:
   col1  col2  col3  col4
0     1   NaN   NaN   1.0
1     2   3.0   NaN   2.0
2     4   NaN   5.0   4.0
3     6   8.0   NaN   9.0
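
A slightly more explicit variant, as a sketch: back-fill only the candidate columns and take the first one. The frame below is a made-up variant of the question's data with a gap in col1 (my assumption, not the asker's data), to show that the first available value is picked even when col1 itself is NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the question's frame with a NaN in col1.
df = pd.DataFrame({
    "col1": [1, np.nan, 4, 6],
    "col2": [np.nan, 3, np.nan, 8],
    "col3": [np.nan, np.nan, 5, np.nan],
    "col4": [np.nan, np.nan, np.nan, 9],
})

# Back-fill along each row, so the leftmost column ends up holding
# the first non-NaN value among col1..col3.
first_valid = df[["col1", "col2", "col3"]].bfill(axis=1).iloc[:, 0]

# Only rows where col4 is NaN are filled; existing values are kept.
df["col4"] = df["col4"].fillna(first_valid)

print(df["col4"].tolist())  # [1.0, 3.0, 4.0, 9.0]
```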

Upvotes: 1

Pygirl

Reputation: 13349

Try:

df.assign(col4 = df.apply(lambda row: row[row.first_valid_index()], axis=1))

Output:

   col1  col2  col3  col4
0   1.0   NaN   NaN   1.0
1   NaN   3.0   NaN   3.0
2   4.0   NaN   5.0   4.0
3   6.0   8.0   NaN   6.0

df.assign(col4 = df.apply(lambda row: row.first_valid_index(), axis=1))

This will give you:

   col1  col2  col3  col4
0   1.0   NaN   NaN  col1
1   NaN   3.0   NaN  col2
2   4.0   NaN   5.0  col1
3   6.0   8.0   NaN  col1

Using this information, you can assign the values.

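Better to use:

import numpy as np

df['col4'] = df.apply(
    # keep an existing col4 value; otherwise take the row's first non-NaN value
    lambda row: row[row.first_valid_index()] if np.isnan(row['col4']) else row['col4'],
    axis=1
)

This gives the desired result, since only the NaN values in col4 are filled: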

   col1  col2  col3  col4
0   1.0   NaN   NaN   1.0
1   NaN   3.0   NaN   3.0
2   4.0   NaN   5.0   4.0
3   6.0   8.0   NaN   9.0
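
One caveat: first_valid_index() returns None for a row with no valid values at all, so indexing the row with it is not safe in that case. A rough sketch of a guard follows; the helper name first_valid_value and the sample frame with an all-NaN row are my own, for illustration only:

```python
import numpy as np
import pandas as pd

def first_valid_value(row):
    """Return the first non-NaN value in the row, or NaN if there is none."""
    idx = row.first_valid_index()
    return row[idx] if idx is not None else np.nan

# Hypothetical frame whose last row has no value in any column.
df = pd.DataFrame({
    "col1": [1, np.nan, np.nan],
    "col2": [np.nan, 3, np.nan],
    "col3": [np.nan, np.nan, np.nan],
    "col4": [np.nan, np.nan, np.nan],
})

# Look only at col1..col3 for candidates, then fill the gaps in col4.
candidates = df[["col1", "col2", "col3"]].apply(first_valid_value, axis=1)
df["col4"] = df["col4"].fillna(candidates)

print(df)
```

The all-NaN row simply stays NaN in col4 instead of raising an error.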

Upvotes: 1
