Reputation: 87
I need to fill NaN values with values from other columns. For example, I have a df like this:
col1, col2, col3, col4
1 nan nan nan
2 3 nan nan
4 nan 5 nan
6 8 nan 9
I need to turn the df above into:
col1, col2, col3, col4
1 nan nan 1
2 3 nan 2
4 nan 5 4
6 8 nan 9
I want to iterate through col1 ~ col3, get the first available value in that row, and use it to replace the NaN in col4. However, if the row already has a value in col4, that row should be left unchanged.
I was told that looping over a DataFrame is not ideal. What other options do I have?
Upvotes: 0
Views: 534
Reputation: 13
You can just use fillna and loop over the column names:
for col_name in df.columns[:-1]:
    # assign the result back; inplace=True on a column selection may not update df in newer pandas
    df['col4'] = df['col4'].fillna(df[col_name])
That will give you:
col1 col2 col3 col4
0 1 NaN NaN 1.0
1 2 3.0 NaN 2.0
2 4 NaN 5.0 4.0
3 6 8.0 NaN 9.0
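A self-contained sketch of the same loop, in case it helps. The sample frame is my own illustrative data (col1 is missing in one row, unlike the question's example), and the result is assigned back rather than filled in place:

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption, not the asker's exact frame):
# row 1 has no value in col1, so the fill has to fall through to col2.
df = pd.DataFrame({
    "col1": [1.0, np.nan, 4.0, 6.0],
    "col2": [np.nan, 3.0, np.nan, 8.0],
    "col3": [np.nan, np.nan, 5.0, np.nan],
    "col4": [np.nan, np.nan, np.nan, 9.0],
})

# Only the column names are looped over, not the rows; each fillna is vectorized.
# Earlier columns win because later passes only touch values that are still NaN.
for col_name in ["col1", "col2", "col3"]:
    df["col4"] = df["col4"].fillna(df[col_name])

print(df)
```

The loop runs once per source column, so the cost grows with the number of columns rather than the number of rows.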
Upvotes: 0
Reputation: 25269
Use bfill and fillna:
df['col4'] = df['col4'].fillna(df.bfill(axis=1)['col1'])
Output:
col1 col2 col3 col4
0 1 NaN NaN 1.0
1 2 3.0 NaN 2.0
2 4 NaN 5.0 4.0
3 6 8.0 NaN 9.0
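For anyone who wants to run this end to end, here is a minimal sketch using the question's data. Restricting the back-fill to col1 ~ col3 and the helper name first_valid are my own choices, not part of the one-liner above:

```python
import numpy as np
import pandas as pd

# The frame from the question, written with floats throughout.
df = pd.DataFrame({
    "col1": [1.0, 2.0, 4.0, 6.0],
    "col2": [np.nan, 3.0, np.nan, 8.0],
    "col3": [np.nan, np.nan, 5.0, np.nan],
    "col4": [np.nan, np.nan, np.nan, 9.0],
})

# Back-fill along each row so the leftmost column ends up holding
# the first non-NaN value among col1..col3 for that row.
first_valid = df[["col1", "col2", "col3"]].bfill(axis=1).iloc[:, 0]

# fillna only touches rows where col4 is NaN, so existing values are kept.
df["col4"] = df["col4"].fillna(first_valid)

print(df)
```

Back-filling only the source columns keeps the intent explicit: col4 is the target and never feeds values back into the lookup.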
Upvotes: 1
Reputation: 13349
Try:
df.assign(col4 = df.apply(lambda row: row[row.first_valid_index()], axis=1))
Output:
col1,col2,col3,col4
0 1.0 NaN NaN 1.0
1 NaN 3.0 NaN 3.0
2 4.0 NaN 5.0 4.0
3 6.0 8.0 NaN 6.0
df.assign(col4 = df.apply(lambda row: row.first_valid_index(), axis=1))
This will give you:
col1,col2,col3,col4
0 1.0 NaN NaN col1,
1 NaN 3.0 NaN col2,
2 4.0 NaN 5.0 col1,
3 6.0 8.0 NaN col1,
Using that information, you can assign the values.
Better to use:
import numpy as np
df['col4'] = df.apply(
    # keep col4 when it already has a value; otherwise take the row's first non-NaN value
    lambda row: row[row.first_valid_index()] if np.isnan(row['col4']) else row['col4'],
    axis=1
)
This gives the desired result (since only the NaN values in col4 need to be filled):
col1,col2,col3,col4
0 1.0 NaN NaN 1.0
1 NaN 3.0 NaN 3.0
2 4.0 NaN 5.0 4.0
3 6.0 8.0 NaN 9.0
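One caveat with the apply approach: first_valid_index() returns None for a row that is entirely NaN, so indexing the row with it would fail. A hedged sketch of a guarded version follows; the function name fill_from_row and the sample data are hypothetical, added only for illustration:

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): the last row has no values at all.
df = pd.DataFrame({
    "col1": [1.0, np.nan, np.nan],
    "col2": [np.nan, 3.0, np.nan],
    "col3": [np.nan, np.nan, np.nan],
    "col4": [np.nan, np.nan, np.nan],
})

def fill_from_row(row):
    """Return col4 if present, else the row's first non-NaN value, else NaN."""
    if pd.notna(row["col4"]):
        return row["col4"]
    idx = row.first_valid_index()  # None when the whole row is NaN
    return row[idx] if idx is not None else np.nan

df["col4"] = df.apply(fill_from_row, axis=1)

print(df)
```

Note that apply with axis=1 still calls the function once per row, so on large frames a column-wise fillna or bfill is likely to be faster.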
Upvotes: 1