Reputation: 2372
Given the following DataFrame:
      A     B
0 -10.0   NaN
1   NaN  20.0
2 -30.0   NaN
I want to merge columns A and B, filling the NaN cells in column A with the values from column B, and then drop column B, resulting in a DataFrame like this:
      A
0 -10.0
1  20.0
2 -30.0
I have managed to solve this problem by using the iterrows() function.
Complete code example:
import numpy as np
import pandas as pd

example_data = [[-10, np.nan], [np.nan, 20], [-30, np.nan]]
example_df = pd.DataFrame(example_data, columns=['A', 'B'])

# Where A is missing, assign B's value to the row yielded by iterrows()
for index, row in example_df.iterrows():
    if pd.isnull(row['A']):
        row['A'] = row['B']

example_df = example_df.drop(columns=['B'])
example_df
This seems to work fine, but I found this warning in the documentation for iterrows():
You should never modify something you are iterating over.
So it seems like I'm doing it wrong.
What would be a better/recommended approach for achieving the same result?
Upvotes: 1
Views: 233
Reputation: 863501
Use Series.fillna with Series.to_frame:
df = df['A'].fillna(df['B']).to_frame()
# alternative
# df = df['A'].combine_first(df['B']).to_frame()
print(df)
A
0 -10.0
1 20.0
2 -30.0
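For reference, here is a self-contained sketch of both variants applied to the data from the question. The names filled and combined are only illustrative, and np.nan is used in place of np.NaN because the latter alias was removed in NumPy 2.0.

```python
import numpy as np
import pandas as pd

# Same data as in the question
example_df = pd.DataFrame({'A': [-10, np.nan, -30],
                           'B': [np.nan, 20, np.nan]})

# Fill the gaps in A with the values from B at the same index;
# to_frame() turns the resulting Series back into a one-column DataFrame
filled = example_df['A'].fillna(example_df['B']).to_frame()

# combine_first does the same job for this data
combined = example_df['A'].combine_first(example_df['B']).to_frame()

print(filled)
print(combined)
```

Both calls return a new object and leave example_df untouched, so there is no need to modify anything while iterating.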
If there are more columns and you need the first non-missing value in each row, back-fill the missing values along the rows and then select the first column with a one-element list, which gives a one-column DataFrame:
df = df.bfill(axis=1).iloc[:, [0]]
print(df)
A
0 -10.0
1 20.0
2 -30.0
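To see the multi-column case in action, here is a small sketch with a hypothetical third column C (not part of the question) and made-up values. It assumes the columns are ordered by priority, since the leftmost non-missing value wins.

```python
import numpy as np
import pandas as pd

# Hypothetical three-column frame; column C and its values are invented for illustration
wide_df = pd.DataFrame({
    'A': [-10, np.nan, np.nan],
    'B': [np.nan, 20, np.nan],
    'C': [np.nan, 25, -30],
})

# bfill(axis=1) fills each missing cell with the next non-missing value to its right,
# so the first column ends up holding the first non-missing value of every row
first_valid = wide_df.bfill(axis=1).iloc[:, [0]]

print(first_valid)
```

Selecting with the list [0] rather than the scalar 0 keeps the result a DataFrame instead of a Series.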
Upvotes: 4