Merging two columns in a pandas DataFrame

Question

Given the following DataFrame:

      A     B
0 -10.0   NaN
1   NaN  20.0
2 -30.0   NaN

I want to merge columns A and B by filling the NaN cells in column A with the values from column B and then dropping column B, resulting in a DataFrame like this:

      A
0 -10.0
1  20.0
2 -30.0

I have managed to solve this problem by using the iterrows() function.

Complete code example:

import numpy as np
import pandas as pd

example_data = [[-10, np.nan], [np.nan, 20], [-30, np.nan]]

example_df = pd.DataFrame(example_data, columns = ['A', 'B'])

for index, row in example_df.iterrows():
    if pd.isnull(row['A']):
        row['A'] = row['B']

example_df = example_df.drop(columns=['B'])

example_df

This seems to work fine, but I found this warning in the documentation for iterrows():

You should never modify something you are iterating over.

So it seems like I'm doing it wrong.

What would be a better/recommended approach for achieving the same result?

jezrael · Accepted Answer

Use Series.fillna with Series.to_frame:

df = df['A'].fillna(df['B']).to_frame()
# alternative
# df = df['A'].combine_first(df['B']).to_frame()
print(df)
      A
0 -10.0
1  20.0
2 -30.0
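
As a self-contained sketch on the data from the question, the two calls above can be run side by side. The names filled, combined and in_place are illustrative only, and the third variant (assign followed by drop) is an addition that assumes you want to keep any other columns the DataFrame might have:

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame({"A": [-10, np.nan, -30], "B": [np.nan, 20, np.nan]})

# Fill the gaps in A with the values from B, then wrap the Series in a DataFrame.
filled = df["A"].fillna(df["B"]).to_frame()

# combine_first keeps the values of A and falls back to B where A is missing.
combined = df["A"].combine_first(df["B"]).to_frame()

# Illustrative variant: overwrite A on a copy and drop B, leaving other columns alone.
in_place = df.assign(A=df["A"].fillna(df["B"])).drop(columns="B")

print(filled)
```

All three produce a single column A with no missing values for this input, and none of them loops over rows.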

If there are more columns and you need the first non-missing value in each row, back-fill the missing values along the columns and then select the first column. Passing a one-element list to iloc keeps the result a one-column DataFrame:

df = df.bfill(axis=1).iloc[:, [0]]
print(df)
      A
0 -10.0
1  20.0
2 -30.0
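
To see the back-fill approach with more than two columns, here is a minimal sketch. The DataFrame wide, its column C and all of its values are hypothetical and chosen only for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical three-column frame; column C and these values are illustrative only.
wide = pd.DataFrame(
    {
        "A": [np.nan, 3.0, np.nan],
        "B": [1.0, np.nan, np.nan],
        "C": [2.0, 4.0, 5.0],
    }
)

# Back-fill along each row (axis=1) so the leftmost cell holds the first
# non-missing value of that row, then keep only that column.
# Selecting with the list [0] returns a DataFrame rather than a Series.
first_valid = wide.bfill(axis=1).iloc[:, [0]]

print(first_valid)
```

The result keeps the name of the first column (A here). A row in which every column is missing would stay NaN, because there is nothing to fill from.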
