bismo

Reputation: 1439

Pandas - How to backfill a main dataframe with values from another while prioritizing the main dataframe

SET UP MY PROBLEM

I have two pandas dataframes. First, I have main:

import pandas as pd
import numpy as np

main = pd.DataFrame({"foo":{"a":1.0,"b":2.0,"c":3.0,"d":np.nan},"bar":{"a":"a","b":"b","c":np.nan,"d":"d"},"baz":{"a":np.nan,"b":"x","c":"y","d":"z"}})


I have a second dataframe, called backfill:

backfill = pd.DataFrame({"foo":{"a":9.0,"b":np.nan,"c":7.0,"d":4.0},"bar":{"a":np.nan,"b":"r","c":"c","d":"s"},"baz":{"a":"w","b":"l","c":"m","d":np.nan},"foobar":{"a":"baz","b":np.nan,"c":np.nan,"d":np.nan}})




WHAT I AM TRYING TO DO

I am trying to backfill missing values and missing columns in main with the values from backfill, while prioritizing the values already present in main. That is, I would like to fill every missing value in main with the corresponding value in backfill AND add any columns that exist only in backfill, together with their values. I have already figured out the first step (see WHAT I HAVE TRIED). Is there a one-line solution for adding the missing columns that works well with it, or perhaps an entirely different approach that is more Pythonic and computationally efficient?



MY DESIRED OUTPUT

Based on my explanation above, this is my desired output:


{"foo":{"a":1,"b":2,"c":3,"d":4},"bar":{"a":"a","b":"b","c":"c","d":"d"},"baz":{"a":"w","b":"x","c":"y","d":"z"},"foobar":{"a":"baz","b":null,"c":null,"d":null}}


As you can see, the missing values and columns in main are filled with the respective values of backfill while maintaining the populated values of main.
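
For reference, the behaviour described above looks close to what pandas' DataFrame.combine_first does: it keeps the caller's non-null values, fills the caller's nulls from the other frame, and returns the union of rows and columns. The following is only a minimal sketch under that assumption, using the two frames from the set-up; the names result and ordered are introduced here for illustration. Because the column order of the combined frame may not match main, the sketch restores the order explicitly.

```python
import numpy as np
import pandas as pd

main = pd.DataFrame({
    "foo": {"a": 1.0, "b": 2.0, "c": 3.0, "d": np.nan},
    "bar": {"a": "a", "b": "b", "c": np.nan, "d": "d"},
    "baz": {"a": np.nan, "b": "x", "c": "y", "d": "z"},
})
backfill = pd.DataFrame({
    "foo": {"a": 9.0, "b": np.nan, "c": 7.0, "d": 4.0},
    "bar": {"a": np.nan, "b": "r", "c": "c", "d": "s"},
    "baz": {"a": "w", "b": "l", "c": "m", "d": np.nan},
    "foobar": {"a": "baz", "b": np.nan, "c": np.nan, "d": np.nan},
})

# Values present in main win; nulls in main are taken from backfill,
# and columns that exist only in backfill are added to the result.
result = main.combine_first(backfill)

# Put main's columns first, followed by the columns unique to backfill.
ordered = list(main.columns) + [c for c in backfill.columns if c not in main.columns]
result = result[ordered]

print(result)
```

Note that combine_first returns a new dataframe and leaves main unchanged, so the result has to be assigned back if main itself should be updated.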



WHAT I HAVE TRIED

The following code seems to satisfy one of the conditions of my problem:

main[main.isnull()] = backfill

{"foo":{"a":1.0,"b":2.0,"c":3.0,"d":4.0},"bar":{"a":"a","b":"b","c":"c","d":"d"},"baz":{"a":"w","b":"x","c":"y","d":"z"}}

But I cannot figure out how to add the foobar column and its values from backfill to main. Of course, I could technically merge it, but I would only want to merge the foobar column, and in the real-world problem I am facing, I may not always know the names of the columns that need to be added to main, which makes merging seem impractical.
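
To illustrate the "unknown column names" part, here is a hedged sketch of one way the attempt above might be extended without naming the extra columns by hand. It swaps the boolean-mask assignment for DataFrame.fillna (which, as far as I know, aligns a dataframe argument on index and columns) so that main is not modified in place, and it discovers the extra columns with Index.difference. The names filled, extra, and result are made up for this example.

```python
import numpy as np
import pandas as pd

main = pd.DataFrame({
    "foo": {"a": 1.0, "b": 2.0, "c": 3.0, "d": np.nan},
    "bar": {"a": "a", "b": "b", "c": np.nan, "d": "d"},
    "baz": {"a": np.nan, "b": "x", "c": "y", "d": "z"},
})
backfill = pd.DataFrame({
    "foo": {"a": 9.0, "b": np.nan, "c": 7.0, "d": 4.0},
    "bar": {"a": np.nan, "b": "r", "c": "c", "d": "s"},
    "baz": {"a": "w", "b": "l", "c": "m", "d": np.nan},
    "foobar": {"a": "baz", "b": np.nan, "c": np.nan, "d": np.nan},
})

# Step 1: fill nulls in the shared columns; values already in main are kept.
filled = main.fillna(backfill)

# Step 2: find the columns that exist only in backfill, whatever they are called.
extra = backfill.columns.difference(main.columns)

# Step 3: attach those columns, aligned on the row index.
result = filled.join(backfill[extra])

print(list(extra))
print(result)
```

This keeps the two steps separate, which may be easier to debug than a single call, at the cost of an extra line or two.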


Thank you for taking a look at my question.

EDIT(s): grammar, formatting
