Best way to impute nulls (np.nan) in Pandas DataFrame with values from another dataframe

Question

Let's say I have my main DataFrame.

df = pd.DataFrame({'ID': [1,1,1,2,2,2,3,3,3], 
'Date': ['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-01', '2021-01-02', '2021-01-03','2021-01-01', '2021-01-02', '2021-01-03'] ,
'Values': [11, np.nan, np.nan, 13, np.nan, np.nan, 15, np.nan, np.nan], 
'Random_Col': [0,0,0,0,0,0,0,0,0]})

I want to fill the np.nan values with values from another dataframe that is not the same shape. The values have to match on "ID" and "Date".

new_df = pd.DataFrame({'ID': [1,1,2,2,3,3], 
'Date': ['2021-01-02', '2021-01-03', '2021-01-02', '2021-01-03','2021-01-02','2021-01-03'],
'Values': [16, 19, 14, 14, 19, 18]})

What's the best way to do this?

I experimented with df.update(), but I'm not that works since the dataframes do not have the same number of rows. Am I wrong about this?

I could also use pd.merge(), but then I end up with multiple versions of each column and have to .fillna() for each specific column with the 2nd column with the new values. This would be fine if I only had 1 column of data to do this for, but I have dozens.

Is there a simpler way that I haven't considered?

Best way to impute nulls (np.nan) in Pandas DataFrame with values from another dataframe

Answers (1)

Related Questions