Iteratively update the values of a dataframe with another one

Question

I have a main df:

print(df)

        item       dt_op
0  product_1  2019-01-08
1  product_2  2019-02-08
2  product_1  2019-01-08
...

and df_1, a subset of df that contains only one product and has two extra columns:

print(df_1)

        item       dt_op  DQN_Pred  DQN_Inv
0  product_1  2019-01-08         6      7.0
2  product_1  2019-01-08         2      2.0
...

I am creating df_1 iteratively in a for loop (that is, df_1 = df.loc[df.item == i] for each i in items).

I would like to merge df_1 into df at every step of the iteration, so that df gains the two extra columns:

print(final_df)

        item       dt_op  DQN_Pred  DQN_Inv
0  product_1  2019-01-08         6      7.0
1  product_2  2019-02-08       NaN      NaN
2  product_1  2019-01-08         2      2.0
...

The NaN values should then be filled in at the second step of the for loop, in which df_1 contains only product_2.

How can I do it?

anky · Accepted Answer

IIUC, you can use combine_first with reindex:

final_df=df_1.combine_first(df).reindex(columns=df_1.columns)

        item      dt_op  DQN_Pred  DQN_Inv
0  product_1 2019-01-08       6.0      7.0
1  product_2 2019-02-08       NaN      NaN
2  product_1 2019-01-08       2.0      2.0
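
For reference, here is a minimal self-contained sketch of the combine_first approach. The sample frames are rebuilt from the values shown in the question; how df_1 actually gets its DQN_Pred and DQN_Inv columns is not shown there, so they are assigned by hand here as an assumption.

```python
import pandas as pd

# Main frame, rebuilt from the rows shown in the question
df = pd.DataFrame({
    "item": ["product_1", "product_2", "product_1"],
    "dt_op": pd.to_datetime(["2019-01-08", "2019-02-08", "2019-01-08"]),
})

# Subset for one product; the extra columns are filled by hand (assumption)
df_1 = df.loc[df["item"] == "product_1"].copy()
df_1["DQN_Pred"] = [6, 2]
df_1["DQN_Inv"] = [7.0, 2.0]

# df_1 values win where present; rows and columns missing from df_1 come from df.
# reindex restores the column order of df_1.
final_df = df_1.combine_first(df).reindex(columns=df_1.columns)
print(final_df)
```

Rows that exist only in df (here index 1) keep their item and dt_op values and get NaN in the two new columns.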

Alternatively, using merge, you can drop the columns the two frames have in common from df_1 and join on the index with left_index=True and right_index=True:

common_keys=df.columns.intersection(df_1.columns).tolist()
final_df=df.merge(df_1.drop(columns=common_keys),left_index=True,right_index=True,how='left')

        item      dt_op  DQN_Pred  DQN_Inv
0  product_1 2019-01-08       6.0      7.0
1  product_2 2019-02-08       NaN      NaN
2  product_1 2019-01-08       2.0      2.0
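
To fill the frame step by step, as the question describes, one option is to carry the result through the loop and apply combine_first at each iteration. The sketch below assumes this setup; add_dqn_columns is a hypothetical stand-in for whatever actually computes DQN_Pred and DQN_Inv, and its values are placeholders.

```python
import pandas as pd

df = pd.DataFrame({
    "item": ["product_1", "product_2", "product_1"],
    "dt_op": pd.to_datetime(["2019-01-08", "2019-02-08", "2019-01-08"]),
})

def add_dqn_columns(subset):
    # Hypothetical helper: replace with the real computation of the two columns.
    out = subset.copy()
    out["DQN_Pred"] = list(range(len(out)))
    out["DQN_Inv"] = [float(n) for n in range(len(out))]
    return out

final_df = df.copy()
for i in df["item"].unique():
    df_1 = add_dqn_columns(df.loc[df["item"] == i])
    # Values from df_1 take priority; rows of other items are kept from final_df,
    # including values filled in by earlier iterations.
    final_df = df_1.combine_first(final_df)

# Restore a fixed column order once the loop is done
final_df = final_df.reindex(columns=["item", "dt_op", "DQN_Pred", "DQN_Inv"])
print(final_df)
```

After the first iteration the product_2 row still holds NaN in the new columns; the second iteration fills it, because the row labels of df_1 line up with those of final_df.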
