Compare 2 uneven size Pandas DataFrames for matching values and combine together, then substitute the NaN value by the closest data

Question

I am new to python and Stackoverflow. I have a problem with handling the data. I have 2 sets of data with different size. df1 has a size of 1000 and df2 has a size of 100000. Here are the samples of df1 and df2.

df1=
        Date                 x     y   
    0   2020-01-01 01:01    1.1   2.4 
    1   2020-01-01 01:05    4.2   5.5  
    2   2020-01-01 01:08    7.3   8.6  

 

df2=
        Date                 x     y
    0   2020-01-01 01:00    NaN   NaN
    1   2020-01-01 01:01    NaN   NaN
    2   2020-01-01 01:02    NaN   NaN
    3   2020-01-01 01:03    NaN   NaN
    4   2020-01-01 01:04    NaN   NaN
    5   2020-01-01 01:05    NaN   NaN
    6   2020-01-01 01:06    NaN   NaN
    7   2020-01-01 01:07    NaN   NaN 
    8   2020-01-01 01:08    NaN   NaN
    9   2020-01-01 01:09    NaN   NaN
   10   2020-01-01 01:10    NaN   NaN

What I would like to do is to combine them together as a new data frame, if df1['Date']=df2['Date'], df3 will be shown as follow.

df3= 
        Date                 x     y
    0   2020-01-01 01:00    NaN   NaN     
    1   2020-01-01 01:01    1.1   2.4 
    2   2020-01-01 01:02    NaN   NaN
    3   2020-01-01 01:03    NaN   NaN
    4   2020-01-01 01:04    NaN   NaN
    5   2020-01-01 01:05    4.2   5.5  
    6   2020-01-01 01:06    NaN   NaN
    7   2020-01-01 01:07    NaN   NaN 
    8   2020-01-01 01:08    7.3   8.6  
    9   2020-01-01 01:09    NaN   NaN
   10   2020-01-01 01:10    NaN   NaN

Then, the NaN values will be equal to the closest value above

df3=
        Date                 x     y
    0   2020-01-01 01:00    NaN   NaN     
    1   2020-01-01 01:01    1.1   2.4 
    2   2020-01-01 01:02    1.1   2.4 
    3   2020-01-01 01:03    1.1   2.4 
    4   2020-01-01 01:04    1.1   2.4 
    5   2020-01-01 01:05    4.2   5.5  
    6   2020-01-01 01:06    4.2   5.5  
    7   2020-01-01 01:07    4.2   5.5  
    8   2020-01-01 01:08    7.3   8.6  
    9   2020-01-01 01:09    7.3   8.6  
   10   2020-01-01 01:10    7.3   8.6

Thx a lot!

Compare 2 uneven size Pandas DataFrames for matching values and combine together, then substitute the NaN value by the closest data

Answers (1)

Related Questions