Merge two Pandas data frames on overlapping segments

Question

I have two pandas DataFrames and need to match rows whose segments (given by the start and end coordinate columns) overlap without crossing boundaries.

For example:

import pandas as pd

df_1 = pd.DataFrame(data={'start': [0, 10, 23, 35], 'end': [5, 17, 28, 41], 'some_data_1': ['AA', 'BB', 'CC', 'DD']})
df_2 = pd.DataFrame(data={'start': [0, 12, 23, 55], 'end': [5, 17, 25, 62], 'some_data_2': ['AA_AA', 'BB_BB', 'CC_CC', 'DD_DD']})

Where

df_1 :
    start   end some_data_1
        0     5          AA
       10    17          BB
       23    28          CC
       35    41          DD

and

df_2 :
    start   end some_data_2
        0     5       AA_AA
       12    17       BB_BB
       23    25       CC_CC
       55    62       DD_DD

and the desired output is:

df_1_2 :
    start_1 end_1   start_2 end_2  some_data_1  some_data_2
          0     5         0     5           AA        AA_AA
         10    17        12    17           BB        BB_BB
         23    28        23    25           CC        CC_CC
         35    41       NaN   NaN           DD          NaN
        NaN   NaN        55    62          NaN        DD_DD

Is there an elegant way to check whether one segment (defined by its start and end) overlaps another and, if it does, to merge the data frames on that condition?

Thanks!
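
For reference, one possible way to express the overlap condition is a cross join followed by a filter. This is only a minimal sketch under some assumptions: segments are treated as closed intervals, any overlap counts as a match (two segments overlap when each starts no later than the other ends), and the helper name merge_overlapping and the temporary id columns _i1 and _i2 are made up for illustration. It relies on how='cross', which needs pandas 1.2 or newer.

```python
import pandas as pd

df_1 = pd.DataFrame({'start': [0, 10, 23, 35], 'end': [5, 17, 28, 41],
                     'some_data_1': ['AA', 'BB', 'CC', 'DD']})
df_2 = pd.DataFrame({'start': [0, 12, 23, 55], 'end': [5, 17, 25, 62],
                     'some_data_2': ['AA_AA', 'BB_BB', 'CC_CC', 'DD_DD']})


def merge_overlapping(left, right):
    """Hypothetical helper: outer-style merge on overlapping [start, end] segments."""
    # Rename the coordinate columns so both sides survive the join,
    # and add temporary row ids to find unmatched rows afterwards.
    a = left.rename(columns={'start': 'start_1', 'end': 'end_1'})
    a = a.assign(_i1=list(range(len(a))))
    b = right.rename(columns={'start': 'start_2', 'end': 'end_2'})
    b = b.assign(_i2=list(range(len(b))))

    # Pair every row of a with every row of b (len(a) * len(b) rows).
    pairs = a.merge(b, how='cross')

    # Assumption: closed intervals overlap when each one starts
    # no later than the other one ends.
    overlap = (pairs['start_1'] <= pairs['end_2']) & (pairs['start_2'] <= pairs['end_1'])
    hits = pairs[overlap]

    # Rows from either side that found no partner.
    a_only = a[~a['_i1'].isin(hits['_i1'])]
    b_only = b[~b['_i2'].isin(hits['_i2'])]

    # Stack matches and leftovers; missing columns are filled with NaN.
    out = pd.concat([hits, a_only, b_only], ignore_index=True)
    return out.drop(columns=['_i1', '_i2'])


df_1_2 = merge_overlapping(df_1, df_2)
df_1_2 = df_1_2[['start_1', 'end_1', 'start_2', 'end_2', 'some_data_1', 'some_data_2']]
print(df_1_2)
```

The cross join builds every pair of rows before filtering, so memory grows with the product of the two frame lengths; that is fine for small frames like the ones above but would probably need a different approach for large ones. The coordinate columns also come back as floats because of the NaN entries.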
