elitebook190
elitebook190

Reputation: 67

Fuzzy Match two dataframe based on list value column

I have two Dataframes that I am trying to merge by match to values in a column. The column to match are header1 in Df1 and header2 in Df2 (the value is list). I am looking for an adapted approach.

Df1:                                          Df2:
|--------------|---------------|              |--------------|------------------------|       
|      id_1    |     header1   |              |      id_2    |    header2             |
|--------------|---------------|              |--------------|------------------------|
|      112     |      item1    |              |      32      |['item1','item2',...]   |
|--------------|---------------|              |--------------|------------------------|
|      56      |      mass     | 
|--------------|---------------|

Wanted results:

Df_merged:                                       
|--------------|---------------|--------------|------------------------|       
|      id_1    |     header1   |      id_2    |    header2             |
|--------------|---------------|--------------|------------------------|
|      112     |      item1    |      32      |['item1','item2',...]   |
|--------------|---------------|--------------|------------------------|

Does anyone know how to merge these two dataframe ?. I have absolutely no idea how to handle this. Thanks a lot in advance.

Upvotes: 0

Views: 111

Answers (1)

jezrael
jezrael

Reputation: 862406

Use DataFrame.explode with reassign header2 to header1 for avoid lost original column header2 and then use DataFrame.merge:

df = df1.merge(df2.assign(header1 = df2['header2']).explode('header1'), on='header1')

Upvotes: 1

Related Questions