Reputation: 81
I have two data sets:
One that has ids that can change (df1):
| many_id | data1 | data2 |
|---------|-------|-------|
| abc     | value | value |
| efg     | value | value |
One that has the unique identifier mapper (df2):
| unique_id | [many_id]       |
|-----------|-----------------|
| 123       | [hij, abc]      |
| 234       | [klm, nop, qrs] |
| 345       | [efg]           |
I want to be able to map many_id to unique_id:
| many_id | data1 | data2 | unique_id |
|---------|-------|-------|-----------|
| abc     | value | value | 123       |
| efg     | value | value | 345       |
Ideally this would be as quick as possible, for example by merging many_id from df1 directly against the [many_id] arrays in df2, if that is even possible.
The method I used was to break many_id down into rows:
| unique_id | many_id |
|-----------|---------|
| 123       | hij     |
| 123       | abc     |
| 234       | klm     |
| 234       | nop     |
| 234       | qrs     |
| 345       | efg     |
And then did a merge from there based on many_id, but I'm not sure that was the most efficient approach, given that it made my dataframe quite a bit larger.
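Roughly, this is a sketch of what I did (assuming the [many_id] column in df2 holds Python lists; the toy data below is just the example from the tables above):
import pandas as pd

# Toy data matching the tables above (assumption: [many_id] holds Python lists)
df1 = pd.DataFrame({'many_id': ['abc', 'efg'],
                    'data1': ['value', 'value'],
                    'data2': ['value', 'value']})
df2 = pd.DataFrame({'unique_id': [123, 234, 345],
                    'many_id': [['hij', 'abc'], ['klm', 'nop', 'qrs'], ['efg']]})

# Break each list out into one (unique_id, many_id) row, then merge
rows = [(uid, mid) for uid, mids in zip(df2['unique_id'], df2['many_id']) for mid in mids]
flat = pd.DataFrame(rows, columns=['unique_id', 'many_id'])
result = df1.merge(flat, on='many_id', how='left')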
Thanks in advance!
Upvotes: 0
Views: 53
Reputation: 323266
IIUC, flatten your df2 into a mapping dataframe (mapdf below), then build a dict and use map:
df1.many_id.map(dict(zip(mapdf['many_id'], mapdf['unique_id'])))
Out[158]:
0    123
1    345
Name: many_id, dtype: int64

# df1['unique_id'] = df1.many_id.map(dict(zip(mapdf['many_id'], mapdf['unique_id'])))
Update: you can use this to build the flattened dataframe you mentioned:
newdf = pd.DataFrame({'unique_id': df2['unique_id'].repeat(df2.many_id.str.len()),
                      'many_id': np.concatenate(df2.many_id.values)})
newdf
Out[174]:
  many_id  unique_id
0     hij        123
0     abc        123
1     klm        234
1     nop        234
1     qrs        234
2     efg        345
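Put together, a minimal end-to-end sketch of this approach (toy data assumed to match the question's tables, with df2.many_id holding Python lists):
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'many_id': ['abc', 'efg'],
                    'data1': ['value', 'value'],
                    'data2': ['value', 'value']})
df2 = pd.DataFrame({'unique_id': [123, 234, 345],
                    'many_id': [['hij', 'abc'], ['klm', 'nop', 'qrs'], ['efg']]})

# Flatten df2: repeat each unique_id once per element of its many_id list
mapdf = pd.DataFrame({'unique_id': df2['unique_id'].repeat(df2.many_id.str.len()),
                      'many_id': np.concatenate(df2.many_id.values)})

# Build the many_id -> unique_id dict and map it onto df1
df1['unique_id'] = df1.many_id.map(dict(zip(mapdf['many_id'], mapdf['unique_id'])))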
Upvotes: 1
Reputation: 2945
Transform your df2 so that it is a table with each many_id on its own row:
# Expand each many_id list into its own column, then stack into long format
d = df2.set_index("unique_id")["many_id"].apply(pd.Series)
many_ids = d.stack().dropna().to_frame("many_id").reset_index()

# Join unique_id onto df1 via the many_id column
df1.join(many_ids.set_index("many_id")["unique_id"], on="many_id")
Result:
  many_id  data1  data2  unique_id
0     abc  value  value        123
1     efg  value  value        345
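As a side note, on pandas 0.25+ the flattening step could also be done with DataFrame.explode instead of apply(pd.Series)/stack; a sketch of that variant, assuming many_id holds Python lists:
# Alternative flattening with explode (pandas >= 0.25), assuming many_id holds lists
many_ids = df2.explode("many_id")[["unique_id", "many_id"]]
df1.join(many_ids.set_index("many_id")["unique_id"], on="many_id")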
Upvotes: 1