Graug

Reputation: 15

"Partial" matching IDs in two dataframes and merging in R

Some of the data I work with has been re-ID'd several times. To work with it efficiently, I need to merge df1 and df2 by their IDs. I have tried several approaches based on separate(), grep(), and fuzzy_join(), but because id2 in df2 can contain several IDs joined by underscores, and so is longer than id1 in df1, I could not get any of them to work.

Representative df1 and df2 are shown below:

View(df1)

      id1   value1
    N12800  19562
    N11901  403
    N14688  100
    N12886B 32
    T00014  14
    T16487  13


View(df2)

    id2                                 value2
    N11959_N11901                       56
    T03938_N16439_T05162_T05141_N14997  654
    N12800                              1234
    N12886B_N12886A                     75
    N14688                              14
    T18332_T16487_T13537_T11268_T09399  61

Can you suggest a solution for this "partial" ID matching problem?

Upvotes: 0

Views: 53

Answers (1)

bcarlsen

Reputation: 1441

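If you tried separate(), you are already familiar with tidyr. Does lengthening df2, so that each component ID gets its own row, give you what you need to perform the join?

library(dplyr)
library(tidyr)

# Split each compound id2 on "_" into a list-column, then
# unnest so that every component ID gets its own row.
unnest(
  mutate(
    df2,
    id2 = strsplit(id2, split = "_")
  ),
  id2
)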
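
For completeness, here is a rough sketch of how the lengthened df2 might then be joined back to df1. It rebuilds the sample data from the question and assumes the IDs are character columns and that "_" only ever appears as a separator. The names df2_long, id2_full and merged are made up for illustration, and separate_rows() is swapped in as a shorthand for the strsplit()/unnest() pair above.

```r
library(dplyr)
library(tidyr)

# Sample data copied from the question
df1 <- tibble(
  id1    = c("N12800", "N11901", "N14688", "N12886B", "T00014", "T16487"),
  value1 = c(19562, 403, 100, 32, 14, 13)
)
df2 <- tibble(
  id2    = c("N11959_N11901",
             "T03938_N16439_T05162_T05141_N14997",
             "N12800",
             "N12886B_N12886A",
             "N14688",
             "T18332_T16487_T13537_T11268_T09399"),
  value2 = c(56, 654, 1234, 75, 14, 61)
)

# Lengthen df2: keep the original compound ID in a hypothetical
# id2_full column, then split id2 so each component has its own row.
df2_long <- df2 %>%
  mutate(id2_full = id2) %>%
  separate_rows(id2, sep = "_")

# Exact-match join on the component IDs. left_join() keeps every
# row of df1; IDs with no match in df2 get NA in the df2 columns.
merged <- left_join(df1, df2_long, by = c("id1" = "id2"))
merged
```

With this sample data, T00014 has no counterpart in df2, so its value2 comes back as NA. Swap in inner_join() if unmatched rows should be dropped instead.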

Upvotes: 1
