Reputation: 151
I have a df as below
df:
Name |Code
-------------------+-----
aman |abc
akash |bcd
rudra |cde
Tushar |def
Kartik |efg
aman,akash |fgh
akash,rudra |ghi
akash,rudra,aman |ijk
aman,Tushar |jkl
Kartik,Tushar |klm
rudra,Kartik,akash |lmn
i want to search code for below df
Name |
----------------+
aman,akash,rudra|
Tushar,aman |
Kartik |
rudra,akash |
to get below result
Name |code
----------------+-----
aman,akash,rudra|ijk
Tushar,aman |jkl
Kartik |efg
rudra,akash |ghi
Please note that combo of "rudra,akash"occur three times, in that case it returns the result alphabatical order of code
let me know if there is some way to achieve this.
Upvotes: 2
Views: 69
Reputation: 887511
We can use cSplit
from splitstackshape
to split the 'Name' column and reshape it to 'long' format ('dfN'), grouped by 'Code', and run-length-id of 'Code', we paste
the 'Name' after sort
ing. Do the same with 'df2', match
the 'Name' columns in both the dataset and get the 'Code' in 'dfN' that corresponds to create a new column 'code' in 'df2'
library(splitstackshape)
dfN <- cSplit(df, "Name", ",", "long")[, .(Name = paste(sort(Name), collapse=",")),
by = .(grp = rleid(Code), Code)]
df2$grp <- seq_len(nrow(df2))
df2$code <- cSplit(df2, "Name", ",", "long")[, .(Name = paste(sort(Name),
collapse=",")), .(grp)][, dfN$Code[match(Name, dfN$Name)]]
df2$grp <- NULL
df2
# Name code
#1 aman,akash,rudra ijk
#2 Tushar,aman jkl
#3 Kartik efg
#4 rudra,akash ghi
Upvotes: 1