Get the unique elements from the data frame by comparing two column values

Question

I have extracted some elements with a regex and combined them. The final df now has two columns. For each row, I need the elements of the f1 column that are not present in the f2 column.


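# Build a 2 x 3 data frame; the columns are named V1, V2, V3 by default
df <- as.data.frame(rbind(c('11061002', '11862192', '11083069'),
                          c(' ', '1234567', '452589')))

# Combine the three columns into one comma-separated string per row
df$f1 <- paste0(df$V1, ',', df$V2, ',', df$V3)

# Values to compare against, one per row
df_1 <- as.data.frame(rbind(c('11862192'),
                            c('145')))
names(df_1)[1] <- 'f2'

# Bind the two data frames and keep only the f1 and f2 columns
df <- as.data.frame(cbind(df, df_1))
df <- df[, c(4, 5)]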

The expected output is a third column with these values: 11061002,11083069 for the first row, because 11862192 is present in both columns, and 1234567,452589 for the second row, because 145 is not present in f1.

Please guide.

Ronak Shah · Accepted Answer

You can split each string in f1 on commas, remove the empty values, and use setdiff to keep the values that are not present in f2.

mapply(function(x, y) toString(setdiff(x[x != ' '], y)),
       strsplit(df$f1, ','), df$f2)
#[1] "11061002, 11083069" "1234567, 452589"
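
To get the result as a third column, as the question asks, the vector returned by mapply can be assigned back to the data frame. The following is a minimal, self-contained sketch; the column name f3 is an assumption (the question does not name the new column), and the data frame is rebuilt directly with data.frame rather than with the question's rbind/cbind steps.

```r
# Rebuild the question's two-column data frame.
# stringsAsFactors = FALSE keeps both columns as character on R < 4.0 too.
df <- data.frame(
  f1 = c("11061002,11862192,11083069", " ,1234567,452589"),
  f2 = c("11862192", "145"),
  stringsAsFactors = FALSE
)

# For each row: split f1 on commas, drop the blank entry,
# and keep only the values that differ from f2.
# "f3" is a hypothetical name for the new column.
df$f3 <- mapply(
  function(x, y) toString(setdiff(x[x != " "], y)),
  strsplit(df$f1, ","),
  df$f2
)

print(df$f3)
# [1] "11061002, 11083069" "1234567, 452589"
```

Note that toString joins the values with a comma followed by a space; if the exact format 11061002,11083069 is needed, paste(..., collapse = ",") could be used in its place.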

If there could be multiple comma-separated values in f2, we can split f2 as well.

mapply(function(x, y) toString(setdiff(x[x != ' '], y)),
       strsplit(df$f1, ','), strsplit(df$f2, ','))
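
As a rough illustration of the multi-value case, here is a small sketch. The data frame df2 and its f2 values are made up for this example and are not from the question.

```r
# Hypothetical data: f2 now holds several comma-separated values per row.
df2 <- data.frame(
  f1 = c("11061002,11862192,11083069", " ,1234567,452589"),
  f2 = c("11862192,11083069", "145,452589"),
  stringsAsFactors = FALSE
)

# Split both columns, then remove every f2 value from the f1 values row by row.
res <- mapply(
  function(x, y) toString(setdiff(x[x != " "], y)),
  strsplit(df2$f1, ","),
  strsplit(df2$f2, ",")
)

print(res)
# [1] "11061002" "1234567"
```

This assumes the values in both columns are separated by a bare comma with no surrounding spaces; if spaces can follow the commas, applying trimws to the split values before setdiff is one option.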
