Reputation: 97
I have extracted some elements using regex and combined them. The final data frame, df, has two columns, f1 and f2. For each row, I need the elements of f1 that are not present in f2.
df <- as.data.frame(rbind(c('11061002','11862192','11083069'),
c(" ",'1234567','452589')))
df$f1 <-paste0(df$V1,
',',
df$V2,
',',
df$V3)
df_1 <- as.data.frame(rbind(c('11862192'),
c('145')))
names(df_1)[1] <-'f2'
df <- as.data.frame(cbind(df,df_1))
df <-df[,c(4,5)]
The expected output is a third column with these values: 11061002,11083069 for the first row, because 11862192 is present in both f1 and f2; and 1234567,452589 for the second row, because 145 does not appear in f1, so nothing is removed apart from the blank entry.
Please guide.
Upvotes: 0
Views: 36
Reputation: 887951
We can use the tidyverse (dplyr and tidyr):
library(dplyr)
library(tidyr)
df %>%
mutate(rn = row_number()) %>%
separate_rows(f1, f2) %>%
group_by(rn)%>%
summarise(new = toString(setdiff(setdiff(f1, f2), ""))) %>%
select(-rn) %>%
bind_cols(df, .)
# A tibble: 2 x 6
# V1 V2 V3 f1 f2 new
# <chr> <chr> <chr> <chr> <chr> <chr>
#1 "11061002" 11862192 11083069 "11061002,11862192,11083069" 11862192 11061002, 11083069
#2 " " 1234567 452589 " ,1234567,452589" 145 1234567, 452589
Upvotes: 0
Reputation: 389325
You can split the string in f1 on "," and use setdiff to get the values that are not present in f2, after removing empty values.
mapply(function(x, y) toString(setdiff(x[x!=' '], y)),
strsplit(df$f1, ','), df$f2)
#[1] "11061002, 11083069" "1234567, 452589"
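To see what each step contributes, here is a minimal sketch that walks through the second row by hand. The variable names (f1_row, f2_row, parts, kept, result) are only for illustration and are not part of the original code.

```r
# Second row of the sample data, written out as plain strings.
f1_row <- " ,1234567,452589"
f2_row <- "145"

# Step 1: split f1 on commas; strsplit returns a list, so take element 1.
parts <- strsplit(f1_row, ",")[[1]]

# Step 2: drop the blank entry left over from the empty first field.
parts <- parts[parts != " "]

# Step 3: keep only the values that do not occur in f2.
kept <- setdiff(parts, f2_row)

# Step 4: collapse back into a single comma-separated string.
result <- toString(kept)
result
```

Note that setdiff also drops duplicate values within f1, which may or may not be what you want.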
If there could be multiple comma-separated values in f2, we can split f2 as well.
mapply(function(x, y) toString(setdiff(x[x!=' '], y)),
strsplit(df$f1, ','), strsplit(df$f2, ','))
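As a rough illustration of why splitting f2 matters, the sketch below uses made-up data in which the first row's f2 holds two values. The data frame df2 and the helper drop_shared are hypothetical names introduced here, not something from the question.

```r
# Hypothetical data: the first row's f2 contains two comma-separated values.
df2 <- data.frame(
  f1 = c("11061002,11862192,11083069", " ,1234567,452589"),
  f2 = c("11862192,11083069", "145"),
  stringsAsFactors = FALSE
)

# Same logic as above, wrapped in a named helper for reuse.
drop_shared <- function(x, y) toString(setdiff(x[x != " "], y))

# Without splitting f2, the string "11862192,11083069" is compared as a
# single value, so nothing is removed from the first row.
unsplit <- mapply(drop_shared, strsplit(df2$f1, ","), df2$f2)

# Splitting f2 as well compares the rows value by value.
split_both <- mapply(drop_shared, strsplit(df2$f1, ","), strsplit(df2$f2, ","))

unsplit
split_both
```

The result can be attached as a new column with df2$new <- split_both.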
Upvotes: 3