Do setdiff pairwise on two columns that are lists of character vectors

Question

I have a data frame in which column 1 is a list of character vectors, and column 2 is a similar list. I would like to define a column 3 that holds, for each row, the elements of the character vector in column 1 that are not present in the vector in column 2.

Like this:

c1           c2           new.c
c('a', 'b')  c('b', 'd')  'a'

I tried mapping setdiff:

my.tibble = tibble(
  c1 = list(c('a', 'b'), c('b', 'c')),
  c2 = list(c('d', 'e'), c('e', 'f'))
)

my.tibble = my.tibble %>% 
  mutate(
    new.c = map(c1, ~ setdiff(., c2))
  )

my.tibble$new.c

It copies c1 intact.

If I do it rowwise, it looks like it runs setdiff for every value in the vectors in c2.

my.tibble = my.tibble %>% 
  rowwise() %>%
  mutate(
    new.c = map(c1, ~ setdiff(., c2))
  )

my.tibble$new.c

I suspect I'm being screwed over by list structure, but I'm not sure how.

Ronak Shah · Accepted Answer

This is not the best example, since there is no overlap between c1 and c2, but I guess you are looking for map2, which iterates over both columns in parallel.

library(dplyr)
library(purrr)

my.tibble %>% mutate(new.c = map2(c1, c2, setdiff))
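
To make the effect visible, here is a minimal sketch on data where the two columns do overlap. The tibble `overlap.tibble` and its values are made up for illustration and are not from the question.

```r
library(dplyr)
library(purrr)

# Hypothetical data: each c1 vector shares one element with its c2 vector
overlap.tibble <- tibble(
  c1 = list(c('a', 'b'), c('b', 'c')),
  c2 = list(c('b', 'd'), c('c', 'f'))
)

# map2 walks c1 and c2 together, so setdiff sees one vector from each column
result <- overlap.tibble %>%
  mutate(new.c = map2(c1, c2, setdiff))

result$new.c
# [[1]]
# [1] "a"
#
# [[2]]
# [1] "b"
```

The new column is again a list column, with one character vector per row.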

Or Map in base R.

my.tibble$new.c <- Map(setdiff, my.tibble$c1, my.tibble$c2)
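
The sketch below, in base R only, contrasts one-column iteration with pairwise iteration. It likely explains why the original attempt returned c1 intact: inside `map(c1, ~ setdiff(., c2))`, `c2` is the whole list column rather than the matching element. The data here is hypothetical, chosen so that the columns overlap.

```r
# Hypothetical columns with overlap (not the question's data)
c1 <- list(c('a', 'b'), c('b', 'c'))
c2 <- list(c('b', 'd'), c('c', 'f'))

# One-column iteration, the base R analogue of map(c1, ~ setdiff(., c2)):
# every vector in c1 is compared against the entire list c2
one.sided <- lapply(c1, function(x) setdiff(x, c2))

# Pairwise iteration: c1[[i]] is compared against c2[[i]]
pairwise <- Map(setdiff, c1, c2)

pairwise
# [[1]]
# [1] "a"
#
# [[2]]
# [1] "b"
```

Printing `one.sided` next to `pairwise` shows whether the one-sided call removed anything from each vector.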
