I've got a list of 3 lists categorizing things into fruits, vehicles and flowers.

category <- structure(
  list(
    fruits = c("apple", "banana", "pear", "lemon", "kiwi", "orange"),
    vehicles = c("car", "bike", "motorbike", "train", "plane"),
    flowers <- list("rose", "tulip", "sunflower")
  ),
  .Names = c("fruits", "vehicles", "flowers")
)

Then I've got a dataframe with 2 vectors containing the elements from the lists. Vector a can have any number of objects per cell, vector b just has one element per cell.

a <- I(list(c("apple", "car"),
            c("motorbike", "banana", "tulip"),
            c("rose", "kiwi", "apple"),
            c("bike", "sunflower", "lemon"),
            c("orange"),
            c("tulip", "pear")))
b <- c("motorbike", "pear", "sunflower", "orange", "car", "apple")
funnydata <- data.frame(a, b)

I want to create a third vector which gives the element(s) in vector a that are in the same list/category as the element in vector b. So the desired result would be

             a         b      c
1   apple, car motorbike    car
2 motorbik....      pear banana
3 rose, ki.... sunflower   rose
4 bike, su....    orange  lemon
5       orange       car     NA
6  tulip, pear     apple   pear

I manage to get the element in vector a that's in a specific list as long as I leave the list fixed:

funnydata$c <- sapply(funnydata$a, function(x) intersect(fruits, unlist(x))) # fixed list
funnydata$c
[[1]]
[1] "apple"

[[2]]
[1] "banana"

[[3]]
[1] "apple" "kiwi"

[[4]]
[1] "lemon"

[[5]]
[1] "orange"

[[6]]
[1] "pear"

I can also specify the list b is in:

sapply(funnydata$b, function(y) names(category[grep(y, category)]))
[1] "vehicles" "fruits"   "flowers"  "fruits"   "vehicles" "fruits"

But I'm stuck at combining the two. I get all character(0) if I try

funnydata$c <- sapply(funnydata$a, function(x) intersect(sapply(funnydata$b, function(y) category[grep(y, category)]), unlist(x)))

Can somebody help?

Edit

I noticed a mistake in the original posting: The objects in category are all supposed to be of the same type (vector or list, whichever fits the needs better). So it should be:

category <- structure(
  list(
    fruits = c("apple", "banana", "pear", "lemon", "kiwi", "orange"),
    vehicles = c("car", "bike", "motorbike", "train", "plane"),
    flowers = c("rose", "tulip", "sunflower")
  ),
  .Names = c("fruits", "vehicles", "flowers")
)

Don't know if that changes anything for the existing answers. I'm still trying to wrap my mind around them. I'm sorry if this copy-and-paste error made things more complicated than they had to be.

Find element in vector a that's in the same list as element in vector b

Reputation: 5263

Most problems concerning data.frames with list columns can be solved by converting those list columns into "flat" vectors.

So we'll convert the two original objects (the category list and the funnydata data.frame) into longer, flat data.frames:

category_df <- data.frame(
  group  = rep(names(category), times = lengths(category)),
  member = unlist(category)
)

category_df
#              group    member
# fruits1     fruits     apple
# fruits2     fruits    banana
# fruits3     fruits      pear
# fruits4     fruits     lemon
# fruits5     fruits      kiwi
# fruits6     fruits    orange
# vehicles1 vehicles       car
# vehicles2 vehicles      bike
# vehicles3 vehicles motorbike
# vehicles4 vehicles     train
# vehicles5 vehicles     plane
# flowers1   flowers      rose
# flowers2   flowers     tulip
# flowers3   flowers sunflower

funnydata[["index"]] <- seq_len(nrow(funnydata))
funny_flat <- data.frame(
  a     = unlist(funnydata[["a"]]),
  b     = rep(funnydata[["b"]], times = lengths(funnydata[["a"]])),
  index = rep(funnydata[["index"]], times = lengths(funnydata[["a"]]))
)

funny_flat
#            a         b index
# 1      apple motorbike     1
# 2        car motorbike     1
# 3  motorbike      pear     2
# 4     banana      pear     2
# 5      tulip      pear     2
# 6       rose sunflower     3
# 7       kiwi sunflower     3
# 8      apple sunflower     3
# 9       bike    orange     4
# 10 sunflower    orange     4
# 11     lemon    orange     4
# 12    orange       car     5
# 13     tulip     apple     6
# 14      pear     apple     6

I also added an index, so we know which values came from which original rows. From here it takes just a couple of simple merges, with some renaming.

funny_flat <- merge(funny_flat, category_df, by.x = "a", by.y = "member")
names(funny_flat)[names(funny_flat) == "group"] <- "group_a"

funny_flat <- merge(funny_flat, category_df, by.x = "b", by.y = "member")
names(funny_flat)[names(funny_flat) == "group"] <- "group_b"

funny_flat
#            b         a index  group_a  group_b
# 1      apple      pear     6   fruits   fruits
# 2      apple     tulip     6  flowers   fruits
# 3        car    orange     5   fruits vehicles
# 4  motorbike     apple     1   fruits vehicles
# 5  motorbike       car     1 vehicles vehicles
# 6     orange      bike     4 vehicles   fruits
# 7     orange     lemon     4   fruits   fruits
# 8     orange sunflower     4  flowers   fruits
# 9       pear motorbike     2 vehicles   fruits
# 10      pear    banana     2   fruits   fruits
# 11      pear     tulip     2  flowers   fruits
# 12 sunflower     apple     3   fruits  flowers
# 13 sunflower      rose     3  flowers  flowers
# 14 sunflower      kiwi     3   fruits  flowers
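
To see what each merge above contributes, here is a minimal sketch on a two-row toy frame. The names left and lookup are illustrative stand-ins for funny_flat and category_df, not objects from the answer. The key column has a different name on each side, so by.x and by.y name them separately; the result keeps the by.x name and gains the group column. Renaming group right away matters because a second merge against the same lookup would otherwise produce two group columns with .x/.y suffixes.

```r
# Toy stand-ins (hypothetical names) for funny_flat and category_df
left <- data.frame(
  a     = c("car", "apple"),
  index = c(1L, 1L),
  stringsAsFactors = FALSE
)
lookup <- data.frame(
  group  = c("fruits", "vehicles"),
  member = c("apple", "car"),
  stringsAsFactors = FALSE
)

# Join on left$a == lookup$member; the key column keeps the name "a"
merged <- merge(left, lookup, by.x = "a", by.y = "member")

# Rename so that a later merge can add its own "group" column cleanly
names(merged)[names(merged) == "group"] <- "group_a"

merged
```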

Now, we'll code your original goal: finding values for which a and b share a category. c will be the value from a, so that's also just a renaming.

funny_matching <- funny_flat[funny_flat[["group_a"]] == funny_flat[["group_b"]], ]
names(funny_matching)[names(funny_matching) == "a"] <- "c"
funny_matching
#            b      c index  group_a  group_b
# 1      apple   pear     6   fruits   fruits
# 5  motorbike    car     1 vehicles vehicles
# 7     orange  lemon     4   fruits   fruits
# 10      pear banana     2   fruits   fruits
# 13 sunflower   rose     3  flowers  flowers

Again, a merge, using the index from before.

merge(
  funnydata,
  funny_matching[, c("c", "index")],
  by = "index",
  all.x = TRUE
)
#   index            a         b      c
# 1     1   apple, car motorbike    car
# 2     2 motorbik....      pear banana
# 3     3 rose, ki.... sunflower   rose
# 4     4 bike, su....    orange  lemon
# 5     5       orange       car   <NA>
# 6     6  tulip, pear     apple   pear
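
As a cross-check on the merge-based result, the same matching can be sketched with a named lookup vector and mapply. This is not the approach used in this answer, just an alternative under stated assumptions: it uses the corrected category from the question's edit, it assumes every element of a and b appears in exactly one category, and the names lookup and c_list are made up for illustration.

```r
category <- list(
  fruits   = c("apple", "banana", "pear", "lemon", "kiwi", "orange"),
  vehicles = c("car", "bike", "motorbike", "train", "plane"),
  flowers  = c("rose", "tulip", "sunflower")
)
a <- list(c("apple", "car"), c("motorbike", "banana", "tulip"),
          c("rose", "kiwi", "apple"), c("bike", "sunflower", "lemon"),
          "orange", c("tulip", "pear"))
b <- c("motorbike", "pear", "sunflower", "orange", "car", "apple")

# Named lookup vector: member name -> category name (hypothetical helper)
lookup <- setNames(
  rep(names(category), times = lengths(category)),
  unlist(category, use.names = FALSE)
)

# For each row, keep the elements of a whose category equals the category of b
c_list <- mapply(
  function(x, y) x[lookup[x] == lookup[[y]]],
  a, b,
  SIMPLIFY = FALSE, USE.NAMES = FALSE
)

c_list
```

In this sketch a row without a match (row 5) yields character(0) rather than NA, and a row with several matches would keep all of them in one list element instead of being duplicated as it would be by the final merge.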

Upvotes: 2

akrun

Reputation: 886978

We can do this with joins:

library(tidyverse)
dat <-  rownames_to_column(funnydata, 'rn')
catdat <- stack(category)  
dat %>% 
   unnest %>% 
   left_join(catdat, by = c(a = "values")) %>%
   left_join(catdat, by = c(b = "values")) %>%
   filter(ind.x == ind.y) %>% 
   select(rn, c=a) %>% 
   right_join(dat) %>%
   select(names(funnydata), c)
#            a         b      c
#1   apple, car motorbike    car
#2 motorbik....      pear banana
#3 rose, ki.... sunflower   rose
#4 bike, su....    orange  lemon
#5       orange       car   <NA>
#6  tulip, pear     apple   pear
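
The join keys in the pipeline above come from stack(), which is base R and turns a named list into a two-column data.frame: values holds the members and ind holds the name of the list element each one came from. A minimal sketch, using the corrected category from the question's edit, shows the shape; joining against it twice is what produces the ind.x and ind.y columns that the filter step compares.

```r
category <- list(
  fruits   = c("apple", "banana", "pear", "lemon", "kiwi", "orange"),
  vehicles = c("car", "bike", "motorbike", "train", "plane"),
  flowers  = c("rose", "tulip", "sunflower")
)

# One row per member; "ind" is a factor of the originating list names
catdat <- stack(category)

str(catdat)
head(catdat, 3)
```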

Upvotes: 2
