ayeh

Reputation: 68

sapply over vector for each element in a list

I have a large list of terms extracted from a corpus:

    mylist <- list(c("flower"), 
               c("plant", "animal", "cats", "doggy"),
               c("tree", "trees", "cat", "dog"))

The extracted terms come from a data frame of main words, similar words, and categories:

    ref <- data.frame(id = c(1:5), 
                      main = c("tree", "plant", "flower", "dog", "cat"), 
                      similar = c("trees","plantlike", "flowery", "doggy", "cats"),
                      category = c("plant", "plant", "plant", "animal", "animal"))

I need to change the list so that it holds categories instead of the words, and ideally remove duplicates, like this:

    needed <- list("plant",
                   c("plant", "animal", "animal", "animal"),
                   c("plant", "plant", "animal", "animal"))
    
    orbetter <- list("plant",
                   c("plant", "animal"),
                   c("plant", "animal"))

But I don't know how to use sapply on each element of the list. I appreciate your help.

Upvotes: 0

Views: 44

Answers (1)

Yuriy Saraykin

Reputation: 8880

    mylist <- list(c("flower"), 
                   c("plant", "animal", "cats", "doggy"),
                   c("tree", "trees", "cat", "dog"))

    ref <- data.frame(id = c(1:5), 
                      main = c("tree", "plant", "flower", "dog", "cat"), 
                      similar = c("trees","plantlike", "flowery", "doggy", "cats"),
                      category = c("plant", "plant", "plant", "animal", "animal"))

    library(tidyr)

    # Stack the main and similar columns into a single lookup column (value)
    ref_long <- ref %>% 
      pivot_longer(-c(id, category))

    # For each list element, look up the category of every word and drop duplicates
    lapply(mylist, function(x) unique(ref_long$category[match(x, table = ref_long$value)]))
    #> [[1]]
    #> [1] "plant"
    #> 
    #> [[2]]
    #> [1] "plant"  NA       "animal"
    #> 
    #> [[3]]
    #> [1] "plant"  "animal"

Created on 2022-01-14 by the reprex package (v2.0.1)
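
The NA in the second element comes from "animal", which is a category name rather than a main or similar word, so match() finds nothing for it. A minimal base R sketch of one way to handle that, assuming a word that is already a category should map to itself and that unmatched words should be dropped (both are assumptions, and the names lookup and to_categories are hypothetical helpers, not part of the code above):

```r
mylist <- list(c("flower"),
               c("plant", "animal", "cats", "doggy"),
               c("tree", "trees", "cat", "dog"))

ref <- data.frame(id = 1:5,
                  main = c("tree", "plant", "flower", "dog", "cat"),
                  similar = c("trees", "plantlike", "flowery", "doggy", "cats"),
                  category = c("plant", "plant", "plant", "animal", "animal"),
                  stringsAsFactors = FALSE)

# Named lookup vector: names are the words, values are their categories.
# Assumption: category names are added as keys that map to themselves.
lookup <- c(setNames(ref$category, ref$main),
            setNames(ref$category, ref$similar),
            setNames(unique(ref$category), unique(ref$category)))

# Hypothetical helper: translate one character vector of words to categories.
to_categories <- function(words) {
  cats <- unname(lookup[words])  # words with no entry in lookup become NA
  unique(cats[!is.na(cats)])     # assumption: drop unmatched words, then dedupe
}

# lapply always returns a list, one element per element of mylist
result <- lapply(mylist, to_categories)
```

With the sample data, result should match the orbetter list from the question. Removing the unique() call gives the needed version with duplicates kept.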

Upvotes: 1
