ctde

Reputation: 91

How to vectorize a loop around two lapply calls over a list

I have a list with two elements. The first element is a character vector; the second is a list whose i-th item corresponds to the i-th item of the first element. An example makes this clearer:

I have a list:

mylist  = list(c("w","x","y","z"),
             list(
                 c("a", "b"),
                 c("c"),
                 c("a", "c", "b"),
                 c("d","a","e"))
            )

This is what I want:

 want = list(list("a",c("w","y","z")),
             list("b",c("w","y")),
             list("c",c("x","y")),
             list("d",c("z")),
             list("e",c("z"))
            )

This brute-force loop gives the correct result:

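library(stringi)  # provides stri_detect_fixed()

codes.vec = unique(unlist(mylist[[2]]))

new.list = vector("list", length(codes.vec))
for (i in seq_along(codes.vec)) {
   # keep the block groups whose code vector contains codes.vec[i]
   hit = unlist(lapply(mylist[[2]], function(x) any(stri_detect_fixed(x, codes.vec[i]))))
   new.list[[i]] = list(codes.vec[i], mylist[[1]][hit])
}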

How can I vectorize this to avoid the loop? I tried wrapping it in another *apply() function, but I kept getting errors about FUN.VALUE. On my real data the loop takes about 15 minutes, which is impractical.

Note: in case it helps, my application is Census data. I know all the ZCTA5 codes (zip codes) that each block group falls into, in whole or in part, and I want to invert that mapping to get the block groups within each ZCTA5. In other words, I want to go from knowing that block group "w" is in ZCTA5 codes "a" and "b" to knowing that ZCTA5 code "a" contains block groups "w", "y", and "z", and ZCTA5 code "b" contains block groups "w" and "y".

All help is appreciated, thank you!

Edit: renamed variables, since the original naming was poor.

Upvotes: 0

Views: 52

Answers (3)

Onyambu

Reputation: 79208

You can do this without the apply family at all, using stack() and unstack():

unstack(stack(setNames(mylist[[2]], mylist[[1]])),ind~values)
$a
[1] "w" "y" "z"

$b
[1] "w" "y"

$c
[1] "x" "y"

$d
[1] "z"

$e
[1] "z"
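
To see why this works, it may help to look at the intermediate long table that stack() builds, with one row per (code, block group) pair. The following is a minimal sketch, assuming the mylist from the question; the names named, long, by_code, and by_code2 are illustrative only and not part of the answer. The last statement is an equivalent base R formulation with split(), rep(), and lengths() that skips the data frame.

```r
mylist <- list(c("w", "x", "y", "z"),
               list(c("a", "b"), c("c"), c("a", "c", "b"), c("d", "a", "e")))

# Name each code vector by its block group, then stack into a long table:
# column `values` holds the codes, column `ind` (a factor) holds the block groups.
named <- setNames(mylist[[2]], mylist[[1]])
long <- stack(named)

# Split the block groups by code, as in the one-liner above.
by_code <- unstack(long, ind ~ values)

# Equivalent without a data frame: repeat each block group once per code it
# belongs to, then split by the flattened codes.
by_code2 <- split(rep(mylist[[1]], lengths(mylist[[2]])), unlist(mylist[[2]]))
```

Both forms do a single pass over the flattened data instead of scanning every block group once per code, which is likely where the loop in the question spends its time.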

Upvotes: 2

akrun

Reputation: 887048

We could use transpose from purrr

library(stringr)
library(dplyr)
library(tidyr)
library(purrr)
# // transpose the list
purrr::transpose(mylist) %>% 
      # // loop over the list elements and do a crossing
      map_dfr(~ crossing(!!! .) %>%
      # // set the names of the columns
      set_names(str_c('a', seq_along(.)))) %>%
      # // split the first column by the second
      with(., split(a1, a2))
#$a
#[1] "w" "y" "z"

#$b
#[1] "w" "y"

#$c
#[1] "x" "y"

#$d
#[1] "z"

#$e
#[1] "z"

Upvotes: 2

Allan Cameron

Reputation: 173793

Here's a solution using Map:

df <- do.call(rbind, Map(expand.grid, mylist[[1]], mylist[[2]]))
lapply(split(df, df$Var2), function(x) as.character(x[[1]]))
#> $a
#> [1] "w" "y" "z"
#> 
#> $b
#> [1] "w" "y"
#> 
#> $c
#> [1] "x" "y"
#> 
#> $d
#> [1] "z"
#> 
#> $e
#> [1] "z"
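
Note that this returns a named list keyed by code, whereas the want object in the question is an unnamed list of list(code, block_groups) pairs. If that exact shape matters, the conversion could look like the sketch below. It assumes the mylist from the question and rebuilds the named list with the Map/expand.grid code above; res and as_pairs are hypothetical names introduced here.

```r
mylist <- list(c("w", "x", "y", "z"),
               list(c("a", "b"), c("c"), c("a", "c", "b"), c("d", "a", "e")))

# Named list keyed by code, built as in the answer above.
df <- do.call(rbind, Map(expand.grid, mylist[[1]], mylist[[2]]))
res <- lapply(split(df, df$Var2), function(x) as.character(x[[1]]))

# Pair each code with its block groups; lapply() over the bare names
# yields an unnamed outer list, matching the shape of `want`.
as_pairs <- lapply(names(res), function(code) list(code, res[[code]]))
```

Keeping the named list is usually more convenient for lookups such as res[["a"]]; the paired form is only needed if downstream code expects it.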

Upvotes: 2
