Reputation: 91
I have a list containing two elements. The first element is a vector and the second element is a list. Each item of the second element corresponds, by position, to the item at the same position in the first element. It's easiest to see with an example:
I have a list:
mylist = list(c("w","x","y","z"),
list(
c("a", "b"),
c("c"),
c("a", "c", "b"),
c("d","a","e"))
)
This is what I want:
want = list(list("a",c("w","y","z")),
list("b",c("w","y")),
list("c",c("x","y")),
list("d",c("z")),
list("e",c("z"))
)
This is how I correctly brute-forced the solution:
library(stringi)  # provides stri_detect_fixed()
codes.vec = unique(unlist(mylist[[2]]))
new.list = list()
for(i in seq_along(codes.vec)){
new.list[[i]] = list(codes.vec[i],
mylist[[1]][as.logical(unlist(lapply(lapply(mylist[[2]],function(x)
stri_detect_fixed(x,codes.vec[i])),sum)))]
)
}
How can I vectorize this to avoid the loop? I tried wrapping it in another *apply() function, but I kept getting errors about FUN.VALUE. On my real data the loop takes about 15 minutes, which is impractical.
Note: if it helps, my application is Census data. I know all the ZCTA5 codes (zip codes) that each block group falls into (in whole or in part). I want to flip this around and get the block groups within each ZCTA5. Put simply, I want to go from knowing that block group "w" is in ZCTA5 codes "a" and "b" to knowing that ZCTA5 code "a" contains block groups "w", "y", and "z", and ZCTA5 code "b" contains block groups "w" and "y".
All help is appreciated, thank you!
edit: variable naming was poor
Upvotes: 0
Views: 52
Reputation: 79208
You don't need the apply family for this. stack() and unstack() do the job:
unstack(stack(setNames(mylist[[2]], mylist[[1]])),ind~values)
$a
[1] "w" "y" "z"
$b
[1] "w" "y"
$c
[1] "x" "y"
$d
[1] "z"
$e
[1] "z"
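To see why the one-liner works, it may help to split it into steps and look at the intermediate long table. The sketch below uses the `mylist` from the question; the names `named`, `long`, `inverted`, and `as_pairs` are only illustrative. The last step is an assumption about how one might get back to the `list(code, block_groups)` shape shown under "This is what I want".

```r
mylist <- list(c("w", "x", "y", "z"),
               list(c("a", "b"),
                    c("c"),
                    c("a", "c", "b"),
                    c("d", "a", "e")))

# Name each vector of codes after the block group it belongs to
named <- setNames(mylist[[2]], mylist[[1]])

# stack() flattens the named list into a long data frame:
# 'values' holds the codes, 'ind' holds the block group each code came from
long <- stack(named)

# unstack() with 'ind ~ values' collects the block groups (ind) per code (values)
inverted <- unstack(long, ind ~ values)

# Optional: reshape into unnamed list(code, block_groups) pairs
as_pairs <- unname(Map(list, names(inverted), inverted))
```

Note that unstack() returns a data frame rather than a list when every group happens to have the same length, so code that consumes the result may need to allow for both.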
Upvotes: 2
Reputation: 887048
We could use transpose from purrr:
library(stringr)
library(dplyr)
library(tidyr)
library(purrr)
# // transpose the list
purrr::transpose(mylist) %>%
# // loop over the list elements and do a crossing
map_dfr(~ crossing(!!! .) %>%
# // set the names of the columns
set_names(str_c('a', seq_along(.)))) %>%
# // split the first column by the second
with(., split(a1, a2))
#$a
#[1] "w" "y" "z"
#$b
#[1] "w" "y"
#$c
#[1] "x" "y"
#$d
#[1] "z"
#$e
#[1] "z"
Upvotes: 2
Reputation: 173793
Here's a solution using Map:
df <- do.call(rbind, Map(expand.grid, mylist[[1]], mylist[[2]]))
lapply(split(df, df$Var2), function(x) as.character(x[[1]]))
#> $a
#> [1] "w" "y" "z"
#>
#> $b
#> [1] "w" "y"
#>
#> $c
#> [1] "x" "y"
#>
#> $d
#> [1] "z"
#>
#> $e
#> [1] "z"
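The long table of (block group, code) pairs can also be built without a per-element expand.grid call. The following is a minimal sketch of that variation, not part of the answer above: it repeats each block group once per code with rep() and lengths(), then splits. The variable names are hypothetical, and whether it is faster on the real Census data is an assumption that would need to be measured.

```r
mylist <- list(c("w", "x", "y", "z"),
               list(c("a", "b"),
                    c("c"),
                    c("a", "c", "b"),
                    c("d", "a", "e")))

block_groups   <- mylist[[1]]
codes_by_group <- mylist[[2]]

# Repeat each block group once for every code it falls into
group_long <- rep(block_groups, lengths(codes_by_group))

# Flatten the codes so they line up with group_long, element for element
code_long <- unlist(codes_by_group)

# Group the block groups by code
by_code <- split(group_long, code_long)
```

Here `by_code$a` holds the block groups for code "a", and so on for each code that occurs in the data.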
Upvotes: 2