Samuel Perche

Reputation: 39

How can I subset an array using the logical row of a data.table?

I would like to do something like this basically:

library(data.table)

all_factors <- c('f1', 'f2', 'f3', 'f4', 'f5', 'f6')
factor_perms <- do.call(CJ, replicate(length(all_factors), c(TRUE, FALSE), simplify = FALSE))
for (j in 2:nrow(factor_perms)) {
  # this is the line that raises the error quoted below
  factors <- all_factors[factor_perms[j, ]]
}

I get:

Error in all_factors[factor_perms[j, ]] : invalid subscript type 'list'

How can I convert the row to a plain logical vector, i.e. drop the column names of the data.table?

Upvotes: 1

Views: 40

Answers (2)

r2evans

Reputation: 160437

If you are interested, you can remove the for loop and do it in one step.

wh <- which(as.matrix(factor_perms), arr.ind = TRUE)
factors <- split(all_factors[wh[,2]], wh[,1])

head(factors)
# $`2`
# [1] "f6"
# $`3`
# [1] "f5"
# $`4`
# [1] "f5" "f6"
# $`5`
# [1] "f4"
# $`6`
# [1] "f4" "f6"
# $`7`
# [1] "f4" "f5"

Note that while wh is column-first,

head(wh)
#      row col
# [1,]  33   1
# [2,]  34   1
# [3,]  35   1
# [4,]  36   1
# [5,]  37   1
# [6,]  38   1

the split step groups on the distinct row values and returns the groups in sorted order, so the result is ordered by row.
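
To make the column-first versus by-row point concrete, here is a minimal sketch on a toy matrix. The matrix `m` and the vector `labels` are made up for illustration and are not part of the question's data.

```r
# Toy logical matrix (hypothetical data): 3 rows, 2 columns, filled column by column
m <- matrix(c(FALSE, TRUE, TRUE,
              TRUE, FALSE, TRUE), nrow = 3)
labels <- c("a", "b")

# which() walks the matrix column by column, so all hits in column 1 come first
wh <- which(m, arr.ind = TRUE)

# split() groups the labels by row number and returns the groups in row order
res <- split(labels[wh[, "col"]], wh[, "row"])
res
```

Here `wh` lists rows 2 and 3 (column 1) before rows 1 and 3 (column 2), yet `res` comes back with elements named `1`, `2`, `3`, in that order.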

The value of this depends on your needs. Depending on your comfort with R, one version might be easier to read (and therefore maintain) than the other. With this data, the loop-free version is roughly 60x as fast as the for loop. Granted, profiling code that takes microseconds is a foolhardy (inefficient) way to spend your time, but if your data is much larger, this might be an advantage.

  expression          min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory      time     gc      
  <bch:expr>     <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list>      <list>   <list>  
1 r2              116.3us  125.8us     6670.   22.26KB     4.44  3003     2      450ms <NULL> <Rprofmem[~ <bch:tm~ <tibble~
2 ThomasIsCoding   8.46ms   8.95ms      109.    1.02MB     2.09    52     1      478ms <NULL> <Rprofmem[~ <bch:tm~ <tibble~
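
The `result` column above is `<NULL>`, so the benchmark itself does not show that the two versions agree. A minimal sketch of such a check follows. The wrapper names `loop_version` and `split_version` are hypothetical and introduced only for this comparison.

```r
library(data.table)

all_factors <- c("f1", "f2", "f3", "f4", "f5", "f6")
factor_perms <- do.call(CJ, replicate(length(all_factors), c(TRUE, FALSE), simplify = FALSE))

# Loop version: one row at a time, skipping row 1 (all FALSE)
loop_version <- function() {
  out <- vector(mode = "list", nrow(factor_perms) - 1)
  for (j in 2:nrow(factor_perms)) {
    out[[j - 1]] <- all_factors[unlist(factor_perms[j, ])]
  }
  out
}

# Loop-free version: locate every TRUE cell, then group the labels by row
split_version <- function() {
  wh <- which(as.matrix(factor_perms), arr.ind = TRUE)
  split(all_factors[wh[, 2]], wh[, 1])
}

# split() names its elements by row number; drop the names before comparing
same <- identical(loop_version(), unname(split_version()))
same
```

If the `bench` package is installed, the same two calls could be passed to `bench::mark()`, which by default also checks that the results are equal.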

Upvotes: 2

ThomasIsCoding

Reputation: 101257

Since factor_perms is a data.table, factor_perms[j, ] is a one-row data.table (a list), so you need unlist to turn it into a logical vector, e.g.,

all_factors <- c("f1", "f2", "f3", "f4", "f5", "f6")
factor_perms <- do.call(CJ, replicate(length(all_factors), c(T, F), FALSE))
factors <- vector(mode = "list", nrow(factor_perms) - 1)
for (j in 2:nrow(factor_perms)) {
  factors[[j - 1]] <- all_factors[unlist(factor_perms[j, ])]
}

such that

> head(factors)
[[1]]
[1] "f6"

[[2]]
[1] "f5"

[[3]]
[1] "f5" "f6"

[[4]]
[1] "f4"

[[5]]
[1] "f4" "f6"

[[6]]
[1] "f4" "f5"
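
As a small illustration of why `unlist` is needed, the sketch below inspects a single row. The names `row2`, `msg`, `idx` and `picked` are introduced here for illustration only, and `use.names = FALSE` is an optional extra that drops the column names the question asked about.

```r
library(data.table)

all_factors <- c("f1", "f2", "f3", "f4", "f5", "f6")
factor_perms <- do.call(CJ, replicate(length(all_factors), c(TRUE, FALSE), simplify = FALSE))

# A one-row data.table is still a list of columns, not a logical vector
row2 <- factor_perms[2, ]
is.list(row2)

# Indexing with that list reproduces the error message from the question
msg <- tryCatch(all_factors[row2], error = function(e) conditionMessage(e))
msg

# Flatten to a bare logical vector; use.names = FALSE drops the column names
idx <- unlist(row2, use.names = FALSE)
picked <- all_factors[idx]
picked  # "f6", matching the first element of head(factors) above
```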

Upvotes: 2
