shasha
shasha

Reputation: 19

How could I calculate an expression across the slices of a 3D-array?

I have calculated this hc index for every 24 values of a 24x24 square matrix (a). It returns a vector (hc) with 24 values.

hc<-vector("numeric",24L)
for (i in 1:24) {
  hc[i]<- -sum((c(a[,i])/colSums(a)[i])*log((c(a[,i])/colSums(a)[i]), base = 2))
  }

I want to calculate this for each of the 1000 matrices of an array and don't know exactly how to proceed (a function?, another nested "for" statement?...). And get 1000, 24length vectors. One for each matrix in the array. I would really appreciate any help. Thanks!

Upvotes: 0

Views: 96

Answers (2)

Mikko Marttila
Mikko Marttila

Reputation: 11898

Here's an alternative approach, if you actually have an array:

# Simplified version of your function
f <- function(x) -sum(x / sum(x) * log2(x / sum(x)))

set.seed(1)
n <- 1000

matrices <- replicate(n, matrix(runif(24 ^ 2), nrow = 24))
str(matrices)
#>  num [1:24, 1:24, 1:1000] 0.266 0.372 0.573 0.908 0.202 ...

result <- apply(matrices, c(2, 3), f)
str(result)
#>  num [1:24, 1:1000] 4.36 4.36 4.37 4.36 4.34 ...

If your matrices are in a list:

matrix_list <- lapply(seq_len(n), function(i) matrices[, , i])
list_result <- lapply(matrix_list, apply, 2, f)
str(head(list_result))
#> List of 6
#>  $ : num [1:24] 4.36 4.36 4.37 4.36 4.34 ...
#>  $ : num [1:24] 4.43 4.32 4.31 4.4 4.37 ...
#>  $ : num [1:24] 4.26 4.21 4.31 4.24 4.24 ...
#>  $ : num [1:24] 4.31 4.36 4.27 4.32 4.35 ...
#>  $ : num [1:24] 4.39 4.27 4.35 4.29 4.22 ...
#>  $ : num [1:24] 4.25 4.42 4.19 4.32 4.33 ...

Created on 2018-03-05 by the reprex package (v0.2.0).

Upvotes: 1

alan ocallaghan
alan ocallaghan

Reputation: 3038

Firstly, your code is very inefficient. You are calculating sum of every column in the matrix twice in every iteration. It would be better to calculate all of these at the beginning, or simply to calculate a sum when needed.

The following is more "R-like" code that should also be faster:

mat <- matrix(runif(100, 1, 100), nrow = 24, ncol = 24)


mat_res <- sapply(seq_len(ncol(mat)), 
    function(i, mat) {
        col <- mat[, i]
        s <- sum(col)
        sc <- col / s
        l <- log2(col) / s
        - sum(sc * l)
    }, 
    mat = mat)

Now, to apply this to all 1000 matrices? First, make a list of 1000 matrices:

list_of_mats <- lapply(1:1000, 
    function(i) matrix(runif(100, 1, 100), nrow = 24, ncol = 24))

Then, apply a function over each of these matrices:

lapply(list_of_mats, 
    function(mat) {
        ## Note that this process is the same as above!
        sapply(seq_len(ncol(mat)), 
            function(i, mat) {
                col <- mat[, i]
                s <- sum(col)
                sc <- col / s
                l <- log2(col) / s
                - sum(sc * l)
            }, 
            mat = mat)
    })

Upvotes: 0

Related Questions