Gabelins

Reputation: 285

Improving R script efficiency

I'm trying to write an R script that works in two steps. In the first step it computes dist() and some further values for each row of an input matrix. In the second step it takes each pair of the matrices produced in step one and uses them in another calculation. My problem is that I don't know how to keep all the matrices produced in step one. Can someone suggest a good strategy?

My code looks like this:

n<- nrow (aa)
output <- matrix (0, n, n)
for (i in 1:n)
{
    for (j in i:n)
    {
        akl<- function (dii){
            ddi<- as.matrix (dii)
            m<- rowMeans(ddi)
            M<- mean(ddi)
            r<- sweep (ddi, 1, m)
            b<- sweep (r, 2, m)
            return (b + M)  
            }
        A<- akl(dist(aa[i,]))
        B<- akl(dist(aa[j,]))
        V <- sqrt ((sqrt (mean(A * A))) * (sqrt(mean(B * B))))
        # The else must follow the closing brace of the if block
        if (V > 0) {
            output[i,j] <- (sqrt(mean(A * B))) / V
        } else {
            output[i,j] <- 0
        }
    }
}   

I would like to keep all the matrices returned by the akl function and then use them for the rest of the calculation. The script shown here is too expensive in terms of time because it recomputes akl for every pair of rows, which becomes a problem for a large input matrix.

Upvotes: 0

Views: 207

Answers (2)

Dean MacGregor

Reputation: 18416

Now that you've made the improvements to your code, look into the compiler package. By calling enableJIT(3) from that package you MAY shave some time off a script that does a lot of looping.
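As a rough illustration of that suggestion, the sketch below turns on the JIT with enableJIT() and byte-compiles one function with cmpfun(), both from the compiler package that ships with R. The function sum_squares is a hypothetical stand-in for loop-heavy code and is not part of the question. Note that R 3.4.0 and later already enable the JIT at level 3 by default, so on a recent R the gain may be small or nil.

```r
library(compiler)

# A loop-heavy toy function, hypothetical and only for illustration.
sum_squares <- function(x) {
    total <- 0
    for (v in x) total <- total + v * v
    total
}

# enableJIT() returns the previous JIT level, so it can be restored later.
old_level <- enableJIT(3)

# cmpfun() byte-compiles a single function explicitly.
sum_squares_c <- cmpfun(sum_squares)

# The compiled version must give the same result as the original.
x <- as.numeric(1:1000)
stopifnot(sum_squares_c(x) == sum_squares(x))

# Restore whatever JIT level was active before.
enableJIT(old_level)
```

Timing both versions with system.time() on the real script is the only way to tell whether this helps in a given case.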

Upvotes: 0

Spacedman

Reputation: 94182

You don't need to recompute A inside the j loop; A depends only on i, so compute it once per i, outside the inner loop.

Also, you don't need to redefine the function inside the loop every time (assuming it doesn't depend on anything inside the loop).

n<- nrow (aa)
output <- matrix (0, n, n)
akl<- function (dii){
            ddi<- as.matrix (dii)
            m<- rowMeans(ddi)
            M<- mean(ddi)
            r<- sweep (ddi, 1, m)
            b<- sweep (r, 2, m)
            return (b + M)  
            }
for (i in 1:n)
{
    A<- akl(dist(aa[i,]))
    for (j in i:n)
    {
        B<- akl(dist(aa[j,]))
        V <- sqrt ((sqrt (mean(A * A))) * (sqrt(mean(B * B))))
        # The else must follow the closing brace of the if block
        if (V > 0) {
            output[i,j] <- (sqrt(mean(A * B))) / V
        } else {
            output[i,j] <- 0
        }
    }
}   

Try that, run your tests (you have written tests, right?) and see.
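Going one step further, the question asks how to keep every matrix from step one. One possible approach, sketched here and not benchmarked, is to store them in a list built with lapply(), so akl runs n times instead of once per pair. The names centred and self and the random aa are assumptions made for illustration only. The list holds n matrices of size p x p (p being the number of columns of aa), so this trades memory for time.

```r
# Hypothetical input for illustration; replace with the real matrix aa.
set.seed(1)
aa <- matrix(rnorm(5 * 8), nrow = 5)

# Double-centre a distance matrix (same computation as akl above).
akl <- function(dii) {
    ddi <- as.matrix(dii)
    m <- rowMeans(ddi)
    M <- mean(ddi)
    r <- sweep(ddi, 1, m)
    b <- sweep(r, 2, m)
    b + M
}

n <- nrow(aa)

# Step one: compute and keep one centred matrix per row of aa.
centred <- lapply(seq_len(n), function(i) akl(dist(aa[i, ])))

# sqrt(mean(A * A)) depends on a single row, so compute it once per row.
self <- vapply(centred, function(A) sqrt(mean(A * A)), numeric(1))

# Step two: combine the stored matrices pairwise (upper triangle only).
output <- matrix(0, n, n)
for (i in seq_len(n)) {
    for (j in i:n) {
        V <- sqrt(self[i] * self[j])
        if (V > 0) {
            output[i, j] <- sqrt(mean(centred[[i]] * centred[[j]])) / V
        }
    }
}
```

Because output starts as a matrix of zeros, the else branch is not needed here; entries with V equal to 0 simply stay at 0.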

Upvotes: 3
