Mr.Anugar

Reputation: 1

Multiplying without a for loop in R

I am working on a project and I need my function to be as fast as possible since I have millions of datapoints slowing down my calculations. I feel my problem is quite simple, but I haven't been able to find an efficient solution so far. A simplified version of the problem is the following. Consider that you have the following values:

values_test <- c(-8, -9, 2, 1, 1, 8, 6, 2)

And you have data of the form

c1<-c(1,2,3,4,85,6,9,3,7,7,8,9,7,9,5)
c2<-c(4,6,7,6,3,7,21,79,45,63,4,9,5,7,2)
c3<-c(8,9,21,4,9,6,5,6,3,7,12,7,3,6,7)
c4<-c(11,7,2,9,8,7,6,1,7,9,1,4,8,3,0)
c5<-c(18,2,42,47,1,7,5,5,7,9,11,96,34,63,71)
data<-cbind(c1,c2,c3,c4,c5)

where every column of the data is a variable and every row is a person. In this example I have 5 variables, but in practice the number of variables n can be any positive integer, since it changes with the size of my table.

I need to multiply each value in values_test by its corresponding column of my data object. The length of values_test changes as well, since it depends on the number of variables available. For instance, suppose I need to multiply the 5th element of values_test by the 4th column, and the 6th element of values_test by the 5th column. I could do this manually with code like

value_person <- values_test[5] * data[,4]+values_test[6]*data[,5]

Although that seems easy, it does not work for me because I do not know in advance how many variables I will have. For instance, if a table includes one more covariate, my data matrix will have six columns instead of five, and values_test will have one more value. Then I would need to do

value_person <- values_test[5] * data[,4]+values_test[6]*data[,5]+values_test[7]*data[,6]

and the sum should include more and more terms as the variables in a given table increase. Is there a way to do such an operation without using a for loop?

I thought for instance of something such as

n_col <- ncol(data)
number_variables <- n_col - 3  # the operation does not include the first 3 variables
value_person <- rowSums(values_test[(4 + 1):(4 + number_variables)] * data[, 4:n_col])

Sadly this does not work because the values from values_test alternate within a column (the first row of a column is multiplied by 1 and the second row of the same column by 8), whereas each column should use a single fixed value: in the previous example, always 1 for data column 4 and always 8 for data column 5.
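
The alternating pattern described here comes from R recycling a vector down the columns of a matrix (column-major order). A minimal sketch of that behaviour, using a hypothetical toy matrix `m` and weight vector `w` that are not part of the data above:

```r
# Hypothetical toy inputs, used only to illustrate recycling
m <- matrix(1, nrow = 3, ncol = 2)   # 3 rows, 2 columns, all ones
w <- c(1, 8)                         # one weight per column

# Element-wise product: w is recycled in column-major order,
# so the two weights alternate down each column
recycled <- w * m

# Transposing first lines each weight up with one column;
# transposing back restores the original shape
per_column <- t(t(m) * w)
```

Here `recycled` mixes both weights inside every column, while `per_column` has the first column scaled by 1 and the second by 8 throughout, so `rowSums(per_column)` would give the intended per-person sum.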

I do want to avoid having a for loop.

Any help is appreciated!

Upvotes: 0

Views: 76

Answers (2)

ThomasIsCoding

Reputation: 101753

Try tcrossprod

> c(tcrossprod(values_test[5:(ncol(data) + 1)], data[, -(1:3)]))
 [1] 155  23 338 385  16  63  46  41  63  81  89 772 280 507 568
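
As a sanity check, the tcrossprod result can be compared with the hand-written sum from the question. This is only a sketch: it rebuilds the example data from the question, and the names `manual` and `via_tcrossprod` are introduced here for illustration.

```r
# Example inputs copied from the question
values_test <- c(-8, -9, 2, 1, 1, 8, 6, 2)
data <- cbind(
  c1 = c(1, 2, 3, 4, 85, 6, 9, 3, 7, 7, 8, 9, 7, 9, 5),
  c2 = c(4, 6, 7, 6, 3, 7, 21, 79, 45, 63, 4, 9, 5, 7, 2),
  c3 = c(8, 9, 21, 4, 9, 6, 5, 6, 3, 7, 12, 7, 3, 6, 7),
  c4 = c(11, 7, 2, 9, 8, 7, 6, 1, 7, 9, 1, 4, 8, 3, 0),
  c5 = c(18, 2, 42, 47, 1, 7, 5, 5, 7, 9, 11, 96, 34, 63, 71)
)

# Hand-written version from the question (fixed to five columns)
manual <- values_test[5] * data[, 4] + values_test[6] * data[, 5]

# Matrix-product version: weights 5..(ncol + 1) against columns 4..ncol
via_tcrossprod <- c(tcrossprod(values_test[5:(ncol(data) + 1)], data[, -(1:3)]))

# Both approaches should agree for every person (row)
stopifnot(isTRUE(all.equal(unname(manual), via_tcrossprod)))
```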

Upvotes: 2

Onyambu

Reputation: 79238

m <- seq(4, ncol(data))
data[,m] %*% values_test[m + 1]

      [,1]
 [1,]  155
 [2,]   23
 [3,]  338
 [4,]  385
 [5,]   16
 [6,]   63
 [7,]   46
 [8,]   41
 [9,]   63
[10,]   81
[11,]   89
[12,]  772
[13,]  280
[14,]  507
[15,]  568

colSums(t(data[, m]) * values_test[m + 1])
 [1] 155  23 338 385  16  63  46  41  63  81  89 772 280 507 568
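
If the same calculation has to run on tables of different widths, the matrix product can be wrapped in a small function. The sketch below is one possible way to do that; the helper name `weighted_row_sum`, its `skip` argument, and the toy inputs `x` and `w` are all hypothetical, and it assumes that column j is always paired with element j + 1 of the weights, as in the question.

```r
# Hypothetical helper: ignore the first `skip` columns and pair
# column j of `data` with weights[j + 1], as in the question
weighted_row_sum <- function(data, weights, skip = 3) {
  stopifnot(ncol(data) > skip, length(weights) >= ncol(data) + 1)
  cols <- seq(skip + 1, ncol(data))
  # drop = FALSE keeps a matrix even when only one column remains;
  # drop() turns the one-column result into a plain vector
  drop(data[, cols, drop = FALSE] %*% weights[cols + 1])
}

# Toy inputs (not the question's data): 3 people, 5 variables
x <- cbind(1:3, 4:6, 7:9, c(1, 0, 2), c(2, 1, 0))
w <- c(0, 0, 0, 0, 10, 100, 5)

result <- weighted_row_sum(x, w)
```

With these toy inputs, column 4 is weighted by 10 and column 5 by 100, so `result` holds one weighted sum per row without any explicit loop.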

Upvotes: 1
