Reputation: 13
Any suggestions, programmatic or mathematical, for speeding up this calculation in R? I have included generated data that closely match the real data I am working with. I have also tried apply and parApply, and tried turning the matrix into a sparse matrix since it has so many 0's, but so far this is the fastest method I have come up with. Any suggestions for making it faster? I need to do this calculation tens of thousands of times.
Data that closely match my scenario:
set.seed(7)
# same size matrix as my real-data puzzle
A <- matrix(rbeta((13163*13163),1,1), ncol = 13163, nrow = 13163)
# turn a bunch to 0 to more closely match that I have a lot of 0's in real data
A[A < 0.5] <- 0
# create binary matrix
z <- matrix(rbinom((13163*13163), 1, 0.25), ncol = 13163, nrow = 13163)
I have found that Rfast::rowsums gives me the quickest results.
start1 <- Sys.time()
testA <- 1 - exp(Rfast::rowsums(log(1-A*z)))
stop1 <- Sys.time()
stop1 - start1
Pardon my clunky benchmarking approach...
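A slightly less clunky timing harness, as a sketch: base R's system.time() wraps the expression and reports user/system/elapsed in one call, so there is no manual Sys.time() bookkeeping. This uses base rowSums() in place of Rfast::rowsums() so it runs without extra packages, and a smaller matrix (n = 2000 rather than 13163) so the demo finishes quickly.

```r
set.seed(7)
n <- 2000  # smaller than the real 13163 so the demo runs quickly

# generate data the same way as above, just smaller
A <- matrix(rbeta(n * n, 1, 1), ncol = n, nrow = n)
A[A < 0.5] <- 0                                       # sparsify
z <- matrix(rbinom(n * n, 1, 0.25), ncol = n, nrow = n)

# system.time() times the whole expression and prints user/system/elapsed
system.time({
  testA <- 1 - exp(rowSums(log(1 - A * z)))
})
```

For repeated measurements, packages such as microbenchmark or bench average over many runs and are more reliable than a single timing.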
Upvotes: 1
Views: 95
Reputation: 11738
You can get rid of exp() and log():
testB <- 1 - Rfast::rowprods(1-A*z)
This is 8 times faster.
Yet, as you multiply many numbers between 0 and 1, the row products underflow to 0 everywhere, so the output vector is all 1s.
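A small sketch to see both points: on a matrix with few columns the two forms agree, while at the full 13163-column size the row products (and exp() of the row sums of logs) underflow to 0, collapsing both outputs to all 1s. Base rowSums() and apply(..., prod) stand in for the Rfast calls so this runs without extra packages.

```r
set.seed(7)
n <- 50  # few columns: well away from underflow

A <- matrix(rbeta(n * n, 1, 1), ncol = n, nrow = n)
A[A < 0.5] <- 0
z <- matrix(rbinom(n * n, 1, 0.25), ncol = n, nrow = n)

# log/exp form vs direct product (base R stand-ins for Rfast::rowsums/rowprods)
viaLog  <- 1 - exp(rowSums(log(1 - A * z)))
viaProd <- 1 - apply(1 - A * z, 1, prod)

all.equal(viaLog, viaProd)  # the two forms are numerically equal here

# With thousands of columns, each row multiplies hundreds of factors <= 0.5,
# so the product underflows to 0 and both forms return exactly 1.
# The row sums of logs themselves stay finite, but exp() of them does not.
```

If the downstream computation only needs the log-scale quantity, keeping rowSums(log(1 - A * z)) without exponentiating avoids the underflow entirely.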
Upvotes: 2