Reputation: 1798
I have a question about how to speed up a computation in R. Here is the code example:
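n <- 1000
A <- matrix(rnorm(n), 1, n)   # 1 x n row vector
T <- diag(n)                  # n x n identity matrix (note: masks the T shorthand for TRUE)
Rprof()                       # start profiling
for (i in 1:100) {
  A <- apply(T, 1, function(row) max(A * row)) + matrix(rnorm(n), 1, n)
}
Rprof(NULL)                   # stop profiling
summaryRprof()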
The profile is as follows:
$by.self
                self.time self.pct total.time total.pct
"apply"              2.16    37.50       5.74     99.65
"*"                  2.06    35.76       2.06     35.76
"aperm.default"      1.08    18.75       1.08     18.75
"max"                0.42     7.29       0.42      7.29
"FUN"                0.02     0.35       2.50     43.40
"rnorm"              0.02     0.35       0.02      0.35
As you can see, the most expensive call is apply. Could you suggest a way to get rid of it and speed up the whole computation?
Upvotes: 0
Views: 580
Reputation: 59385
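Generally, iterating over a list is faster than applying a function over the rows of a matrix. In your specific case, converting the matrix to a data frame (a list of columns) and using sapply instead of apply cuts the elapsed time from about 6 seconds to about 1 second: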
set.seed(1) # for reproducibility
n=1000
A=matrix(rnorm(n), 1, n)
T=diag(n)
set.seed(1) # for reproducibility
system.time({
  for (i in 1:100)
    X = apply(T, 1, function(row) max(A*row)) + matrix(rnorm(n), 1,n)
})
# user system elapsed
# 5.83 0.13 5.98
df <- as.data.frame(t(T))
set.seed(1) # for reproducibility
system.time({
  for (i in 1:100)
    Y = sapply(df, function(row) max(A*row)) + matrix(rnorm(n), 1,n)
})
# user system elapsed
# 0.97 0.00 0.96
identical(X,Y)
# [1] TRUE
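If you want to drop the per-row function call entirely, one possible direction is to vectorize the whole step. The sketch below is not part of the timings above and has not been benchmarked here. It assumes the goal is the row-wise maximum of A*row for every row of T. The helper name row_max_scaled is hypothetical. It scales each column of T by the matching element of A in one vectorized multiplication, then picks each row's maximum using base R's max.col.

```r
# Hypothetical helper (not from the original post): for each row of T,
# compute max(A * row) without calling a function once per row.
row_max_scaled <- function(T, A) {
  a <- as.vector(A)
  # Column-major recycling: column j of T is multiplied by a[j].
  M <- T * rep(a, each = nrow(T))
  # max.col gives, for each row, the column index of the largest entry;
  # ties.method = "first" keeps the result deterministic.
  idx <- max.col(M, ties.method = "first")
  M[cbind(seq_len(nrow(M)), idx)]
}

# Usage with the same setup as in the question.
set.seed(1)
n <- 1000
A <- matrix(rnorm(n), 1, n)
T <- diag(n)
Z <- row_max_scaled(T, A) + matrix(rnorm(n), 1, n)
```

Because T is the identity matrix in this example, each row of A*row has a single nonzero entry, so the same values could also be obtained with pmax(A, 0). The helper above is meant for the general case where T is an arbitrary numeric matrix without missing values.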
Upvotes: 2