Reputation: 11
I have the following R code:
list1 <- c(5, 6, 8, 10, 15, 26, 75)
list2 <- c(3, 6, 8, 10, 100, 42)
total <- length(list1) * length(list2)
# counters must start at zero before the loops increment them
l1Bigger <- 0
l2Bigger <- 0
tie <- 0
for (x in seq_along(list1)) {
  for (y in seq_along(list2)) {
    print(total - (x * y))
    if (list1[x] > list2[y]) {
      l1Bigger <- l1Bigger + 1
    } else if (list1[x] < list2[y]) {
      l2Bigger <- l2Bigger + 1
    } else {
      tie <- tie + 1
    }
  }
}
percents <- c(l1Bigger / total, l2Bigger / total, tie / total)
print(percents)
Basically, I want my code to iterate through list1 and list2 and compare the values to figure out how often the values in list1 are greater than the values in list2. My current method takes a lot of time. Is there any way to reduce the amount of time this process takes?
Thank you!
Upvotes: 1
Views: 49
Reputation: 26248
You can convert what you've got into Rcpp, which should speed up the process on long vectors:
library(Rcpp)
set.seed(1)
v1 <- rnorm(10000)
v2 <- rnorm(10000)
cppFunction('NumericVector compareVectors(NumericVector v1, NumericVector v2){
  // out[0] = ties, out[1] = v1 < v2, out[2] = v1 > v2
  NumericVector out(3);
  for(int i = 0; i < v1.size(); i++){
    for(int j = 0; j < v2.size(); j++){
      if(v1[i] == v2[j]){
        out[0]++;
      }else if(v1[i] < v2[j]){
        out[1]++;
      }else{
        out[2]++;
      }
    }
  }
  return out;
}')
compareVectors(v1, v2)
[1] 0 5008309906 4991690094
This shows favourable results when benchmarked against the expand.grid approach:
library(microbenchmark)
set.seed(1)
v1 <- rnorm(1000)
v2 <- rnorm(1000)
microbenchmark(
  rcpp = {
    compareVectors(v1, v2)
  },
  exg = {
    g <- expand.grid(v1, v2)
    x.bigger <- sum(g$Var1 > g$Var2)
    y.bigger <- sum(g$Var1 < g$Var2)
  }
)
# Unit: milliseconds
# expr min lq mean median uq max neval
# rcpp 5.600956 5.788145 6.036816 5.927468 6.183143 8.385282 100
# exg 28.529272 35.246216 41.328205 36.000421 37.653801 540.850561 100
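Since the question asks for proportions rather than raw counts, the counts can be divided by their sum. The following is a minimal sketch that reuses the counts printed above; the element names (ties, v2_bigger, v1_bigger) are labels I made up for readability, chosen to match the order in which compareVectors fills its output vector.

```r
# Counts copied from the compareVectors() output shown above.
# Order follows the C++ code: out[0] = ties, out[1] = v1 < v2, out[2] = v1 > v2.
# The names are hypothetical labels, not something the function returns.
counts <- c(ties = 0, v2_bigger = 5008309906, v1_bigger = 4991690094)

# Every pair lands in exactly one bucket, so the sum is the number of pairs.
proportions <- counts / sum(counts)
print(proportions)
```

In practice you would pass the result of compareVectors(v1, v2) straight into the division instead of typing the counts by hand.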
Upvotes: 2
Reputation: 51998
expand.grid is a natural way to do this sort of thing:
> x <- c(2,4,5,1,3)
> y <- c(1,6,2,3)
> g <- expand.grid(x,y)
> x.bigger <- sum(g$Var1 > g$Var2)
> y.bigger <- sum(g$Var1 < g$Var2)
> ties <- sum(g$Var1 == g$Var2)
> x.bigger
[1] 9
> y.bigger
[1] 8
> ties
[1] 3
Of course, ties can just be computed via simple arithmetic from the other two values, but I wanted to show how you could get all three numbers directly.
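As a rough sketch of that arithmetic, using the same x and y as above: every pair falls into exactly one of the three cases, so the ties are whatever is left once the other two counts are subtracted from the number of rows in the grid. This assumes neither vector contains NA values; the name props is mine.

```r
x <- c(2, 4, 5, 1, 3)
y <- c(1, 6, 2, 3)
g <- expand.grid(x, y)   # one row per (x, y) pair

x.bigger <- sum(g$Var1 > g$Var2)
y.bigger <- sum(g$Var1 < g$Var2)

# Remaining pairs must be ties (assumes no NA values in x or y).
ties <- nrow(g) - x.bigger - y.bigger

# Proportions of all pairs, in the order: x bigger, y bigger, tie.
props <- c(x.bigger, y.bigger, ties) / nrow(g)
print(props)
```

Dividing by nrow(g) gives the same kind of percentages the original loop was building.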
Upvotes: 2