Using "apply" to increase speed in R

Question

EDITED: I edited the code below in response to some of your suggestions (I got rid of the prop.test function and climbed out of the second circle of hell). I'm curious what the next steps would be for making this faster. Should I start using apply or parallel processing, or something else?

My main goal is to make this run faster. As I said, I'm pretty new to this, so I'd appreciate any advice. Thanks for the help!

number.of.trials<-500
n.limit<-1000
final.n.list<-numeric(number.of.trials)

for (trials in 1:number.of.trials){
  p.value<-2
  n<-1
  a<-0
  b<-0

  #this while loop stops once test shows significance or when n reaches the limit
  while ((p.value > .05 || p.value == 0) && n <= n.limit) {

    ##add new data points to a and b
    a<-a+rbinom(1, 1, .5)
    b<-b+rbinom(1, 1, .5)

    ##calculate chi-square test statistic with continuity correction
    yates.stat<-2*n*(abs(a*(n-b)-b*(n-a))-n)^2/(n*n*(a+b)*(2*n-a-b)) 

    ##calculate p-value
    p.value<-pchisq(q=yates.stat, df=1, lower.tail=FALSE)
    n<-n+1
  }
  final.n.list[trials]<-n-1
}

Explanation of what I'm trying to do with this code: This is a simulation of an experiment in which two groups (a and b) are tested continually throughout the experiment to see whether they differ significantly from each other. I would like to demonstrate that the traditional p-value does not work in this context. The experiment ends when both groups have a sample size of 1000 (or as soon as the groups appear to be significantly different), and I repeat the entire experiment 500 times.

IRTFM · Accepted Answer

You've already been given several bits of advice about making the CONTENT of your loop more efficient. Should you be calling rbinom() many times to generate single values? Why not generate a large number at once, say 1000 at a time (500 for each group), and then work through them, possibly with mapply? The people you read who claim efficiency gains from apply() are really quite wrong. A loop is a loop, whether it is written as while(), for(), or apply(). The key is learning to use vectorized strategies.
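
As a rough illustration of what a vectorized strategy could look like here, the sketch below draws every observation for one trial up front and computes the test statistic for all sample sizes at once. The helper name run.trial and its arguments are hypothetical, not from the question, and drawing n.limit values per group is an assumption about batch size. Because the random draws are consumed in a different order than in the original loop, the results should match in distribution rather than number for number.

```r
# Hypothetical helper: one simulated experiment, vectorized over the sample size n.
run.trial <- function(n.limit = 1000, alpha = 0.05) {
  # Use doubles rather than integers so the products below cannot overflow
  n <- as.numeric(seq_len(n.limit))

  # Draw all observations up front; cumulative sums give the running counts
  a <- cumsum(as.numeric(rbinom(n.limit, 1, 0.5)))
  b <- cumsum(as.numeric(rbinom(n.limit, 1, 0.5)))

  # Same Yates-corrected statistic as in the question, for every n at once
  yates.stat <- 2 * n * (abs(a * (n - b) - b * (n - a)) - n)^2 /
    (n * n * (a + b) * (2 * n - a - b))
  p.value <- pchisq(yates.stat, df = 1, lower.tail = FALSE)

  # First n at which the original while loop would have stopped
  stop.at <- which(p.value <= alpha & p.value != 0)
  if (length(stop.at) > 0) stop.at[1] else n.limit
}

set.seed(1)  # arbitrary seed, only for reproducibility of this sketch
final.n.list <- replicate(500, run.trial())
```

The outer replicate() call is still a loop over trials. The expected saving comes from replacing up to 1000 separate calls to rbinom() and pchisq() per trial with one vectorized call each.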
