Ryan Fasching
Ryan Fasching

Reputation: 541

Increment inside for loop? in r

I am trying to make 1000 simulations to see how many of my f values are in the reject region which is above 1.48 and below .67.

I have this but the variables don't increment as they should:

for (k in 1:1000){
   Adata = rnorm(100, mean = 30, sd = 10)
   Bdata = rnorm(100, mean = 45, sd = 10)
   f = (sd(Bdata)^2)/(sd(Adata)^2)
   if (f > 1.48){
     a = 0
     a <- a + 1}
   if (f < .67){
     b = 0
     b <- b + 1}
} 

 a
 [1] 1
 b
 [1] 1

The end goal is find the sum of a and b

I have also tried:

 for (k in 1000){
   Adata = rnorm(100, mean = 30, sd = 10)
   Bdata = rnorm(100, mean = 45, sd = 10)
   f = (sd(Bdata)^2)/(sd(Adata)^2)
   a = f > 1.48
   b = f < .67
   }
 y = sum(a)+sum(b)
 y
 [1] 0

What other way would I increment to get the total amount of f's that are in the reject region?

Upvotes: 0

Views: 3022

Answers (2)

Chris
Chris

Reputation: 6372

In your first example, you are reset-ing a and b to zero every time that the if statement is true. Therefore, the max value will always be 1.

To fix, rearrange those lines:

a = 0 #initialize outside of the loop
b = 0 #initialize outside of the loop
set.seed(1) # added for SO as you are using rnorm, remove this when you run your simulations
for (k in 1:1000){
   Adata = rnorm(100, mean = 30, sd = 10)
   Bdata = rnorm(100, mean = 45, sd = 10)
   f = (sd(Bdata)^2)/(sd(Adata)^2)
   if (f > 1.48){
     a <- a + 1}
   if (f < .67){
     b <- b + 1}
}

I now get a = 13 and b = 29

That said, don't increment variables like this in R. You can take advantage of matrices and vectorized operations.

First Create simulation matrices

set.seed(1)
Adata = matrix(data = rnorm(100*1000, mean = 30, sd = 10), nrow = 1000, ncol = 100) 
Bdata = matrix(data = rnorm(100*1000, mean = 30, sd = 10), nrow = 1000, ncol = 100)

Then calculate your f score for each line:

f <- apply(Bdata,1,function(x){sd(x)^2})/apply(Adata,1,function(x){sd(x)^2})

now you can simply use:

sum(f > 1.48)
[1] 15

and:

sum(f < .67)
[1] 25

Upvotes: 1

Greg Snow
Greg Snow

Reputation: 49660

In the first block of code you are resetting a and b to 0 every iteration, then possibly adding 1 (so the most they will ever be is 1 because next iteration they will be set to 0 again).

In the second block you are setting a and b to either TRUE or FALSE, but you are overwriting the value, so you only see the value from the final iteration (actually that loop only runs once with k equal to 1000, but if you had 1:1000 there then you would only see the last iteration).

The simple solution is to move the a=0 and b=0 (or better a <- 0 and b <- 0) outside of the loop.

The better approach is to use something in the apply family of functions.

I would suggest something like:

out <- replicate(1000, {
   Adata = rnorm(100, mean = 30, sd = 10)
   Bdata = rnorm(100, mean = 45, sd = 10)
   (sd(Bdata)^2)/(sd(Adata)^2)
  })

sum( out > 1.48 )
sum( out < 0.67 )

sum( out > 1.48 | out < 0.67 )

Upvotes: 1

Related Questions