Reputation: 541
I am trying to make 1000 simulations to see how many of my f values are in the reject region which is above 1.48 and below .67.
I have this but the variables don't increment as they should:
for (k in 1:1000){
Adata = rnorm(100, mean = 30, sd = 10)
Bdata = rnorm(100, mean = 45, sd = 10)
f = (sd(Bdata)^2)/(sd(Adata)^2)
if (f > 1.48){
a = 0
a <- a + 1}
if (f < .67){
b = 0
b <- b + 1}
}
a
[1] 1
b
[1] 1
The end goal is find the sum of a and b
I have also tried:
for (k in 1000){
Adata = rnorm(100, mean = 30, sd = 10)
Bdata = rnorm(100, mean = 45, sd = 10)
f = (sd(Bdata)^2)/(sd(Adata)^2)
a = f > 1.48
b = f < .67
}
y = sum(a)+sum(b)
y
[1] 0
What other way would I increment to get the total amount of f's that are in the reject region?
Upvotes: 0
Views: 3022
Reputation: 6372
In your first example, you are reset-ing a
and b
to zero every time that the if statement is true. Therefore, the max value will always be 1
.
To fix, rearrange those lines:
a = 0 #initialize outside of the loop
b = 0 #initialize outside of the loop
set.seed(1) # added for SO as you are using rnorm, remove this when you run your simulations
for (k in 1:1000){
Adata = rnorm(100, mean = 30, sd = 10)
Bdata = rnorm(100, mean = 45, sd = 10)
f = (sd(Bdata)^2)/(sd(Adata)^2)
if (f > 1.48){
a <- a + 1}
if (f < .67){
b <- b + 1}
}
I now get a = 13 and b = 29
That said, don't increment variables like this in R. You can take advantage of matrices and vectorized operations.
First Create simulation matrices
set.seed(1)
Adata = matrix(data = rnorm(100*1000, mean = 30, sd = 10), nrow = 1000, ncol = 100)
Bdata = matrix(data = rnorm(100*1000, mean = 30, sd = 10), nrow = 1000, ncol = 100)
Then calculate your f
score for each line:
f <- apply(Bdata,1,function(x){sd(x)^2})/apply(Adata,1,function(x){sd(x)^2})
now you can simply use:
sum(f > 1.48)
[1] 15
and:
sum(f < .67)
[1] 25
Upvotes: 1
Reputation: 49660
In the first block of code you are resetting a and b to 0 every iteration, then possibly adding 1 (so the most they will ever be is 1 because next iteration they will be set to 0 again).
In the second block you are setting a and b to either TRUE or FALSE, but you are overwriting the value, so you only see the value from the final iteration (actually that loop only runs once with k equal to 1000, but if you had 1:1000 there then you would only see the last iteration).
The simple solution is to move the a=0
and b=0
(or better a <- 0
and b <- 0
) outside of the loop.
The better approach is to use something in the apply
family of functions.
I would suggest something like:
out <- replicate(1000, {
Adata = rnorm(100, mean = 30, sd = 10)
Bdata = rnorm(100, mean = 45, sd = 10)
(sd(Bdata)^2)/(sd(Adata)^2)
})
sum( out > 1.48 )
sum( out < 0.67 )
sum( out > 1.48 | out < 0.67 )
Upvotes: 1