Reputation: 23758
Here are some sample starting values for variables in the code below.
sd <- 2
sdtheory <- 1.5
meanoftheory <- 0.6
obtained <- 0.8
tails <- 2
I'm trying to vectorize the following code. It is a component of a Bayes factor calculator that was originally written by Dienes and adapted to R by Danny Kaye & Thom Baguley. This part calculates the likelihood for the theory. I've sped it up massively by vectorizing, but I can't match the output of the bit below.
area <- 0
theta <- meanoftheory - 5 * sdtheory
incr <- sdtheory / 200
for (A in -1000:1000){
  theta <- theta + incr
  dist_theta <- dnorm(theta, meanoftheory, sdtheory)
  if(identical(tails, 1)){
    # one-tailed: no weight at or below zero, double the weight above it
    if (theta <= 0){
      dist_theta <- 0
    } else {
      dist_theta <- dist_theta * 2
    }
  }
  height <- dist_theta * dnorm(obtained, theta, sd)
  area <- area + height * incr
}
area
And below is the vectorized version.
incr <- sdtheory / 200
newLower <- meanoftheory - 5 * sdtheory + incr
theta <- seq(newLower, by = incr, length.out = 2001)
dist_theta <- dnorm(theta, meanoftheory, sdtheory)
if (tails == 1){
  dist_theta <- dist_theta[theta > 0] * 2
  theta <- theta[theta > 0]
}
height <- dist_theta * dnorm(obtained, theta, sd)
area <- sum(height * incr)
area
This code exactly reproduces the results of the original when tails <- 2. Everything I've got here so far should just copy and paste and give the exact same results. However, once tails <- 1, the second version no longer matches exactly. As near as I can tell, the new if statement does the equivalent of what happens in the original. Any help would be appreciated.
(I did try to create a more minimal example, stripping it down to just the loop and if statements and a tiny number of slices, and I just couldn't get the code to fail.)
Upvotes: 1
Views: 347
Reputation: 37764
The original calculation has an error due to floating point arithmetic; adding incr each time causes theta to actually equal 7.204654e-14 when it should equal zero. So it's not doing the right thing on that pass through the loop; it's not running the <= branch when it should be. Your code is doing the right thing (at least, it did with these starting values on my machine).
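To see the drift for yourself, here is a minimal sketch using the starting values from the question. The names theta_loop and theta_direct are just illustrative, and the exact residue you get may differ by machine.

```r
sdtheory <- 1.5
meanoftheory <- 0.6
incr <- sdtheory / 200

# Accumulate the increment 920 times, as the loop does on its way to zero.
theta_loop <- meanoftheory - 5 * sdtheory
for (i in 1:920) theta_loop <- theta_loop + incr

# Reach the same grid point with a single multiplication instead.
theta_direct <- meanoftheory - 5 * sdtheory + 920 * incr

print(theta_loop)      # mathematically zero; may print a tiny nonzero residue
print(theta_direct)    # usually closer to zero, but still not guaranteed exact
print(theta_loop <= 0) # FALSE whenever the residue is positive
```

If the last line prints FALSE, the loop treats that grid point as positive and doubles its density instead of zeroing it.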
Your code isn't necessarily guaranteed to do the right thing every time either; what seq does is better than adding an increment over and over again, but it's still floating point arithmetic. You really should probably be checking to within machine tolerance of zero, perhaps using all.equal or something similar.
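One way that tolerance check might look in the vectorized version is sketched below. The cutoff tol and the mask positive are my own names, and using sqrt(.Machine$double.eps) (the default tolerance of all.equal) is an assumption; pick whatever suits your grid spacing.

```r
sd <- 2
sdtheory <- 1.5
meanoftheory <- 0.6
obtained <- 0.8

incr <- sdtheory / 200
theta <- seq(meanoftheory - 5 * sdtheory + incr, by = incr, length.out = 2001)
dist_theta <- dnorm(theta, meanoftheory, sdtheory)

# Assumed tolerance: anything within tol of zero is treated as zero.
tol <- sqrt(.Machine$double.eps)
positive <- theta > tol

# One-tailed: zero weight at or below zero, doubled weight above it.
dist_theta <- ifelse(positive, dist_theta * 2, 0)
height <- dist_theta * dnorm(obtained, theta, sd)
area <- sum(height * incr)
area
```

This keeps the `theta <= 0` rule the original loop intended, so it will differ slightly from the loop's actual output, which counts the near-zero grid point as positive.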
Upvotes: 1
Reputation: 176668
You're dropping observations where theta==0. That's a problem because the output of dnorm is not zero when theta==0. You need those observations in your output.
Rather than drop observations, a better solution would be to set those elements to zero.
incr <- sdtheory / 200
newLower <- meanoftheory - 5 * sdtheory + incr
theta <- seq(newLower, by = incr, length.out = 2001)
dist_theta <- dnorm(theta, meanoftheory, sdtheory)
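if (tails == 1){
  # zero the weight for negative theta, double it everywhere else (including theta == 0)
  dist_theta <- ifelse(theta < 0, 0, dist_theta) * 2
  # those elements now carry zero weight, so their theta value no longer affects the sum
  theta[theta < 0] <- 0
}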
height <- dist_theta * dnorm(obtained, theta, sd)
area <- sum(height * incr)
area
Upvotes: 3