Robin
Robin

Reputation: 33

rnorm is generating non-random looking realizations

I was debugging my simulation and I find that when I run rnorm(), my random normal values don't look random to me at all. ccc is the mean sd vector that is given parametrically. How can I get really random normal realizations? Since my original simulation is quite long, I don't want to go into Gibbs sampling... Should you know why I get non-random looking realizations of normal random variables?

> ccc  
# [1] 144.66667  52.52671  

> rnorm(20, ccc)  
# [1] 144.72325  52.31605 144.44628  53.07380 144.64438  53.87741 144.91300  54.06928 144.76440  
# [10]  52.09181 144.61817  52.17339 145.01374  53.38597 145.51335  52.37353 143.02516  52.49332  
# [19] 144.27616  54.22477

> rnorm(20, ccc)  
# [1] 143.88539  52.42435 145.24666  50.94785 146.10255  51.59644 144.04244  51.78682 144.70936  
# [10]  53.51048 143.63903  51.25484 143.83508  52.94973 145.53776  51.93892 144.14925  52.35716  
# [19] 144.08803  53.34002 

Upvotes: 3

Views: 156

Answers (1)

Darren Tsai
Darren Tsai

Reputation: 35584

It's a basic concept to set parameters in a function. Take rnorm() for example:

Its structure is rnorm(n, mean = 0, sd = 1). Obviously, mean and sd are two different parameters, so you need to put respective values to them. Here is a confusing situation where you may get stuck:

arg <- c(5, 10)
rnorm(1000, arg)

This actually means rnorm(n = 1000, mean = c(5, 10), sd = 1). The standard deviation is set to 1 because the position of arg represents the parameter mean and you don't set sd additionally. Therefore, rnorm() will take the default value 1 to sd. However, what does mean = c(5, 10) mean? Let's check:

x <- rnorm(1000, arg)
hist(x, breaks = 50, prob = TRUE)
# lines(density(x), col = 2, lwd = 2)

enter image description here

mean = c(5, 10) and sd = 1 will recycle to length 1000, i.e.

rnorm(n = 1000, mean = c(5, 10, 5, 10, ...), sd = c(1, 1, 1, 1, ...))

and hence the final sample x is actually a blend of 500 N(5, 1) samples and 500 N(10, 1) samples which are drawn alternately, i.e.

c(rnorm(1, 5, 1), rnorm(1, 10, 1), rnorm(1, 5, 1), rnorm(1, 10, 1), ...)

As for your question, it should be:

arg <- c(5, 10)
rnorm(1000, arg[1], arg[2])

and this means rnorm(n = 1000, mean = 5, sd = 10). Check it again, and you will get a normal distribution with mean = 5 and sd = 10.

x <- rnorm(1000, arg[1], arg[2])
hist(x, breaks = 50, prob = T)
# curve(dnorm(x, arg[1], arg[2]), col = 2, lwd = 2, add = T)

enter image description here

Upvotes: 5

Related Questions