Koval  Boris

Reputation: 27

Random number generator for the vector in R

When I try to replicate results from a random number generator, I run into a problem.

l <- {}
for(i in 1:3){
  set.seed(1)
  l[i] <- rnorm(n = 1, i, i)  
}

this will produce

0.3735462 0.7470924 1.1206386

However, if I write

set.seed(1)
rnorm(n = 3, 1:3, 1:3)
0.3735462 2.3672866 0.4931142

Or

set.seed(1)
rmvnorm(n = 1, 1:3, sqrt(diag(1:3)))
0.3735462 2.21839 1.900251

I don't get the same result. What could be the problem? My goal is to vectorize the for loop, which is how I ran into this.

UPDATE

The answer below explains how this works for rnorm, and the same should hold for all random number generators in R. However, when I try this approach with rgig (Generalized Inverse Gaussian distribution), I run into a problem again.

l <- {}
for(i in 1:3){
  set.seed(1)
  l[i] <- rgig(n = i, i, i, i)[i]
}
1.629091 1.500733 1.564364

and if I use

set.seed(1)
rgig(n = 3, 1:3, 1:3, 1:3)
1.629091 1.440166 3.264135

When I use

sapply(1:3,function(x){set.seed(1);rgig(x,x,x,x)})

it doesn't show a pattern similar to the one for rnorm. My assumption is that rgig doesn't support vectorized parameters, since if we write:

set.seed(1)
rgig(n = 3, 1, 1, 1)
1.629091 1.440166 3.264135

we get the same result as with the vectorized call. Am I right?
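A minimal sketch of how this assumption could be checked and worked around, using base R only. The function `r_scalar_only` is a hypothetical stand-in, not the real rgig: it mimics a generator that silently uses only the first element of each parameter. If rgig behaves this way, drawing one value per parameter set with mapply (seed set once) would give the per-parameter draws.

```r
# Hypothetical stand-in for a generator that ignores all but the first
# element of each parameter (an assumption about rgig, not its real code).
r_scalar_only <- function(n, mean, sd) rnorm(n, mean[1], sd[1])

# Passing vectors gives the same draws as passing the first elements only.
set.seed(1)
with_vectors <- r_scalar_only(3, 1:3, 1:3)
set.seed(1)
with_scalars <- r_scalar_only(3, 1, 1)
isTRUE(all.equal(with_vectors, with_scalars))

# Workaround: one call per parameter pair. The seed is set once, so each
# call consumes the next numbers in the random stream.
set.seed(1)
per_param <- mapply(function(m, s) r_scalar_only(1, m, s), 1:3, 1:3)

# For comparison: a generator that is vectorized over its parameters.
set.seed(1)
reference <- rnorm(3, 1:3, 1:3)
isTRUE(all.equal(per_param, reference))
```

The same mapply pattern should carry over to any generator whose parameters must be scalars, at the cost of one function call per draw.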

Upvotes: 2

Views: 642

Answers (2)

Onyambu

Reputation: 79188

For the first method

l <- {}
for(i in 1:3){
  set.seed(1)
  l[i] <- rnorm(n = 1, i, i)  
}
0.3735462 0.7470924 1.1206386

for the second method

set.seed(1)
rnorm(n = 3, 1:3, 1:3)
0.3735462 2.3672866 0.4931142

Your question is why the two methods do not produce the same results.

First of all, the two methods do produce consistent results. To see why the pseudorandom values differ, the simplest way is to run sapply over 1:3, resetting the seed on each iteration, and look at what happens:

sapply(1:3,function(x){set.seed(1);rnorm(x,x,x)})
[[1]]
[1] 0.3735462 #One number produced from mu=1 sd=1    
[[2]]
[1] 0.7470924 2.3672866 # Two numbers produced from mu=2 sd=2    
[[3]]
[1] 1.1206386 3.5509300 0.4931142  # Three numbers produced from mu=3 sd=3    

If you look at this list, you will notice that the for loop takes only the first number of each vector, while the second method takes the last number of each vector (that is, the i-th number of the i-th draw). That is why the numbers differ. The results are still consistent, since in both cases the i-th value is generated with the same mean and sd.

Thus

 set.seed(1)
 rnorm(3,1:3,1:3)

is equivalent to

l <- {}
for(i in 1:3){
  set.seed(1)
  l[i] <- rnorm(n = i, i, i)[i]  
}
l
[1] 0.3735462 2.3672866 0.4931142
set.seed(1)
rnorm(3,1:3,1:3)
[1] 0.3735462 2.3672866 0.4931142
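The pattern above can also be seen directly from the underlying standard normal draws. The sketch below assumes R's default normal generator, where rnorm(n, mean, sd) returns mean + sd * z for successive standard normal draws z; the variable names are illustrative only.

```r
# Three standard normal draws from seed 1.
set.seed(1)
z <- rnorm(3)

# Vectorized call: element i uses the i-th draw, z[i].
vectorized <- 1:3 + 1:3 * z

# Loop with the seed reset each time: every element reuses the first draw.
looped <- 1:3 + 1:3 * z[1]

# Compare against the actual vectorized rnorm call.
set.seed(1)
isTRUE(all.equal(vectorized, rnorm(3, 1:3, 1:3)))
```

Here `looped` matches the output of the original for loop, and `vectorized` matches rnorm(3, 1:3, 1:3).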

Upvotes: 0

Lennyy

Reputation: 6132

With your loop, you do this:

set.seed(1)
rnorm(n = 1, 1, 1)
set.seed(1)
rnorm(n = 1, 2, 2)
set.seed(1)
rnorm(n = 1, 3, 3)

With your 2nd line of code you do this:

set.seed(1)
rnorm(3, 1:3, 1:3)

Hence the different results. In other words: with the loop you call set.seed(1) and draw a single number, three times over. First you draw from a distribution with a mean and sd of 1, then from one with a mean and sd of 2, and finally from one with a mean and sd of 3. Because the seed is reset before every draw, each draw starts from the same point in the random stream.

With the other approach you sample 3 numbers in a single call, using vectors of means and sd's consisting of 1, 2 and 3. The seed is set once, and all three numbers are generated from consecutive positions in the random stream.

Written out without the loop, this is the code that reproduces the results of your for loop:

> set.seed(1)
> rnorm(n = 1, 1, 1) 
[1] 0.3735462
> set.seed(1)
> rnorm(n = 1, 2, 2)
[1] 0.7470924
> set.seed(1)
> rnorm(n = 1, 3, 3)
[1] 1.120639
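Since the stated goal is to vectorize the loop, here is a minimal sketch of a loop-free version that gives the same values as resetting the seed before every draw. It assumes the default normal generator, where a draw from rnorm with a given mean and sd is that mean plus sd times a standard normal draw.

```r
# Original loop: the seed is reset before every draw.
looped <- numeric(3)
for (i in 1:3) {
  set.seed(1)
  looped[i] <- rnorm(n = 1, i, i)
}

# Loop-free version: draw one standard normal, then shift and scale it
# by each mean and sd.
set.seed(1)
z <- rnorm(1)
loop_free <- 1:3 + 1:3 * z

isTRUE(all.equal(looped, loop_free))
```

Note that all three values then depend on the same single random draw, so they are perfectly correlated rather than independent.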

Upvotes: 2
