Aspire

Reputation: 417

How to generate multivariate normal data in R?

I'm working on an assignment in which I have to generate a sample X = (X1, X2) from a bivariate normal distribution where each marginal is N(0,1) and the correlation between X1 and X2 is 0.5.

I think the way to approach this is to use the mvrnorm function, but I'm not quite sure how to proceed after that. Any advice? Thanks in advance!

Upvotes: 24

Views: 34001

Answers (3)

passerby51

Reputation: 955

Using base R (no package needed) and a bit of statistics:

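Sigma = matrix(c(1, 0.5, 0.5, 1), ncol = 2)
R = chol(Sigma)  # upper-triangular factor: Sigma == t(R) %*% R
n = 1000
X = t(R) %*% matrix(rnorm(n * 2), 2)  # 2 x n matrix; each column is one draw

X %*% t(X) / n  # test: should be close to Sigma (the means are zero)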
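
The reason this works: if Z has independent N(0,1) entries, then t(R) %*% Z has covariance t(R) %*% I %*% R = Sigma. Below is a minimal sketch that repeats the same steps with a fixed seed, arranges the draws one per row, and checks them with cov() and cor(). The seed, the larger n, and the names Z and X_rows are illustrative choices of mine, not part of the answer above.

```r
set.seed(42)  # arbitrary seed, only so the run is reproducible

Sigma <- matrix(c(1, 0.5, 0.5, 1), ncol = 2)
R <- chol(Sigma)                      # upper triangular, Sigma == t(R) %*% R
n <- 10000                            # larger than above to tighten the check

Z <- matrix(rnorm(n * 2), nrow = 2)   # 2 x n independent N(0, 1) draws
X_rows <- t(t(R) %*% Z)               # n x 2, one observation per row
colnames(X_rows) <- c("X1", "X2")

colMeans(X_rows)  # both should be near 0
cov(X_rows)       # should be near Sigma
cor(X_rows)       # off-diagonal should be near 0.5
```

Storing one observation per row matches what cov(), cor(), and most modelling functions in R expect.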

Upvotes: 7

W. Joel Schneider

Reputation: 1806

Here are some options:

  1. mvtnorm::rmvnorm and MASS::mvrnorm work the same way, although the mvtnorm::rmvnorm function does not require that you specify the means (i.e., the default is 0). Giving names to the mu vector will specify the names of the simulated variables.
n <- 100
R <- matrix(c(1, 0.5,
              0.5, 1), 
            nrow = 2, ncol = 2, byrow = TRUE)
            
mu <- c(X = 0, Y = 0)
mvtnorm::rmvnorm(n, mean = mu, sigma = R)
MASS::mvrnorm(n, mu = mu, Sigma = R)
  2. simstandard::sim_standardized will make standardized data only, but will do so with less typing:
simstandard::sim_standardized("X ~~ 0.5 * Y", n = 100)

Upvotes: 10

Spherical

Reputation: 395

Indeed, the mvrnorm function from the MASS package is probably your best bet. This function can generate pseudo-random data from multivariate normal distributions.

Examining the help page for this function (?MASS::mvrnorm, or ??mvrnorm to search installed packages) shows that there are three key arguments that you would need to simulate your data based on your given parameters, i.e.:

  • n - the number of samples required (an integer);
  • mu - a vector giving the means of the variables - here, your distributions are standard normal so it will be a vector of zeros; and
  • Sigma - a positive-definite symmetric matrix specifying the covariance matrix of the variables - i.e., in your case, a matrix with ones on the diagonal (the variances) and 0.5 on the off-diagonals (the covariances, which equal the correlation here because both variances are 1).

Have a look at the examples in this help page, which should help you put these ideas together!
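
For reference, the three arguments described above could be put together roughly as follows. This is a sketch rather than the only way to do it: it assumes the MASS package is installed (it is a recommended package that is normally bundled with R), and the seed, the sample size of 1000, and the names X1 and X2 are illustrative choices.

```r
library(MASS)  # assumes MASS is installed

set.seed(1)    # arbitrary seed for reproducibility

n <- 1000                                # number of samples (illustrative)
mu <- c(X1 = 0, X2 = 0)                  # standard normal marginals: zero means
Sigma <- matrix(c(1, 0.5,
                  0.5, 1),
                nrow = 2, byrow = TRUE)  # unit variances, covariance 0.5

X <- mvrnorm(n = n, mu = mu, Sigma = Sigma)  # n x 2 matrix, one draw per row

dim(X)   # 1000 rows, 2 columns
cor(X)   # sample correlation; off-diagonal should be near 0.5
```

The sample correlation will not be exactly 0.5, since Sigma describes the population the draws come from rather than the sample itself.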

Upvotes: 14
