Reputation: 247
Assume that I want to run two simulations with three variables. In the first simulation (let's call it sima), I want to generate three uniformly or normally distributed variables that are uncorrelated, and then run some analysis on them. After that, I want to repeat the analysis, but with the variables generated in the first simulation (sima) now correlated.
I know that I can use the mvrnorm function, but I have no idea how to "correlate" the data I already generated in the first simulation.
For example:
a <- rnorm(1000)
b <- rnorm(1000)
c <- rnorm(1000)
x <- matrix(c(a,b,c), ncol=3)
Then I want to transform the matrix x so that its columns have, for example, the following correlations:
cor(a,b)=0.4
cor(a,c)=0.3
cor(b,c)=0.5
Upvotes: 2
Views: 804
Reputation:
You could switch it around. First create the correlated data as in DJV's answer. Then decorrelate it by randomly shuffling each column independently. This doesn't guarantee exactly zero correlation in the sample, but that's also true for independently sampled data.
# First create `data` as in DJV's answer, then:
data_indep <- apply(data, 2, sample)  # shuffle each column independently
cor(data_indep)
[,1] [,2] [,3]
[1,] 1.00000000 0.07503708 -0.13515778
[2,] 0.07503708 1.00000000 -0.02912137
[3,] -0.13515778 -0.02912137 1.00000000
To show that on average, the reshuffled data is uncorrelated (which is analytically true, but let's check):
cors <- replicate(10000, cor(apply(data, 2, sample)))  # 3 x 3 x 10000 array
apply(cors, 1:2, mean)
[,1] [,2] [,3]
[1,] 1.0000000000 -0.0009533055 0.0014867635
[2,] -0.0009533055 1.0000000000 0.0002847576
[3,] 0.0014867635 0.0002847576 1.0000000000
Good enough, I think.
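As a small self-contained check of the shuffling idea, the sketch below rebuilds correlated data the same way as DJV's answer, permutes each column, and confirms that every column still holds exactly the same values, so only the pairing across columns changes. The seed and the helper name `same_values` are my own choices for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Correlated data, built as in DJV's answer
Sigma <- matrix(c(1,   0.4, 0.3,
                  0.4, 1,   0.5,
                  0.3, 0.5, 1), nrow = 3)
data <- MASS::mvrnorm(n = 200, mu = c(0, 0, 0), Sigma = Sigma, empirical = TRUE)

# Permute each column independently of the others
data_indep <- apply(data, 2, sample)

# Each column contains the same values as before, just in a different order
same_values <- all.equal(apply(data, 2, sort), apply(data_indep, 2, sort))
same_values

# Off-diagonal sample correlations should now be close to zero
round(cor(data_indep), 2)
```

Since the marginal values are untouched, both analyses run on the same numbers and differ only in the dependence between columns.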
Upvotes: 0
Reputation: 4863
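If I understood you correctly, you can use the function MASS::mvrnorm. With empirical=TRUE, the sample correlations match the specified values exactly rather than only on average: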
samples <- 200
rab <- 0.4
rac <- 0.3
rbc <- 0.5
data <- MASS::mvrnorm(n=samples,
                      mu=c(0, 0, 0),
                      Sigma=matrix(c(1,   rab, rac,
                                     rab, 1,   rbc,
                                     rac, rbc, 1),
                                   nrow=3),
                      empirical=TRUE)
A <- data[, 1]
B <- data[, 2]
C <- data[, 3]
cor(data)
cor(A, B)
cor(A, C)
cor(B, C)
> cor(data)
[,1] [,2] [,3]
[1,] 1.0 0.4 0.3
[2,] 0.4 1.0 0.5
[3,] 0.3 0.5 1.0
> cor(A, B)
[1] 0.4
> cor(A, C)
[1] 0.3
> cor(B, C)
[1] 0.5
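If the goal is to reuse the draws already generated in the first simulation, as the question asks, one common alternative is to multiply the uncorrelated matrix by the Cholesky factor of the target correlation matrix. This is not what the code above does; it is a minimal sketch that assumes the original columns are independent with unit variance, and the names `R`, `U` and `y` are mine.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Three independent standard normal columns, standing in for the question's x
x <- matrix(rnorm(3000), ncol = 3)

# Target correlation matrix from the question
R <- matrix(c(1,   0.4, 0.3,
              0.4, 1,   0.5,
              0.3, 0.5, 1), nrow = 3)

# chol() returns the upper-triangular U with t(U) %*% U equal to R
U <- chol(R)

# Mixing the columns with U gives population covariance t(U) %*% U = R
y <- x %*% U

round(cor(x), 2)  # close to the identity matrix
round(cor(y), 2)  # close to R, up to sampling error
```

The first column of `y` is identical to the first column of `x`, and the other columns are linear combinations of the originals. Normal inputs therefore stay normal, but uniform inputs would no longer be uniform after this transformation.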
Upvotes: 1