Reputation: 230
I am trying to decompose a large covariance matrix as part of a portfolio optimisation in R, but it doesn't seem to work for matrices with more than ~100 variables. Here is an example with some random data:
set.seed(123)
a <- NULL
for(i in 1:300) {
a <- cbind(a, rnorm(100))
}
chol(cov(a))
Error in chol.default(cov(a)) :
the leading minor of order 101 is not positive definite
but if I reduce the sample it works fine:
b <- NULL
for(i in 1:100) {
b <- cbind(b, rnorm(100))
}
chol(cov(b))
I've tried it a few times, and the error seems to occur for covariance matrices with more than 100 to 105 variables. Does anyone know the source of this problem?
EDIT 2: This works:
set.seed(123)
c <- NULL
for(i in 1:300) {
c <- cbind(c, rnorm(300))
}
chol(cov(c))
But this does not:
d <- NULL
for(i in 1:301) {
d <- cbind(d, rnorm(300))
}
chol(cov(d))
Error in chol.default(cov(d)) :
the leading minor of order 300 is not positive definite
So the number of variables may not exceed the number of observations?
Upvotes: 1
Views: 4079
Reputation: 1974
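If you only have 100 observations, your covariance matrix has rank at most 99: the sample covariance subtracts the column means, which costs one degree of freedom, so n observations give rank at most n - 1. A covariance matrix with more variables than that is singular, i.e. positive semi-definite but not positive definite, and this is why the Cholesky factorization fails. When the number of variables sits right at the boundary, rounding error can let chol() get one pivot further than the exact rank, which is why the reported order in the error message can be off by one.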
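A minimal sketch for checking this on simulated data like that in the question. It counts the eigenvalues that are non-zero up to a relative tolerance, then shows one possible workaround: adding a small ridge to the diagonal before calling chol(). The tolerance (1e-8) and the ridge size (1e-6) are arbitrary values chosen for illustration, not recommendations, and a ridge changes the matrix being factorized, so whether it is acceptable depends on the application.

```r
set.seed(123)
n <- 100   # number of observations
p <- 300   # number of variables
a <- matrix(rnorm(n * p), nrow = n, ncol = p)
S <- cov(a)

# Count eigenvalues that are non-zero relative to the largest one.
# The tolerance 1e-8 is an assumption made for this example.
ev <- eigen(S, symmetric = TRUE, only.values = TRUE)$values
rank_S <- sum(ev > 1e-8 * max(ev))
rank_S   # expected to be n - 1, far below p

# Assumed workaround: a small ridge makes the matrix positive definite.
# The value 1e-6 is illustrative only.
S_ridge <- S + diag(1e-6, p)
R <- chol(S_ridge)

# chol() returns an upper-triangular R with t(R) %*% R equal to S_ridge
recon_err <- max(abs(crossprod(R) - S_ridge))
```

If the full covariance structure is needed for a portfolio with more assets than observations, a shrinkage estimator or a longer return history are the usual alternatives to an ad hoc ridge.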
Upvotes: 3