Reputation: 35
Is it possible to find the covariance of a matrix without using any built-in functions or loops in MATLAB? I'm completely clueless about the idea of solving this problem.
I was thinking of something like:
cov(x,y) = 1/(n-1) .* (x*y)
However, I don't think this will work. Any ideas?
Upvotes: 1
Views: 17253
Reputation: 104464
See http://www.itl.nist.gov/div898/handbook/pmc/section5/pmc541.htm for a great example of how to numerically compute the covariance matrix. However, let's put it in this post for the sake of completeness. I'm a bit confused by what you mean by "built-in" functions, because the covariance requires that you sum over the columns of a matrix. If you can't use any built-in functions to sum up these elements, then I don't see how you can do this without using `for` loops. Edit: I figured out how to do it without using built-in functions or loops, but you need to use `size` to determine how many rows the matrix has... unless you specify this as a constant in your function.
Numerically, the covariance matrix is computed as follows.
Essentially, the entry in the ith row and jth column of your covariance matrix is obtained by taking the products of the entries of column i minus the mean of column i with the corresponding entries of column j minus the mean of column j. Add these products up, then divide by n - 1, where n is the number of rows. This is known as the unbiased estimator. You'll also notice that this matrix is symmetric, because even if you flip the order around (i.e. looking at column j and then column i), the answer is still the same. I'm assuming you can't use `mean` from MATLAB either, so let's do this from first principles.
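Written out, the definition described above looks like this. The symbols are chosen here for illustration: n is the number of rows, A_{ki} is the entry in row k and column i, and \mu_i is the mean of column i.

```latex
\[
\operatorname{cov}(i,j) = \frac{1}{n-1} \sum_{k=1}^{n} \left(A_{ki} - \mu_i\right)\left(A_{kj} - \mu_j\right),
\qquad
\mu_i = \frac{1}{n} \sum_{k=1}^{n} A_{ki}
\]
```

Swapping i and j only reorders the two factors in each product, which is why the matrix is symmetric.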
First, compute a row vector that contains the mean of every column. To compute the sum of each column without using `sum`, as it is also a built-in function, multiply a row vector of 1s by your matrix `A`. The output will be a row vector that contains the sum of each column. Dividing by the number of rows then gives the means. As such, do this:
one_vector(1:size(A,1)) = 1;
mu = (one_vector * A) / size(A,1);
The trick with the first line of code is that we are dynamically creating an array of the same length as the number of rows in your matrix `A` and filling it with 1s (this assumes `one_vector` doesn't already exist in your workspace). Note that you could have used `ones`, but you said you can't use any built-in functions. `mu` will contain the mean of each column.
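As a quick sanity check of the row-of-1s trick, here is a small sketch on a made-up 3-by-2 matrix `B` (not from the question; the names `B`, `ones_row`, `col_sums` and `mu_B` are illustrative only). The numbers are small enough to verify by hand.

```matlab
% Hypothetical 3-by-2 matrix, chosen so the sums are easy to check by hand
B = [1 2; 3 4; 5 6];

n = size(B, 1);            % number of rows (3)
clear ones_row             % indexed assignment would not shrink a stale variable
ones_row(1:n) = 1;         % 1-by-3 row vector of 1s, built without ones()

col_sums = ones_row * B;   % [1+3+5, 2+4+6] = [9 12]
mu_B = col_sums / n;       % column means: [3 4]
```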
Now, let's pre-process the data by subtracting from every column its mean, since that's what the definition says we do. To do this without any built-in functions, duplicate `mu` once for each of the 1s in `one_vector`, i.e. once per row of `A`. Indexing the rows of `mu` with `one_vector` selects its first (and only) row over and over, producing a matrix the same size as `A`. Therefore:
A_mean_subtract = A - mu(one_vector, :);
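To see what the indexing does in isolation, here is a minimal sketch with made-up values (`mu_B`, `idx`, `mu_rep` and `B` are hypothetical names, not part of the code above). Using a vector of 1s as a row index picks row 1 repeatedly.

```matlab
mu_B = [3 4];              % hypothetical 1-by-2 row vector of column means
idx = [1 1 1];             % three 1s, one per row of the data matrix

mu_rep = mu_B(idx, :);     % selects row 1 three times -> 3-by-2 matrix
% mu_rep =
%      3     4
%      3     4
%      3     4

B = [1 2; 3 4; 5 6];
B_centered = B - mu_rep;   % each column now has zero mean: [-2 -2; 0 0; 2 2]
```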
Here's where it gets a bit tricky (and cool). If we transpose a matrix, the rows become the columns and the columns become the rows. If we take the transpose of the mean-subtracted matrix and multiply it by the mean-subtracted matrix itself, entry (i, j) of the result is the sum of products between column i and column j of the mean-subtracted `A`. That's the first part of our covariance calculation. We then divide by n - 1. Therefore, our covariance is simply:
covA = (A_mean_subtract.' * A_mean_subtract) / (size(A,1) - 1);
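The claim that the transpose-times-matrix product yields the sums of products can be checked on a tiny case. This is only a sketch on made-up, already mean-subtracted data; `B_centered`, `G`, `g12` and `cov_B` are illustrative names.

```matlab
B_centered = [-2 -2; 0 0; 2 2];   % hypothetical mean-subtracted data (3 rows)

% Entry (i,j) of the product is the dot product of columns i and j
G = B_centered.' * B_centered;    % [8 8; 8 8]

% The (1,2) entry written out by hand for comparison
g12 = (-2)*(-2) + 0*0 + 2*2;      % 8

cov_B = G / (3 - 1);              % divide by n - 1 with n = 3: [4 4; 4 4]
```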
Here's a quick example, similar to the one on the website linked above. Suppose `A` was this:
A = [4 2 0.5; 4.2 2.1 0.59; 3.9 2.0 0.58; 4.3 2.1 0.62; 4.1 2.2 0.63]
A =
4.0000 2.0000 0.5000
4.2000 2.1000 0.5900
3.9000 2.0000 0.5800
4.3000 2.1000 0.6200
4.1000 2.2000 0.6300
Running through the above code, this is what we get:
covA =
0.0250 0.0075 0.0042
0.0075 0.0070 0.0034
0.0042 0.0034 0.0026
You'll see that this matches the output of the `cov` function in MATLAB too:
>> cov(A)
ans =
0.0250 0.0075 0.0042
0.0075 0.0070 0.0034
0.0042 0.0034 0.0026
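The printed matrices are rounded to four decimal places, so a numeric comparison is a slightly stronger check. The sketch below reruns the steps on the example matrix and compares against `cov`; the built-ins `max` and `abs` are used only for the verification, and the tolerance `1e-12` is an arbitrary choice made for this sketch.

```matlab
A = [4 2 0.5; 4.2 2.1 0.59; 3.9 2.0 0.58; 4.3 2.1 0.62; 4.1 2.2 0.63];

n = size(A, 1);
clear one_vector                   % start from a fresh variable
one_vector(1:n) = 1;               % row vector of 1s
mu = (one_vector * A) / n;         % column means
A_mean_subtract = A - mu(one_vector, :);
covA = (A_mean_subtract.' * A_mean_subtract) / (n - 1);

% Largest absolute difference from the built-in result
max_diff = max(max(abs(covA - cov(A))));
matches = max_diff < 1e-12;        % tolerance is an assumption of this sketch
```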
If you type `edit cov` in your MATLAB command prompt, you can actually see how they compute the covariance matrix without any `for` loops... and this is essentially the same answer I gave you :)
Assuming you can use `sum` and `bsxfun`, we can do this in fewer lines of code, and more efficiently. First, compute your mean vector like we did above, this time using `sum`:
mu = sum(A) / size(A,1);
Now, to subtract each column's mean from the corresponding column of your matrix `A`, you can use `bsxfun`:
A_mean_subtract = bsxfun(@minus, A, mu);
Now, compute your covariance matrix like you did before:
covA = (A_mean_subtract.' * A_mean_subtract) / (size(A,1) - 1);
You should get exactly the same result as we saw before.
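If you are on MATLAB R2016b or later, implicit expansion lets you write the subtraction directly, so `bsxfun` is optional for this step. A minimal sketch on the same example matrix, assuming a release that supports implicit expansion (older releases will throw a dimension mismatch error on the `A - mu` line):

```matlab
A = [4 2 0.5; 4.2 2.1 0.59; 3.9 2.0 0.58; 4.3 2.1 0.62; 4.1 2.2 0.63];

mu = sum(A) / size(A, 1);          % 1-by-3 row vector of column means

% Implicit expansion (R2016b+): the row vector mu is expanded down the rows of A
A_mean_subtract = A - mu;

covA = (A_mean_subtract.' * A_mean_subtract) / (size(A, 1) - 1);

% Same result as the bsxfun form, computed here only for comparison
same_as_bsxfun = isequal(A_mean_subtract, bsxfun(@minus, A, mu));
```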
We are computing the covariance between two columns straight from the definition. However, it has been shown that using the definition directly can lead to numerical instability for certain types of data. Consult this Wikipedia page, which goes through various more stable algorithms for computing the covariance between two length-n vectors.
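One commonly described alternative is a single-pass (online) update that keeps running means and a running sum of co-deviations. The sketch below is only an illustration of that idea on the first two columns of the example matrix; the variable names are made up, and it uses a `for` loop, so it does not meet the no-loops constraint from the question.

```matlab
% First two columns of the example matrix A
x = [4 4.2 3.9 4.3 4.1];
y = [2 2.1 2.0 2.1 2.2];

n = 0;          % number of samples seen so far
mean_x = 0;     % running mean of x
mean_y = 0;     % running mean of y
C = 0;          % running sum of co-deviations

for k = 1:numel(x)
    n = n + 1;
    dx = x(k) - mean_x;                     % deviation from the OLD mean of x
    mean_x = mean_x + dx / n;               % update running mean of x
    mean_y = mean_y + (y(k) - mean_y) / n;  % update running mean of y
    C = C + dx * (y(k) - mean_y);           % uses the NEW mean of y
end

cov_xy = C / (n - 1);   % unbiased estimate; compare with covA(1,2) above
```

For this data the result should agree with the (1,2) entry of the covariance matrix shown earlier, up to rounding error.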
Upvotes: 7