Theo C.
Theo C.

Reputation: 3

Vectorization of weighted outer product

I am looking to accelerate the calculation of an approximate weighted covariance.

Specifically, I have a Eigen::VectorXd(N) w and a Eigen::MatrixXd(M,N) points. I'd like to calculate the sum of w(i)*points.col(i)*(points.col(i).transpose()).

I am using a for loop but would like to see if I can go faster:

Eigen::VectorXd w = Eigen::VectorXd(N) ;
Eigen::MatrixXd points = Eigen::MatrixXd(M,N) ;
Eigen::MatrixXd tempMatrix = Eigen::MatrixXd(M,M) ;
for (int i=0; i < N ; i++){
    tempMatrix += w(i)*points.col(i)*(points.col(i).transpose());
}

Looking forward to see what can be done!

Upvotes: 0

Views: 146

Answers (1)

chtz
chtz

Reputation: 18827

The following should work:

Eigen::MatrixXd tempMatrix; // not necessary to pre-allocate
// assigning the product allocates tempMatrix if needed
// noalias() tells Eigen that no factor on the right aliases with tempMatrix
tempMatrix.noalias() = points * w.asDiagonal() * points.adjoint();

or directly:

Eigen::MatrixXd tempMatrix = points * w.asDiagonal() * points.adjoint();

If M is really big, it can be significantly faster to just compute one side and copy it (if needed):

Eigen::MatrixXd tempMatrix(M,M);
tempMatrix.triangularView<Eigen::Upper>() = points * w.asDiagonal() * points.adjoint();
tempMatrix.triangularView<Eigen::StrictlyLower>() = tempMatrix.adjoint();

Note that .adjoint() is equivalent to .transpose() for non-complex scalars, but with the former the code works as well if points and the result where MatrixXcd instead (w must still be real, if the result must be self-adjoint).

Also, notice that the following (from your original code) does not set all entries to zero:

Eigen::MatrixXd tempMatrix = Eigen::MatrixXd(M,M);

If you want this, you need to write:

Eigen::MatrixXd tempMatrix = Eigen::MatrixXd::Zero(M,M);

Upvotes: 1

Related Questions