Reputation: 11728
I do something like this:

    bMRes += MatrixXd(n, n).setZero()
                 .selfadjointView<Eigen::Upper>().rankUpdate(bM);

This increments bMRes by bM * bM.transpose(), but about twice as fast as forming the product directly. Note that bMRes and bM are of type Map<MatrixXd>.
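For reference, here is a minimal self-contained sketch of that pattern; the sizes n and k, the backing buffers, and their values are made up for illustration:

    #include <Eigen/Dense>
    #include <vector>

    using Eigen::Map;
    using Eigen::MatrixXd;

    int main() {
        const int n = 4, k = 3;                    // illustrative sizes
        std::vector<double> resBuf(n * n, 0.0), mBuf(n * k, 1.0);

        Map<MatrixXd> bMRes(resBuf.data(), n, n);  // accumulator, n x n
        Map<MatrixXd> bM(mBuf.data(), n, k);       // factor, n x k

        // Rank-k update of the upper triangle of a zero temporary; the full
        // symmetric result is then materialized and added to bMRes.
        bMRes += MatrixXd(n, n).setZero()
                     .selfadjointView<Eigen::Upper>().rankUpdate(bM);
    }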
To optimize things further, I would like to skip the copy (and incrementation) of the lower part. In other words, I would like to compute and write only the upper part, ending up with the result in the upper part and zeros in the lower part.
If it is not clear enough, feel free to ask questions.
Thanks in advance.
Florian
Upvotes: 0
Views: 465
Reputation: 9781
If your bMRes is self-adjoint to begin with, you could use the following code, which only updates the upper half of bMRes:

    bMRes.selfadjointView<Eigen::Upper>().rankUpdate(bM);

If not, I think you have to accept that .selfadjointView<>() will always copy the other half when assigned to a MatrixXd.
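A minimal self-contained sketch of that in-place variant (again, the sizes n and k and the backing buffers are assumptions for illustration):

    #include <Eigen/Dense>
    #include <vector>

    using Eigen::Map;
    using Eigen::MatrixXd;

    int main() {
        const int n = 4, k = 3;                    // illustrative sizes
        std::vector<double> resBuf(n * n, 0.0), mBuf(n * k, 1.0);

        Map<MatrixXd> bMRes(resBuf.data(), n, n);  // accumulator, n x n
        Map<MatrixXd> bM(mBuf.data(), n, k);       // factor, n x k

        // Adds bM * bM.transpose() to bMRes, reading and writing only the
        // upper triangle; the strictly lower part of the buffer is untouched.
        bMRes.selfadjointView<Eigen::Upper>().rankUpdate(bM);
    }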
Compared to A*A.transpose() or .rankUpdate(A), the cost of copying the other half of the result can be ignored when A is reasonably large, so I guess you don't need to optimize your code further.
If you just want to evaluate the difference, you could use the low-level BLAS APIs: A*A.transpose() is equivalent to gemm(), and .rankUpdate(A) is equivalent to syrk(), but syrk() doesn't copy the other half automatically.
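For completeness, a sketch of the direct syrk() route through the CBLAS interface; it assumes a CBLAS implementation (e.g. OpenBLAS) providing cblas.h, column-major storage as Eigen uses by default, and illustrative sizes and buffers:

    #include <cblas.h>
    #include <vector>

    int main() {
        const int n = 4, k = 3;               // illustrative sizes
        std::vector<double> A(n * k, 1.0);    // n x k factor, column-major
        std::vector<double> C(n * n, 0.0);    // n x n result; only its upper
                                              // triangle is written below

        // C := 1.0 * A * A^T + 1.0 * C, touching only the upper triangle of C.
        cblas_dsyrk(CblasColMajor, CblasUpper, CblasNoTrans,
                    n, k,
                    1.0, A.data(), n,
                    1.0, C.data(), n);
    }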
Upvotes: 3