Eigen: Slow access to columns of Matrix 4

Question

I am using Eigen for operations similar to a Cholesky update, which involve many AXPY operations (adding a scalar multiple of one vector to another) on the columns of a fixed-size matrix, typically a Matrix4d. In brief, updating a column of a 4x4 matrix is about 3 times more expensive than updating a vector of size 4.
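To make the operation concrete, here is a minimal sketch of the AXPY update written against plain arrays rather than Eigen. The names `Mat4`, `Vec4`, `axpyColumn` and `axpyVector` are hypothetical and only serve the illustration; the column-major layout is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for a 4x4 matrix and a 4-vector (not Eigen types).
// Assumed layout: column-major, so element (row, col) is at index col * 4 + row.
using Mat4 = std::array<double, 16>;
using Vec4 = std::array<double, 4>;

// AXPY on one column of the matrix: L(:, col) += alpha * x.
inline void axpyColumn(Mat4& L, std::size_t col, double alpha, const Vec4& x) {
    assert(col < 4);
    for (std::size_t row = 0; row < 4; ++row) {
        L[col * 4 + row] += alpha * x[row];
    }
}

// The same AXPY on a standalone vector: v += alpha * x.
inline void axpyVector(Vec4& v, double alpha, const Vec4& x) {
    for (std::size_t row = 0; row < 4; ++row) {
        v[row] += alpha * x[row];
    }
}
```

Both functions perform the same four multiply-adds; the only difference is the offset of the destination inside the matrix storage.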

Typically, the code below:

for(int i=0;i<4;++i )  L.col(0) += x*y[i];

is about 3 times slower than the code below:

for(int i=0;i<4;++i )  l4 += x*y[i];

where L is a 4x4 matrix, and x, y and l4 are vectors of size 4.

Moreover, the time spent in the first line of code does not depend on the matrix storage order (either RowMajor or ColMajor).
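For reference, the sketch below shows what the two storage orders mean for column access in a 4x4 matrix. It uses hypothetical helpers (`StorageOrder`, `flatIndex`, `columnStride`) on a flat index rather than Eigen's own types, so it illustrates the layout only and does not explain the timings.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers, not part of Eigen.
enum class StorageOrder { ColMajor, RowMajor };

const std::size_t kDim = 4;  // 4x4 matrix

// Flat index of element (row, col) in a 16-element buffer.
inline std::size_t flatIndex(StorageOrder order, std::size_t row, std::size_t col) {
    assert(row < kDim && col < kDim);
    return order == StorageOrder::ColMajor ? col * kDim + row
                                           : row * kDim + col;
}

// Distance in elements between two consecutive entries of the same column.
inline std::size_t columnStride(StorageOrder order) {
    return flatIndex(order, 1, 0) - flatIndex(order, 0, 0);
}
```

With column-major storage the entries of a column are contiguous (stride 1); with row-major storage they are 4 elements apart. A 4x4 matrix of doubles occupies only 128 bytes either way.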

On an Intel i7 (2.5GHz), it takes about 0.007us per vector operation and 0.02us per matrix operation (timings are obtained by repeating the same operation 100000 times). My application would need thousands of such operations, in a total time hopefully far below a millisecond.
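The per-operation figures above come from averaging over many repetitions. A minimal sketch of that measurement pattern is given here with `std::chrono::steady_clock` instead of the `gettimeofday`-based timing used in the full code; `averageSeconds` is a hypothetical helper name, and the callable receives the repetition index so each call can work on different data.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Runs op(i) for i = 0 .. repetitions-1 and returns the mean wall-clock
// time per call, in seconds. Hypothetical helper for illustration.
template <typename Op>
double averageSeconds(Op op, std::size_t repetitions) {
    assert(repetitions > 0);
    typedef std::chrono::steady_clock Clock;
    const Clock::time_point start = Clock::now();
    for (std::size_t i = 0; i < repetitions; ++i) {
        op(i);
    }
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    return elapsed.count() / static_cast<double>(repetitions);
}
```

When timing operations this small, the results of the timed code should stay observable (for example, by summing them and printing the sum afterwards); otherwise an optimizing compiler may remove the work entirely and the measurement becomes meaningless.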

Question: Am I doing something wrong when accessing the columns of my 4x4 matrix? Is there anything I can do to make the first line of code more efficient?

Full code used for timings is below:

#include <Eigen/Core>   // Eigen::Matrix
#include <sys/time.h>   // struct timeval
#include <vector>       // std::vector

typedef Eigen::Matrix<double,4,1> Vector4;
//typedef Eigen::Matrix<double,4,4,Eigen::RowMajor> Matrix4;
typedef Eigen::Matrix<double,4,4,Eigen::ColMajor> Matrix4;

inline double operator- (  const struct timeval & t1,const struct timeval & t0)
{
  /* TODO: double check the double conversion from long (on 64x). */
  return double(t1.tv_sec - t0.tv_sec)+1e-6*double(t1.tv_usec - t0.tv_usec);
}

void sumCols( Matrix4 & L,
              Vector4 & x4,
              Vector4 & y)
{
  for(int i=0;i<4;++i )
    {
      L.col(0) += x4*y[i];
    }
}

void sumVec( Vector4 & L,
             Vector4 & x4,
             Vector4 & y)
{
  for(int i=0;i<4;++i )
    {
      //L.tail(4-i)  += x4.tail(4-i)*y[i];
      L            += x4          *y[i];
    }
}

int main()
{
  using namespace Eigen;

  const int NBT = 1000000;

  struct timeval t0,t1;

  std::vector<Vector4> x4s(NBT);
  std::vector<Vector4> y4s(NBT);
  std::vector<Vector4> z4s(NBT);
  std::vector<Matrix4> L4s(NBT);

  for(int i=0;i
