Eliad

Reputation: 926

Handling large matrices in C++

I am using large matrices of doubles in C++. I need to get rows or columns from these matrices and pass them to a function. What is the fastest way I can do this?

  1. One way is to write a function that returns a copy of the desired row or column as an std::vector.
  2. Another way is to pass the whole matrix by reference and modify the function to read the desired values from it.

Are there any other options? Which one do you recommend?
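For concreteness, the two options might look something like this (a minimal sketch; the `Matrix` alias and the function names are just for illustration):

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // the current storage scheme

// Option 1: return a copy of the desired row as an std::vector.
std::vector<double> getRow(const Matrix &m, std::size_t i) {
    return m[i];  // copies the row
}

// Option 2: pass the whole matrix by (const) reference together with an
// index, and let the function read the values it needs in place.
double sumRow(const Matrix &m, std::size_t i) {
    double s = 0.0;
    for (double x : m[i]) s += x;
    return s;
}
```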

BTW, how do you recommend I store the data in the matrix class? I am using std::vector< std::vector< double > > right now.

EDIT

I should have mentioned that the matrices might have more than two dimensions, so using Boost or arma::mat here is out of the question. I am, however, using Armadillo in other parts of the library.

Upvotes: 4

Views: 5316

Answers (5)

user1084944

Reputation:

Efficient (multi)linear algebra is a surprisingly deep subject; there are no easy one-size-fits-all answers. The principal challenge is data locality: the memory hardware of your computer is optimized for accessing contiguous regions of memory, and probably cannot operate on anything other than a cache line at a time (and even if it could, the efficiency would go down).

The size of a cache line varies, but think 64 or 128 bytes.

Because of this, it is a non-trivial challenge to lay out the data in a matrix so that it can be accessed efficiently in multiple directions; even more so for higher rank tensors.

And furthermore, the best choices will probably depend heavily on exactly what you're doing with the matrix.

Your question really isn't one that can be satisfactorily answered in a Q&A format like this.

But to at least get you started on researching, here are two keyphrases that may be worth looking into:

  • block matrix
  • fast transpose algorithm
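To give a flavor of what these keyphrases are about, here is a minimal blocked (tiled) transpose sketch; the tile size `B` is an assumption you would tune to your cache line and cache sizes:

```cpp
#include <cstddef>
#include <vector>

// Transpose an n x n row-major matrix in B x B tiles, so that both the
// reads and the writes stay inside a small, cache-resident working set
// instead of striding across the whole matrix.
void blockedTranspose(const std::vector<double> &src,
                      std::vector<double> &dst, std::size_t n) {
    const std::size_t B = 32;  // tile size: a tuning assumption
    for (std::size_t ib = 0; ib < n; ib += B)
        for (std::size_t jb = 0; jb < n; jb += B)
            for (std::size_t i = ib; i < ib + B && i < n; ++i)
                for (std::size_t j = jb; j < jb + B && j < n; ++j)
                    dst[j * n + i] = src[i * n + j];
}
```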

You may well do better to use a library rather than trying to roll your own; e.g. blitz++. (disclaimer: I have not used blitz++)

Upvotes: 2

ThreeStarProgrammer57

Reputation: 3044

vector<vector<...>> will be slow to allocate, slow to free, and slow to access, because each access requires more than one dereference (not cache-friendly).

I would recommend it only if your rows (or columns) don't all have the same size (jagged arrays).

For a "normal" matrix, you could go for something like:

#include <vector>

template <class T, size_t nDim> struct tensor {
    size_t dims[nDim];    // extent of each dimension
    std::vector<T> vect;  // all elements in one contiguous buffer
};

and overload operator()(size_t i, size_t j, ...) to access elements.

operator() will have to do index calculations (you have to choose between row-major or column-major order). For nDim > 2, it becomes somewhat complicated, and it could benefit from caching some indexing computations.
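A row-major operator() could be sketched like this (it mirrors the struct above, with the last index varying fastest; the variadic form is one possible way to handle arbitrary nDim):

```cpp
#include <cstddef>
#include <vector>

template <class T, std::size_t nDim> struct tensor {
    std::size_t dims[nDim];
    std::vector<T> vect;

    // Row-major flattening: for nDim == 2 this reduces to i * dims[1] + j.
    template <class... Idx>
    T &operator()(Idx... idx) {
        std::size_t indices[nDim] = {static_cast<std::size_t>(idx)...};
        std::size_t flat = 0;
        for (std::size_t d = 0; d < nDim; ++d)
            flat = flat * dims[d] + indices[d];  // Horner-style index calculation
        return vect[flat];
    }
};
```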

To return a row or a column, you could then define sub types.

template <class T, size_t nDim> struct row /*or column*/ {
    tensor<T, nDim> &t;  // the tensor being viewed
    size_t iStart;
    size_t stride;
};

Then define an operator()(size_t i) that returns t.vect[iStart + i*stride].

stride value will depend on whether it is a row or a column, and your (row-major or column-major) ordering choice.

stride will be 1 for one of your sub type. Note that for this sub type, iterating will probably be much faster, because it will be cache-friendly. For other sub types, unfortunately, it will probably be rather slow, and there is not much you can do about it.

See other SO questions about why iterating over the rows and then the columns can have a huge performance difference compared to iterating over the columns and then the rows.
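The loop-order effect is easy to see in code: both functions below compute the same sum over a row-major matrix, but the first touches memory contiguously while the second jumps by a full row on every step (a minimal sketch):

```cpp
#include <cstddef>
#include <vector>

// Row-major n x n matrix stored in one contiguous vector.
double sumRowsFirst(const std::vector<double> &a, std::size_t n) {
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i)      // row
        for (std::size_t j = 0; j < n; ++j)  // column: unit stride, cache-friendly
            s += a[i * n + j];
    return s;
}

double sumColsFirst(const std::vector<double> &a, std::size_t n) {
    double s = 0.0;
    for (std::size_t j = 0; j < n; ++j)      // column
        for (std::size_t i = 0; i < n; ++i)  // row: stride n, cache-hostile
            s += a[i * n + j];
    return s;
}
```

Both return the same value; for large n the first is typically much faster.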

Upvotes: 1

timday

Reputation: 24892

If a variable number of dimensions above 2 is a key requirement, take a look at Boost's multidimensional array library, Boost.MultiArray. It has efficient (copy-free) "views" you can use to reference lower-dimensional "slices" of the full matrix.

The details of what's "fastest" for this sort of thing depend an awful lot on what exactly you're doing, and on how the access patterns and working-set "footprint" fit your HW's various levels of cache and memory latency. In practice it can be worth copying to a more compact representation to get more cache-coherent access, in preference to making sparse strided accesses which just waste a lot of each cache line. An alternative is a Morton-order accessing scheme, which can at least amortize the "bad axis" effects over all axes. Only your own benchmarking, on your own code and use-cases on your own HW, can really answer that though.
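As an illustration of Morton (Z-order) indexing, here is a minimal 2-D bit-interleaving sketch for 16-bit coordinates, using the classic shift-and-mask spreading trick:

```cpp
#include <cstdint>

// Spread the 16 low bits of x apart so there is a zero bit between each
// pair (shift-and-mask bit interleaving).
static std::uint32_t spreadBits(std::uint32_t x) {
    x &= 0xFFFF;
    x = (x | (x << 8)) & 0x00FF00FF;
    x = (x | (x << 4)) & 0x0F0F0F0F;
    x = (x | (x << 2)) & 0x33333333;
    x = (x | (x << 1)) & 0x55555555;
    return x;
}

// Morton index: the bits of i and j interleaved, so that (i, j) and its
// 2-D neighbours tend to land near each other in 1-D memory.
std::uint32_t mortonIndex(std::uint32_t i, std::uint32_t j) {
    return (spreadBits(i) << 1) | spreadBits(j);
}
```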

(Note that I wouldn't use Boost.MultiArray for 2 dimensional arrays - there are faster, better options for linear algebra / image processing applications - but for 3+ it's worth considering.)

Upvotes: 3

Yetti99

Reputation: 328

I would use a library like http://arma.sourceforge.net/, because not only do you get a way to store the matrix, you also get functions that can do operations on it.

Upvotes: 2

Andreas DM

Reputation: 10998

I recommend you pass it by reference, as copying might be slow depending on the size. std::vector is fine if you want the ability to expand and contract the container.

Upvotes: 0
