Reputation: 119
This is primarily a C++ conceptual question. If I have a certain matrix (stored as a vector of vectors) which I have to access, whose sizes in each dimension are very different. I have a lot of steps where I loop through the larger dimension and perform operations across the smaller dimension. I'm wondering from the view point of efficiency with respect to access time and operations on this matrix, which of the following two examples would be more efficient:
Organization 1:
A=vector<vector<float>>(1000,vector<float>(10,0.0));
sumAcrossSmallerDimension=vector<float>(1000,0.0);
for(int i=0;i<1000;i++)
for(int j=0;j<10;j++)
sumAcrossSmallerDimension[i]+=A[i][j];
Organization 2:
A=vector<vector<float>>(10,vector<float>(1000,0.0));
sumAcrossSmallerDimension=vector<float>(1000,0.0);
for(int i=0;i<1000;i++)
for(int j=0;j<10;j++)
sumAcrossSmallerDimension[i]+=A[j][i];
In the second example it seems like each set A entries would get loaded faster, but in order to sum across the j Dimension, you'd be jumping in memory 10 times per i iteration to find the corresponding j entry.
In the first example, it seems like loading A would be slower but then all entries in the lower dimension are readily available to sum.
Curious about this, thank you for your help!
Upvotes: 1
Views: 90
Reputation: 69912
I think a linear address space rather than a vector of vectors will give you the best cache locality:
#include <memory>
#include <algorithm>
#include <utility>
#include <vector>
#include <numeric>
struct vv
{
vv(std::size_t rows, std::size_t columns, double init)
: _rows(rows), _columns(columns), _size(_rows * _columns)
, _pdata(std::make_unique<double[]>(_size))
{
std::fill(_pdata.get(), _pdata.get() + _size, init);
}
const double* operator[](std::size_t i) const {
return std::addressof(_pdata.get()[i * _columns]);
}
double rowSum(std::size_t i) const {
auto p = (*this)[i];
return std::accumulate(p, p + _columns, 0.0, std::plus<>());
}
std::size_t _rows, _columns, _size;
std::unique_ptr<double[]> _pdata;
};
int main()
{
vv v(1000, 10, 10.0);
auto sumAcrossSmallerDimension = std::vector<double>(1000,0.0);
for(std::size_t i = 0 ; i < 1000 ; ++i)
{
sumAcrossSmallerDimension[i] += v.rowSum(i);
}
}
Upvotes: 1