Reputation: 3274
I'm hesitating over how to organize the memory layout of my 2D data. Basically, what I want is an N*M 2D double array, where N and M are in the thousands (and are derived from user-supplied data).
The way I see it, I have two choices:
double *data = new double[N*M];
or
double **data = new double*[N];
for (size_t i = 0; i < N; ++i)
data[i] = new double[M];
The first choice is the one I'm leaning toward. The main advantages I see are shorter new/delete syntax, a contiguous memory layout (which means adjacent memory accesses at runtime if I arrange my accesses correctly), and possibly better performance for vectorized code (auto-vectorized, or using vector libraries such as vDSP or vecLib).
On the other hand, it seems to me that allocating one big chunk of contiguous memory could fail or take more time than allocating a bunch of smaller ones. And the second method has the advantage of the shorter indexing syntax data[i][j] compared to data[i*M+j].
What would be the most common / better way to do this, mainly from a performance standpoint? (Even though the improvements are going to be small, I'm curious to see which would perform better.)
Upvotes: 1
Views: 264
Reputation: 70412
If N and M are constants, it is better to just statically declare the memory you need as a two-dimensional array. Or, you could use std::array.
std::array<std::array<double, M>, N> data;
If only M is a constant, you could use a std::vector of std::array instead.
std::vector<std::array<double, M>> data(N);
If M is not constant, you need to perform some dynamic allocation. But std::vector can be used to manage that memory for you, so you can create a simple wrapper around it. The wrapper below returns a row intermediate object to allow the second [] operator to actually compute the offset into the vector.
#include <cstddef>
#include <iostream>
#include <vector>

template <typename T>
class matrix {
    const size_t N;
    const size_t M;
    std::vector<T> v_;
    // proxy for one row; its operator [] computes the offset into the vector
    struct row {
        matrix &m_;
        const size_t r_;
        row (matrix &m, size_t r) : m_(m), r_(r) {}
        T & operator [] (size_t c) { return m_.v_[r_ * m_.M + c]; }
    };
    // const counterpart, so a const matrix can be read without binding
    // a reference to a temporary proxy
    struct const_row {
        const matrix &m_;
        const size_t r_;
        const_row (const matrix &m, size_t r) : m_(m), r_(r) {}
        T operator [] (size_t c) const { return m_.v_[r_ * m_.M + c]; }
    };
public:
    matrix (size_t n, size_t m) : N(n), M(m), v_(N*M) {}
    row operator [] (size_t r) { return row(*this, r); }
    const_row operator [] (size_t r) const { return const_row(*this, r); }
};
int main () {
    matrix<double> data(10, 20);
    data[1][2] = .5;
    std::cout << data[1][2] << '\n';  // prints 0.5
}
In addressing your particular concern about performance: your rationale for wanting a single allocation is correct. However, you should avoid doing the new and delete yourself (which is something this wrapper takes care of), and if the data is more naturally interpreted as multi-dimensional, then showing that in the code will make it easier to read as well.
Multiple allocations, as in your second technique, are inferior because they take more time. Their advantage is that they may succeed more often if your memory is fragmented (the free memory consists of smaller holes, and no single free chunk is large enough to satisfy the one big allocation request). But multiple allocations have another downside: extra memory is needed for the pointers to each row (for N rows, an extra N * sizeof(double*) bytes, about 80 KB when N is 10000 on a 64-bit system).
My suggestion provides the single-allocation technique without needing to explicitly call new and delete, as the memory is managed by vector. At the same time, it allows the data to be addressed with the 2-dimensional syntax [x][y]. So it provides all the benefits of a single allocation with all the benefits of multiple allocations, provided you have enough memory to fulfill the allocation request.
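One more practical benefit of the single allocation: the whole storage is one flat range, so standard algorithms (and vectorized libraries such as the vDSP routines mentioned in the question) can sweep over all N*M elements in a single pass. A minimal self-contained sketch, using a plain std::vector as the backing store and illustrative sizes of my own choosing:
#include <algorithm>
#include <cstddef>
#include <vector>

int main () {
    const size_t N = 1000, M = 2000;   // illustrative sizes, not from the question
    std::vector<double> v(N * M);      // one contiguous allocation
    // a single flat pass over all N*M elements, easy to auto-vectorize
    std::transform(v.begin(), v.end(), v.begin(),
                   [] (double x) { return x * 2.0; });
}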
Upvotes: 2
Reputation: 3069
Consider using something like the following:
// array of N pointers to doubles, one per row
double ** data = new double*[N];
// allocate the whole N*M block in one shot and anchor it at the first row
data[0] = new double[N * M];
// point the remaining row pointers into that block, M doubles apart
for (size_t i = 1; i < N; i++)
    data[i] = data[0] + i * M;
I'm not sure whether this is a general practice; I just came up with it. Some downsides still apply to this approach, but I think it eliminates most of them, while keeping the ability to access the individual doubles as data[i][j] and all. Cleanup is shown in the sketch below.
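Since only two new[] calls were made, cleanup needs only two matching delete[] calls; a minimal sketch:
delete[] data[0];  // releases the single N*M block of doubles
delete[] data;     // releases the array of row pointers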
Upvotes: 0
Reputation: 75565
Between the first two choices, for reasonable values of M and N, I would almost certainly go with choice 1. You skip a pointer dereference, and you get nice caching if you access the data in the right order.
In terms of your concerns about size, we can do some back-of-the-envelope calculations.
Since M and N are in the thousands, suppose each is 10000 as an upper bound. Then your total memory consumed is
10000 * 10000 * sizeof(double) = 8 * 10^8 bytes
This is roughly 800 MB, which, while large, is quite reasonable given the amount of memory in modern machines.
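The same check is one line of code for whatever bounds apply to your data; a minimal sketch using the 10000 upper bounds assumed above:
#include <cstddef>
#include <cstdio>

int main () {
    const size_t n = 10000, m = 10000;  // upper bounds assumed above
    // 10^8 elements * 8 bytes per double = 8 * 10^8 bytes, about 800 MB
    std::printf("%zu bytes\n", n * m * sizeof(double));
}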
Upvotes: 2