Reputation: 973
Apologies if this has been asked before; I can't find a question that fully answers what I want to know. Similar questions mention ways to do this, but don't compare the approaches.
I am writing a program in C++ to solve a PDE to steady state. I don't know how many time steps this will take, so I don't know how long my time arrays will be. The maximum time will be 100,000 s, but the time step could be as small as 0.001 s, so the arrays could be as many as 1e8 doubles in length in the worst case (not necessarily a rare case either).
What is the most efficient way to implement this in terms of memory allocated and running time?
Options I've looked at:
- std::vector and its size-increasing functionality
Are there any other options?
I'm primarily concerned with speed, but I want to know what memory considerations come into it as well.
Upvotes: 3
Views: 1310
Reputation:
As noted in the comments, if you are careful with vector, you can reserve the capacity to store the maximum input size in advance (1e8 doubles) without paging in any memory.
For this you want to avoid the fill constructor and methods like resize (which would end up accessing all the memory) and instead use reserve and push_back to fill it, touching memory only as needed. That allows most operating systems to simply page in chunks of your accessed vector at a time instead of the entire contents all at once.
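For illustration, a minimal sketch of that reserve-and-push_back pattern, assuming the worst-case step count and a placeholder convergence check (neither is from the original question):

```cpp
#include <cstddef>
#include <vector>

int main() {
    const std::size_t max_steps = 100'000'000;  // worst case: 100,000 s / 0.001 s

    std::vector<double> times;
    times.reserve(max_steps);  // capacity only; no element is written yet

    double t = 0.0;
    const double dt = 0.001;
    bool steady = false;
    while (!steady && times.size() < max_steps) {
        times.push_back(t);   // pages are touched only as elements are appended
        t += dt;
        steady = (t >= 1.0);  // placeholder for the real steady-state check
    }
}
```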
Yet I tend to avoid this solution for the most part at these kinds of input scales, for a few simple reasons. Some of them may simply be paranoia, but at least on operating systems like Windows, a debug build initializes the memory in its entirety early, so the allocated pages end up mapped to DRAM immediately on reserving capacity for such a vector. That can lead to a slight startup delay and a task manager showing 800 megabytes of memory usage for a debug build before we've done anything.
While the efficiency of a debug build should generally be a minor concern, when the potential discrepancy between release and debug is enormous, it can render production code almost impossible to debug effectively. So when the differences are potentially this vast, my preference is to "chunk it up".
The strategy I like to apply here is to allocate smaller chunks -- smaller arrays of N elements, where N might be, say, 512 doubles (just snug enough to fit a common denominator page size of 4 kilobytes -- possibly minus a couple of doubles for chunk metadata). We fill a chunk with elements, and when it gets full, we create another chunk.
With these chunks, we can aggregate them either by linking them (forming an unrolled list) or by storing pointers to them in a separate aggregate, depending on whether random access is needed or sequential access will suffice. For the random-access case this incurs a slight overhead, yet one I've tended to find relatively small at these input scales, where times are often dominated by the upper levels of the memory hierarchy rather than by register- and instruction-level work.
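As a rough sketch of what "chunking it up" could look like -- the class name, chunk size, and interface here are mine, not the answerer's:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Grows in fixed-size chunks, so only the chunks actually used are allocated and touched.
class ChunkedDoubles {
public:
    static const std::size_t kChunkSize = 512;  // 512 doubles == 4 KB, one typical page

    void push_back(double value) {
        if (chunks_.empty() || used_in_last_ == kChunkSize) {
            chunks_.emplace_back(new double[kChunkSize]);  // start a new chunk when the last is full
            used_in_last_ = 0;
        }
        chunks_.back()[used_in_last_++] = value;
    }

    std::size_t size() const {
        return chunks_.empty() ? 0 : (chunks_.size() - 1) * kChunkSize + used_in_last_;
    }

    // Random access: pick the chunk, then the offset within it.
    double operator[](std::size_t i) const {
        return chunks_[i / kChunkSize][i % kChunkSize];
    }

private:
    std::vector<std::unique_ptr<double[]>> chunks_;  // aggregate of pointers to chunks
    std::size_t used_in_last_ = 0;                   // elements used in the last chunk
};
```

The separate vector of chunk pointers gives random access; linking the chunks to each other instead would give the unrolled-list variant for purely sequential use.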
This might be overkill for your case, and careful use of vector may be the best bet. Yet if that doesn't suffice and you have similar concerns/needs to mine, this kind of chunky solution might help.
Upvotes: 1
Reputation: 8431
The only way to know which option is 'most efficient' on your machine is to try a few different options and profile. I'd probably start with the following:
- std::vector constructed with the maximum possible size.
- std::vector constructed with a conservative ballpark size, then push_back as needed.
- std::deque and push_back.

The std::vector vs std::deque debate is ongoing. In my experience, when the number of elements is unknown and not too large, std::deque is almost never faster than std::vector (even if the std::vector needs multiple reallocations), but it may end up using less memory. When the number of elements is unknown and very large, std::deque memory consumption seems to explode and std::vector is the clear winner.
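For concreteness, a minimal sketch of those three candidates side by side -- the sizes are assumptions taken from the question's worst case and are only there to have something to profile:

```cpp
#include <cstddef>
#include <deque>
#include <vector>

int main() {
    const std::size_t max_size = 100'000'000;  // worst case from the question

    // 1. vector constructed with the maximum possible size (value-initializes every element).
    std::vector<double> a(max_size);

    // 2. vector with a conservative ballpark reserved, grown with push_back.
    std::vector<double> b;
    b.reserve(10'000'000);
    b.push_back(0.0);

    // 3. deque grown with push_back (allocates fixed-size blocks internally).
    std::deque<double> c;
    c.push_back(0.0);
}
```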
If, after profiling, none of these options offers satisfactory performance, you may want to consider writing a custom allocator.
Upvotes: 0
Reputation: 11968
If you are concerned about speed, just allocate 1e8 doubles and be done with it.
In most cases vector should work just fine. Remember that the append is amortized O(1).
Unless you are running on something very weird, the OS memory allocation should take care of most fragmentation issues and of the fact that it's hard to find an 800 MB contiguous free memory block.
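As a small illustration of that amortized-O(1) growth (capacity growth factors vary by implementation; this just counts how few reallocations actually happen for 1e8 appends):

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
    std::vector<double> v;
    std::size_t reallocations = 0;
    std::size_t capacity = v.capacity();

    for (std::size_t i = 0; i < 100'000'000; ++i) {
        v.push_back(static_cast<double>(i));
        if (v.capacity() != capacity) {   // capacity grows geometrically, so this fires rarely
            ++reallocations;
            capacity = v.capacity();
        }
    }
    std::printf("reallocations: %zu\n", reallocations);  // typically a few dozen at most
}
```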
Upvotes: 3