ram

Reputation: 73

Local vs member variable for a large automatically allocated array in C++?

I have an operation that will be called many times per second (possibly tens of thousands of times) and that requires a large 2D array. Each call is independent of the others. Is there a performance difference between keeping the array as a local variable vs a global variable? Does repeatedly allocating and deallocating the 2D array incur a performance cost that outweighs its advantages?

class ProcessData {
    void update(Data& data) {
        // Automatic (stack) variable: created anew on every call to update()
        std::array<std::array<int, 10000>, 10000> matrix;
    }
};

Upvotes: 0

Views: 174

Answers (2)

Global variables are generally speaking completely unnecessary, so let's not even go there - anything you can do using them, can be done by passing around a reference to a context object when new objects are constructed.
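As a rough illustration of the context-object idea, here is a minimal sketch. The names `Context` and `Processor` are hypothetical and not from the question; the only point is that shared state is handed in by reference at construction instead of living in a global.

```cpp
// Hypothetical shared state that might otherwise have been a global.
struct Context {
    int verbosity = 0;
};

// Hypothetical consumer: receives the context by reference when constructed.
class Processor {
public:
    explicit Processor(Context& ctx) : ctx_(ctx) {}

    // Reads through the reference, so later changes to the context are visible.
    int verbosity() const { return ctx_.verbosity; }

private:
    Context& ctx_;  // non-owning; the context must outlive the Processor
};
```

The caller owns the `Context` and decides its lifetime, which is what makes this easier to test and to parallelize than a global.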

Since the operations are mutually independent, you'll want to parallelize them, so you have only three choices that will perform well: a class member variable, a thread-local static variable, or an automatic variable. Assuming a 4-byte int, the array is 400MB in size (10,000 × 10,000 × 4 bytes = 400,000,000 bytes), so it simply won't work as an automatic variable - you'll usually run out of stack.
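The size arithmetic can be checked at compile time. This is a small sketch; the names `Matrix` and `kMatrixBytes` are chosen here for illustration, and the 400,000,000 figure assumes a 4-byte `int` and a `std::array` with no padding, which holds on mainstream implementations.

```cpp
#include <array>
#include <cstddef>

// Same type as in the question; only sizeof is taken, no object is created.
using Matrix = std::array<std::array<int, 10000>, 10000>;

// Total bytes one matrix occupies (400,000,000 with a 4-byte int).
constexpr std::size_t kMatrixBytes = sizeof(Matrix);
```

Default thread stacks are typically in the range of 1 to 8 MB, so a single automatic `Matrix` exceeds them by a wide margin.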

Thus:

class ProcessData {
public:
  static constexpr int N = 10000;
  using Matrix = std::array<std::array<int, N>, N>;
  void update(Data &data) {
    thread_local static Matrix matrix;
    // ...
  }
};

The downside is that depending on the C++ runtime implementation, the matrix may be allocated on startup of each and every thread, and you may not wish that to be the case when 400MB is at stake.

Thus, you might wish to allocate it only on demand:

// .h
class ProcessData {
public:
  static constexpr int N = 10000;
  using Matrix = std::array<std::array<int, N>, N>;
private:
  thread_local static std::unique_ptr<Matrix> matrix;
public:
  void update(Data &data) {
    if (!matrix) matrix.reset(new Matrix);
    //...
  }
};

// .cpp
thread_local std::unique_ptr<ProcessData::Matrix> ProcessData::matrix;

The matrix will be deallocated whenever the thread ends (e.g. a worker thread in a thread pool), but can also be deallocated explicitly: matrix.reset();
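The allocate-on-demand and explicit-release lifecycle can be sketched in a self-contained form. `ScratchOwner`, `scratch()`, `release()`, and `allocated()` are hypothetical names, and `N` is reduced to 64 so the sketch runs quickly. It also uses `std::make_unique` (C++14), which zero-initializes the matrix, unlike `new Matrix`, which leaves the elements uninitialized.

```cpp
#include <array>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for ProcessData; N is kept small for the sketch.
class ScratchOwner {
public:
    static constexpr std::size_t N = 64;
    using Matrix = std::array<std::array<int, N>, N>;

    // Returns the calling thread's matrix, allocating it on first use.
    static Matrix& scratch() {
        if (!matrix_) matrix_ = std::make_unique<Matrix>();  // zero-initialized
        return *matrix_;
    }

    // Frees the calling thread's matrix; the next scratch() call reallocates it.
    static void release() { matrix_.reset(); }

    // True if the calling thread currently holds a matrix.
    static bool allocated() { return matrix_ != nullptr; }

private:
    static thread_local std::unique_ptr<Matrix> matrix_;
};

// Out-of-class definition (would live in the .cpp file).
thread_local std::unique_ptr<ScratchOwner::Matrix> ScratchOwner::matrix_;
```

Because the pointer is `thread_local`, each thread that calls `scratch()` pays for the allocation once and then reuses the same storage on every later call.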

Upvotes: 4

LLSv2.0

Reputation: 543

First, yes: recreating an array (especially one of that size) on every call to update() will incur a significant performance cost. To remedy this, you could simply change matrix to a static local variable, so that it is not destroyed at the end of each function call. However, this means that matrix is shared by all instances of ProcessData (and by all threads that call update()).

If this is an issue, you could also simply make this a member variable of ProcessData. Then each instance will have its own matrix.
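One way the member-variable option might look is sketched below. It deliberately swaps the nested `std::array` for a flat `std::vector<int>`, so the storage lives on the heap and a `ProcessData` object itself stays small enough to be created anywhere. The `Data` struct, the constructor parameter `n`, and the `at()` helper are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's Data type.
struct Data {
    int value = 0;
};

class ProcessData {
public:
    // Allocates the n x n matrix once, when the instance is constructed.
    explicit ProcessData(std::size_t n) : n_(n), matrix_(n * n) {}

    void update(const Data& data) {
        // Reuse the existing storage; clear it only if the algorithm needs that.
        std::fill(matrix_.begin(), matrix_.end(), 0);
        at(0, 0) = data.value;  // placeholder for the real work
    }

    // Row-major access into the flat buffer.
    int& at(std::size_t row, std::size_t col) { return matrix_[row * n_ + col]; }

private:
    std::size_t n_;
    std::vector<int> matrix_;  // heap storage owned by this instance
};
```

With one `ProcessData` instance per worker thread, no storage is shared between threads and no allocation happens inside `update()`.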

Upvotes: 0
