Reputation: 7349
My understanding is that C++ best practice is to define variables with the smallest scope possible.
My understanding is that the primary reason for this is that it helps prevent accidental reuse. In addition, there is almost never a performance hit for doing so (or so I have been told). On the contrary, people seem to indicate that a compiler may actually be able to produce equivalent or better code when variables are defined locally. For example, the following two functions produce the same binaries in Godbolt:
#include <cstdio>
#include <cstdlib>

void printRand1() {
    char val;
    for (size_t i = 0; i < 100; ++i) {
        val = rand();
        putchar(val); // putchar, not puts(&val): val is a single char, not a null-terminated string
    }
}

void printRand2() {
    for (size_t i = 0; i < 100; ++i) {
        const char val = rand();
        putchar(val);
    }
}
So in this case, version 2 is clearly preferable. This reasoning I can fully agree with and understand.
What is not clear to me is whether the same logic should be applied to larger data types such as arrays or vectors. One particular thing that I find a lot in code is something like this:
#include <cstdio>
#include <cstdlib>
#include <vector>

struct Bob {
    std::vector<char> buffer;

    void bar(int N) {
        buffer.resize(N);
        for (auto& elem : buffer) {
            elem = rand();
            putchar(elem); // putchar, not puts(&elem): elem is not a null-terminated string
        }
    }
};

void bob() {
    Bob obj;
    obj.bar(100);
}
despite the fact that we could have localized the data better in this dumb example:
#include <cstdio>
#include <cstdlib>
#include <vector>

struct Bob {
    void bar(int N) {
        std::vector<char> buffer(N);
        for (auto& elem : buffer) {
            elem = rand();
            putchar(elem);
        }
    }
};

void bob() {
    Bob obj;
    obj.bar(100);
}
Note: Before you guys jump on this, I totally realize that you don't actually need a vector in this example. I am just making a stupid example so that the binary code is not too large on Godbolt.
The rationale here for NOT localizing the data (i.e., Snippet 1) is that the buffer could be some large vector, and we don't want to keep reallocating it every time we call the function.
The rationale for Snippet 2 is to localize the data better.
So what logic should I apply for this scenario? I am interested in the case where you actually need a vector (in this case you don't).
Should I follow the localization logic, or should I go by the logic that I should try to prevent repeated reallocations?
I realize that in an actual application, you would want to benchmark the performance, not the compiled size in Godbolt. But I wonder what should be my default style for this scenario (before I start profiling the code).
Upvotes: 0
Views: 260
Reputation: 131626
The main consideration in the scenario you describe is: "Is the buffer an integral part of what a Bob is? Or is it just something we use in the implementation of bar()?"
If every Bob has a sequence of contiguous chars throughout its life as a Bob, then - that should be a member variable. If you only form that sequence to run bar(), then by the "smallest relevant scope" rule, that vector should only exist as a local variable inside bar().
Now, the above is the general-case answer. Sometimes, for reasons of performance, you may end up breaking your clean and reasonable abstractions. For example: you might have a single vector allocated and associate it with a Bob for a period of time, then dissociate the buffer from your Bob but keep it in some buffer cache. But don't think about these kinds of contortions unless you have a very good reason to.
Upvotes: 2
Reputation: 754
In version 2, memory will be allocated and deallocated on each bar() call, while version 1 will reuse the already-allocated chunk. For a single call it doesn't matter; for multiple calls, version 1 would be preferred.
Upvotes: 1