Ruggero Turra
Ruggero Turra

Reputation: 17740

Performance of containers inside loops

Do I have to care about code like this:

for (int i=0; i < BIG; ++i) {
  std::vector<int> v (10);
  ... do something with v ...
}

refactoring as:

std::vector<int> v;
for (int i=0; i < BIG; ++i) {
  v.clear(); v.resize(10); // or fill with zeros
  ... do something with v ...
}

or the compiler is smart enough to optimize memory (de)allocation?

I prefer the first one, since the std::vector<int> v is out of scope when I don't need it anymore.

Upvotes: 0

Views: 71

Answers (1)

Jerry Coffin
Jerry Coffin

Reputation: 490713

In practice it is difficult for a compiler to generate identical code for the two, unless the author of the standard library gives it some help.

In particular, vector is typically implemented vaguely like this:

template <class T, class Allocator=std::allocator<T>>
class vector {
    T *data;
    size_t allocated_size, current_size;
public:
    vector(size_t);
    // ...
};

I'm simplifying a lot, but that's enough to demonstrate the main point I'm trying to make here: the vector object itself does not contain the actual data. The vector object only contains a pointer to the data along with some metadata about allocation size and such.

That means each time you create a vector of 10 ints, vector's constructor has to use operator new to allocate space for (at least) 10 ints (well, technically, it uses the Allocator type that's passed to it to do that, but by default, that'll use the free store). Likewise, when it goes out of scope, it uses the allocator (again, defaulting to the free store) to destroy the 10 ints.

One obvious way to avoid that allocation would be to allocate space for at least a small number of elements inside the vector object itself, and only allocate space on the free store when/if the data grew larger than that space allowed. That way, creating your vector of 10 ints would be equivalent to just creating an array of 10 ints (either native array or std::array)--on a typical machine, it would be allocated on the stack when execution enters the enclosing function, and all that would happen on entry to the block would be initializing the contents (if you try to read some of it before writing to it).

At least in the general case, we can't do that though. For example, if I move assign a vector, that move assignment can't throw an exception--even if move assignment of the individual elements would throw an exception. As such, it can't do any operations on the individual elements. With a structure like above, that requirement is easy to meet--we basically do a shallow copy from the source to the destination, and zero out all the items in the source.

There is, however, a container in the standard library that does allow that optimization: std::basic_string specifically allows it. It might initially look a bit strange (and honestly, it is a bit strange), but if you replaced you std::vector with a std::basic_string<int> v(10, 0);, and used it on an implementation that includes the short-string optimization (e.g., VC++) you might get a substantial improvement in speed. One of the ways std::string is able to allow this is that you can't use it to store types that throw exceptions though--if int is just an example, and you might really need to store other types that can throw, then basic_string probably won't work for you. Even for native types like int, char_traits<T> may be an incomplete type, so this may not work anyway. If you decide you need to badly enough, you can use it as a container for your own types, by 1) ensuring they don't throw, and 2) specializing char_traits for your type. Bottom line: it can be interesting to experiment with this, but it's rarely (if ever) practical, and almost impossible to recommend.

The obvious alternative would be to use an std::array<int, 10> instead. If the size of the array is fixed, this is probably the preferred choice. Unlike instantiating basic_string over a non-character type, you'd be using this as intended, and getting its intended behavior. The weakness is that the size is a compile-time constant, so if you might ever need to change size at run-time, it's not an option at all.

Upvotes: 2

Related Questions