vsoftco
vsoftco

Reputation: 56577

Cost of std::vector::push_back either succeeding or having no effect?

If I understand correctly, std::vector::insert does not guarantee a commit-or-rollback for std::vectors (it does for std::lists for obvious reasons) in the case an exception is thrown during copying or moving, because of the high cost of checking for exceptions. I recently saw that push_back DOES guarantee either successful insertion at the end or nothing happens.

My question is the following. Suppose that during a push_pack the vector has to be resized (reallocation happens). In that case, all elements have to be copied in the new container, via copy or move semantics. Assuming that the move constructor is not guaranteed noexcept, std::vector will then use copy semantics. So, to guarantee the above behaviour of push_pack, std::vector has to check for successful copying, and if not, roll back via a swap the initial vector. Is this what is happening, and if so, isn't this expensive? Or, because reallocation happens rarely, one can say that the amortized cost is low?

Upvotes: 14

Views: 2892

Answers (3)

Howard Hinnant
Howard Hinnant

Reputation: 219578

In C++98/03 we (obviously) had no move semantics, only copy semantics. And in C++98/03, push_back has the strong guarantee. One of the strong motivations in C++11 was to not break existing code that relied on this strong guarantee.

The C++11 rules are:

  1. If is_nothrow_move_constructible<T>::value is true, move, else
  2. If is_copy_constructible<T>::value is true, copy, else
  3. If is_move_constructible<T>::value is true, move, else
  4. The code is ill-formed.

If we are in the case of 1 or 2, we have the strong guarantee. If we are in case 3, we have only the basic guarantee. Since in C++98/03, we were always in case 2, we have backwards compatibility.

Case 2 is not expensive to maintain the strong guarantee. One allocates the new buffer, preferably using an RAII device such as a second vector. Copies into it, and only if all of that succeeds, swap *this with the RAII device. This is the cheapest way to do things, whether or not you want the strong guarantee, and you get it for free.

Case 1 is also inexpensive to maintain the strong guarantee. The best way I know of is to first copy/move the new element into the middle of the new allocation. If that succeeds, then move the elements from the old buffer and swap.

More Detail Than You Probably Want

libc++ accomplishes all 3 cases with the same algorithm. To accomplish this, two tools are used:

  1. std::move_if_noexcept
  2. A non-std vector-like container where the data is contiguous, but can start at a non-zero offset from the beginning of the allocated buffer. libc++ calls this thing split_buffer.

Assuming the reallocation case (the non-reallocation case is trivial), the split_buffer is constructed with a reference to this vector's allocator, and with twice the capacity of this vector, and with its starting position set to this->size() (though the split_buffer is still empty()).

Then the new element is copied or moved (depending on which push_back overload we are talking about) to the split_buffer. If this fails, the split_buffer destructor undoes the allocation. If it succeeds, then the split_buffer now has size() == 1, and the split_buffer has room for exactly this->size() elements prior to its first element.

Next the elements are moved/copied in reverse order from this to the split_buffer. move_if_noexcept is used for this, which has a return type of either T const& or T&& exactly as we need as specified by the 3 cases above. On each successful move/copy, the split_buffer is doing a push_front. If successful, the split_buffer now has size() == this->size()+1, and its first element is at a zero offset from the beginning of its allocated buffer. If any move/copy fails, split_buffer's destructor destructs whatever is in the split_buffer and deallocates the buffer.

Next the split_buffer and this swap their data buffers. This is a noexcept operation.

Finally the split_buffer destructs, destructing all of its elements, and deallocating its data buffer.

No try-catches needed. There is no extra expense. And everything works as specified by C++11 (and summarized above).

Upvotes: 22

milleniumbug
milleniumbug

Reputation: 15834

Reallocations are expensive, yes. That's why the cost of reallocation is amortized over the many calls of std::vector::push_back.

Most commonly it's done by allocating new block of size given by multiplying old size by some constant factor (for example 1.5 or 2) - like 1,2,4,8,16,32,...

When reallocating, std::vector allocates new block, copies the elements, destroys the old elements and deallocates the old block. In case of failure (thrown exception) we can simply abort the process of copying and start destroying the copied elements - so the cost in the worst case is 2*(n-1)+d, where d is the cost of deallocation (we copy n-1 elements, when copying the n-th element, exception is thrown, so we destroy n-1 elements, and deallocate the new block)

Note the above applies when moves are not noexcept. If moves are noexcept, the only possible point of failure is allocation of the new memory block (since destructors are required to be noexcept), and the reallocation process is simpler and faster.

You can reduce the performance impact of reallocations by using different container (like std::deque, where the issue does not exist) or by using std::vector::reserve beforehand (this requires knowing an estimate for the element count).

Upvotes: 3

nwp
nwp

Reputation: 10021

That is what is happening and it can be expensive. It was decided that the strong guarantee of push_back is more important than the performance from move semantics.

Upvotes: 3

Related Questions