Doonyx

Reputation: 590

Why does deleting items from a std::vector not reduce its capacity?

I am aware that when we insert items into a vector, its capacity can increase by a non-linear factor. In gcc the capacity doubles. But I wonder why, when I erase items from a vector, the capacity is not reduced. I tried to find a reason for this. It 'seems' the C++ standard does not say a word about this reduction (neither requiring nor forbidding it).

To my understanding, ideally, when the vector's size drops to 1/4 of its capacity during deletion, the capacity could be shrunk to 1/2 of its current value to achieve constant amortized complexity for space allocation/deallocation, roughly like the sketch below.
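
Something like this purely illustrative sketch (pop_and_maybe_shrink is an invented name; std::vector itself does nothing of the sort):

#include <iterator>
#include <vector>

// Hypothetical shrink-at-1/4 policy described above; this is NOT what
// std::vector does. The caller must ensure v is not empty.
template <typename T>
void pop_and_maybe_shrink(std::vector<T>& v)
{
    v.pop_back();
    if (v.capacity() >= 4 && v.size() <= v.capacity() / 4) {
        std::vector<T> tmp;
        tmp.reserve(v.capacity() / 2);              // shrink to half the old capacity
        tmp.assign(std::make_move_iterator(v.begin()),
                   std::make_move_iterator(v.end()));
        v.swap(tmp);                                // old buffer is freed when tmp dies
    }
}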

My question is: why does the C++ standard not specify any capacity-reduction policy? What were the language design goals behind not specifying anything about this? Does anyone have an idea?

Upvotes: 2

Views: 1166

Answers (3)

David K

Reputation: 3132

The algorithm for allocating additional space as the vector grows has "constant amortized complexity" due to the notion that the total complexity (which is O(N) when a vector of N elements is created by a series of push_back() operations) can be "amortized" over the N push_back() calls--that is, the total cost is divided by N.

Even more specifically, using the algorithm that allocates twice as much space each time, the worst case is that the algorithm allocates nearly 4 times as much memory as would be needed if you knew the exact size of the vector in advance. The last allocation is just slightly less than two times the size of the vector after that allocation, and the sum of all the previous allocations is slightly less than the size of the last allocation. The total number of allocations is O(log N), and the number of deallocations (up to that point) is just one less than the number of allocations.
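
For illustration, you can count the reallocations by watching capacity() change; the exact numbers are implementation-defined, since the standard does not fix the growth factor:

#include <cstddef>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();

    for (int i = 0; i < 1000000; ++i) {
        v.push_back(i);
        if (v.capacity() != last_capacity) {   // capacity changed => a reallocation happened
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    // With a doubling policy this prints roughly log2(1,000,000), i.e. about 20.
    std::cout << "reallocations: " << reallocations << '\n';
}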

For a large vector, if you know its maximum size in advance, it's more efficient to reserve that space at the beginning (one allocation rather than O(log N) allocations) before inserting any data.
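
A sketch of that reserve-first pattern (make_sequence is just an illustrative name):

#include <cstddef>
#include <vector>

std::vector<int> make_sequence(std::size_t n)
{
    std::vector<int> v;
    v.reserve(n);                              // one allocation instead of O(log N)
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(static_cast<int>(i));      // no reallocation occurs here
    return v;
}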

If you cut the capacity in half each time the size of the vector shrank to 1/4 of the currently-allocated space--that is, if you ran the allocation algorithm in reverse--you would be re-allocating (and then deallocating) nearly as much memory as the maximum capacity of the vector, in addition to deallocating the memory block with the maximum capacity. That's a performance penalty for applications that simply wanted to erase elements of the vector until they were all gone and then delete the vector.

That is, with deallocation as well as allocation, it's better to do it all at once if you can. And with deallocation you (almost) always can.

The only beneficiary of the more complicated deallocation algorithm would be an application that makes a vector, then erases at least 3/4 of it and then keeps the remaining part in memory while proceeding to grow new vectors. And even then there would be no benefit from the complicated algorithm unless the sum of the maximum capacities of the old (but still existing) vectors and the new vectors was so large that the application started to run into limitations of virtual memory.

Why penalize all algorithms that progressively erase their vectors in order to gain this advantage in this special case?

Upvotes: 0

Ben Voigt

Reputation: 283893

Even more important than the performance of moving all the elements is the effect on existing iterators and pointers to elements. The behavior of erase is:

Invalidates iterators and references at or after the point of the erase.

If reallocation occurred, then all iterators, pointers, and references would become invalid. In general, keeping iterator validity is a desirable thing.
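
A small illustration of that guarantee: a pointer to an element before the erase point stays valid, precisely because erase never reallocates:

#include <cassert>
#include <vector>

int main()
{
    std::vector<int> v{0, 1, 2, 3, 4, 5, 6, 7};

    int* before = &v[1];       // points before the erase position
    v.erase(v.begin() + 4);    // no reallocation; capacity is unchanged

    assert(*before == 1);      // still valid, guaranteed by the rule quoted above

    // By contrast, anything that may reallocate (e.g. push_back past capacity)
    // can invalidate every iterator, pointer, and reference into the vector.
}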

Upvotes: 2

Praetorian

Reputation: 109289

It 'seems' C++ standard does not say any word about this reduction (either to do or not)

This is not true, because the complexity description for vector::erase specifies exactly what operations will be performed.

From §23.3.6.5/4 [vector.modifiers]

 iterator erase(const_iterator position);
 iterator erase(const_iterator first, const_iterator last);

Complexity: The destructor of T is called the number of times equal to the number of the elements erased, but the move assignment operator of T is called the number of times equal to the number of elements in the vector after the erased elements.

This precludes implementations from reducing capacity because that would mean reallocation of storage and moving all remaining elements to the new memory.
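
You can observe this directly: after an erase the size drops but the capacity stays put (the exact capacity values are implementation-dependent):

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v(1000, 42);
    std::cout << "before: size=" << v.size()
              << " capacity=" << v.capacity() << '\n';

    v.erase(v.begin() + 100, v.end());   // destroys 900 elements

    // size shrinks, capacity stays where it was: no reallocation happened
    std::cout << "after:  size=" << v.size()
              << " capacity=" << v.capacity() << '\n';
}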


And if you're asking why the standard itself doesn't specify that implementations are allowed to reduce capacity when you erase elements, then one can only guess at the reasons:

  • It was probably considered not important enough from a performance point of view to have the vector spend time reallocating and moving elements when erasing.

  • Reducing capacity would also add an additional possibility of an exception due to a failed memory allocation.


You can attempt to reduce capacity yourself by calling vector::shrink_to_fit, but be aware that this call is non-binding, and implementations are allowed to ignore it.
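
For example (the resulting capacity is implementation-dependent, since the request is non-binding):

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v(1000, 42);
    v.erase(v.begin() + 100, v.end());

    v.shrink_to_fit();   // non-binding request: the implementation may ignore it

    // On common implementations capacity now equals size, but the standard
    // does not guarantee it.
    std::cout << "size=" << v.size()
              << " capacity=" << v.capacity() << '\n';
}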

Another possibility for reducing the capacity is to move the elements into a temporary vector and swap it back into the original:

decltype(vec)(std::make_move_iterator(vec.begin()), 
              std::make_move_iterator(vec.end())).swap(vec);

But even with the second method, there's nothing stopping an implementation from over-allocating storage.

Upvotes: 4
