Zebrafish
Zebrafish

Reputation: 13886

Can you prevent std::vector bounds-checking on push_back()?

I know there are ways to either do or not do bounds-checking when indexing a vector, but specifically on push_back(), even if I know the capacity of the vector is large enough (i.e., I reserved enough), and I ran a loop pushing back elements into it, I'd assume that since it's dynamically resizing it would always have to do bounds-checking (or size checking) on every push_back.

If this is the case, then I was thinking something like a fast_push() would be useful if you can know you won't exceed the capacity.

I've heard claims of some vector libraries being faster (example), but I didn't see specifically the issue of the push_back() when knowing bounds checking won't be needed. The one in the link makes claims of up to 50% speed improvements. One of them being preventing value initialisation on construction when you don't need it, and some other things.

Upvotes: 8

Views: 930

Answers (1)

VasiaPupkin
VasiaPupkin

Reputation: 19

I think you cannot do that, because bounds-checking for push_back() is more or less defined by the C++ standard:

cpp reference link

If after the operation the new size() is greater than old capacity() a reallocation takes place, in which case all iterators (including the end() iterator) and all references to the elements are invalidated. Otherwise only the end() iterator is invalidated.

C++ 17 standard draft link

Remarks: Causes reallocation if the new size is greater than the old capacity. Reallocation invalidates all the references, pointers, and iterators referring to the elements in the sequence. If no reallocation happens, all the iterators and references before the insertion point remain valid. If an exception is thrown other than by the copy constructor, move constructor, assignment operator, or move assignment operator of T or by any InputIterator operation there are no effects.

However, you can use data() method to directly access underlying array, but in this case the size of the vector will remain unchanged. At this point it would be better to use std::array or a raw C array, because such usage of std::vector defeats its purpose: after any push_back() or resize() all data which comes after size() will be lost, so basically you will have to either never change size of vector (in which case why not use a raw C array?) or manually check size of array and resize if needed.

#include <vector>
#include <iostream>
    
int main(int argc, const char** argv) {
    std::vector<int> v;
    v.reserve(5);
    v.data()[0] = 3;
    v.data()[1] = 4;
    std::cout << v.data()[0] << '\n';
    std::cout << v[1] << '\n'; // you shouldn't do that, that's undefined behaviour
}

In the case of push_back(), bounds-checking is unavoidable because you need to change value of size() to check whether its value is less than capacity() and resize if so. It most probably isn't a bottleneck, since it's just a comparison of two numbers, which is a fairly fast operation.

Here is an example of MSVC source code and disassembly:

Source code (pay attention to _Mylast != _My_data._Myend part):

auto& _My_data   = _Mypair._Myval2;
pointer& _Mylast = _My_data._Mylast;
    
if (_Mylast != _My_data._Myend) {
    return _Emplace_back_with_unused_capacity(_STD forward<_Valty>(_Val)...);
}

and part of its disassembly:

mov         dword ptr [rsp+38h],5  
mov         rdx,qword ptr [rsp+28h]  
cmp         rdx,qword ptr [rsp+30h]  
je          main+0ADh (07FF7D092124Dh) 

In case of MSVC it's just one single je instruction to compare pointer of newly added element and end(), and if you are really searching for any performance improvements you should look somewhere else).

Upvotes: -1

Related Questions