Gal Goldman
Gal Goldman

Reputation: 8869

What's faster, iterating an STL vector with vector::iterator or with at()?

In terms of performance, what would work faster? Is there a difference? Is it platform dependent?

//1. Using vector<string>::iterator:
vector<string> vs = GetVector();

for(vector<string>::iterator it = vs.begin(); it != vs.end(); ++it)
{
   *it = "Am I faster?";
}

//2. Using size_t index:
for(size_t i = 0; i < vs.size(); ++i)
{
   //One option:
   vs.at(i) = "Am I faster?";
   //Another option:
   vs[i] = "Am I faster?";
}

Upvotes: 67

Views: 88089

Answers (17)

dtech
dtech

Reputation: 49289

Instead of focusing on particular generated code, I'd like to outline the more abstract difference between the indexing methods in order to explain the actual complexity involved:

at(): 9158ms
operator[]: 4269ms
iterator: 3914ms

An array is essentially an origin in memory. Indexing an array involves adding the size of n elements to the origin pointer. We work with logical indices in this context, meaning that each index is actually scaled by the size of the data type by the compiler for us.

So, to get the an element of an array, the machine's way to do that would be origin + (index * sizeOfElement).

Most arrays nowadays are implemented as 3 pointers, start, finish and current position, so some operations like size or capacity actually involve additional operations. Some of those operations may even involve division, which may be quite slow on some platforms. This may not be as bad if the size of the element happens to be a power of two value, because in this case the compiler optimizes potentially expensive mul and div operations to efficient bit shift operations.

In light of that, it becomes understandable why at() would be the worst possible option, as it does involve a significant amount of additional operations to ensure the bounds are correct on every indexing. In the other instances, you are delegating the bounds keeping, whenever practically applicable, to an existing and coarser and thus cheaper method.

[] is significantly better, we are no longer bothering with the bounds check and its computational overhead, we are only origin + (index * sizeOfElement) blindly, trusting some established constraint to keep it within bounds.

And finally, the iterator is the best performing option for the simple reason we can now avoid the index scaling. So we are reducing the origin + (index * sizeOfElement) to a current + sizeOfElement, we are avoiding the index scaling and reducing the overhead to a single addition, which in 99.9% of the cases is a single cpu clock overhead, and the loop can be constrained to a single integer comparison and disregard item sizes and indices. 1 dynamic pointer, 1 static size, 1 clock cycle increment and 1 clock cycle loop constraint, it doesn't get any more barebone to run a loop, minimal instruction and data footprint.

The one downside of iterators is you don't have the auxiliary index you may need for some other reason, and there's a certain cost of deriving it from the iterator, but it still pays to have the iterator be your default loop, as an additional index can easily be added and updated contextually and only pay for it when you actually need it, and it is the same cost as [] indexing, only optional rather than by default.

In practice, the compiler may understand your intent to a varying degree and apply a considerable amount of optimizations. For example, the it may be able to detect that your loop is constrained by the actual size of the array, and that the array length remains constant, so it may omit the bounds checking altogether. If your loop size is a constant, it may get rolled out or even auto vectorized depending on the data types and operations and so on...

Upvotes: -1

tstenner
tstenner

Reputation: 10281

Using an iterator results in incrementing a pointer (for incrementing) and for dereferencing into dereferencing a pointer.
With an index, incrementing should be equally fast, but looking up an element involves an addition (data pointer+index) and dereferencing that pointer, but the difference should be marginal.
at() also checks if the index is within the bounds, so it could be slower.

Benchmark results for 500M iterations, vector size 10, with gcc 4.3.3 (-O3), linux 2.6.29.1 x86_64:
at(): 9158ms
operator[]: 4269ms
iterator: 3914ms

YMMV, but if using an index makes the code more readable/understandable, you should do it.

2021 update

With modern compilers, all options are practically free, but iterators are very slightly better for iterating and easier to use with range-for loops (for(auto& x: vs)).

Code:

#include <vector>

void iter(std::vector<int> &vs) {
    for(std::vector<int>::iterator it = vs.begin(); it != vs.end(); ++it)
        *it = 5;
}

void index(std::vector<int> &vs) {
    for(std::size_t i = 0; i < vs.size(); ++i)
        vs[i] = 5;
}

void at(std::vector<int> &vs) {
    for(std::size_t i = 0; i < vs.size(); ++i)
        vs.at(i) = 5;
}

The generated assembly for index() and at() is identical ([godbolt])(https://godbolt.org/z/cv6Kv4b6f), but the loop setup for iter() is three instructions shorter:

iter(std::vector<int, std::allocator<int> >&):
        mov     rax, QWORD PTR [rdi]
        mov     rdx, QWORD PTR [rdi+8]
        cmp     rax, rdx
        je      .L1
.L3:                              ; loop body
        mov     DWORD PTR [rax], 5
        add     rax, 4
        cmp     rax, rdx
        jne     .L3
.L1:
        ret
index(std::vector<int, std::allocator<int> >&):
        mov     rax, QWORD PTR [rdi]
        mov     rdx, QWORD PTR [rdi+8]
        sub     rdx, rax
        mov     rcx, rdx
        shr     rcx, 2
        je      .L6
        add     rdx, rax
.L8:                              ; loop body
        mov     DWORD PTR [rax], 5
        add     rax, 4
        cmp     rdx, rax
        jne     .L8
.L6:
        ret

Upvotes: 46

jam spandex
jam spandex

Reputation: 31

Only slightly tangential to the original question, but the fastest loop would be

for( size_t i=size() ; i-- ; ) { ... }

which would of course count down. This does give a substantial saving if you have a large number of iterations in your loop, but it contains only a small number of very fast operations.

So with the [] operator access, this might be faster than many of the examples already posted.

Upvotes: 0

user541686
user541686

Reputation: 210402

It depends.

The answer is much more subtle than the existing answers show.

at is always slower than iterators or operator[].
But for operator[] vs. iterators, it depends on:

  1. How exactly you're using operator[].

  2. Whether your particular CPU has index registers (ESI/EDI on x86).

  3. How much other code also uses the same index passed to operator[].
    (e.g., are you indexing through multiple arrays in lockstep?)

Here's why:

  1. If you do something like

    std::vector<unsigned char> a, b;
    for (size_t i = 0; i < n; ++i)
    {
        a[13 * i] = b[37 * i];
    }
    

    Then this code will likely be much slower than the iterator version, since it performs a multiplication operation at each iteration of the loop!

    Similarly, if you do something like:

    struct T { unsigned char a[37]; };
    std::vector<T> a;
    for (size_t i = 0; i < n; ++i)
    {
        a[i] = foo(i);
    }
    

    Then this will probably also be slower than the iterator version, because sizeof(T) is not a power of 2, and therefore you are (again) multiplying by 37 each time you loop!

  2. If your CPU has index registers, then your code can perform as well or even better with indices rather than with iterators, if using the index register frees up another register for use in the loop. This is not something you can tell just by looking; you'd have to profile the code and/or disassemble it.

  3. If multiple arrays can share the same index, then the code only has to increment one index instead of incrementing multiple iterators, which reduces writes to memory and thus generally increases performance. However, if you're only iterating over a single array, then an iterator may very well be faster, since it avoids the need to add an offset to an existing base pointer.

In general, you should prefer iterators to indices, and indices to pointers, until and unless you face a bottleneck that profiling shows it will be beneficial to switch, because iterators are general-purpose and already likely to be the fastest approach; they don't require the data to be randomly-addressable, which allows you to swap containers if necessary. Indices are the next preferred tool, as they still don't require direct access to the data -- they are invalidated less frequently, and you can e.g. substitute a deque for a vector without any problems. Pointers should be a last resort, and they will only prove beneficial if iterators aren't already degenerating to potiners in release mode.

Upvotes: 3

ithenoob
ithenoob

Reputation: 326

Here's a code I wrote, compiled in Code::Blocks v12.11, using the default mingw compiler. This creates a huge vector, then accesses each element by using iterators, at(), and index. Each is looped once by calling the last element by function, and once by saving the last element to temporary memory.

Timing is done using GetTickCount.

#include <iostream>
#include <windows.h>
#include <vector>
using namespace std;

int main()
{
    cout << "~~ Vector access speed test ~~" << endl << endl;
    cout << "~ Initialization ~" << endl;
    long long t;
    int a;
    vector <int> test (0);
    for (int i = 0; i < 100000000; i++)
    {
        test.push_back(i);
    }
    cout << "~ Initialization complete ~" << endl << endl;


    cout << "     iterator test: ";
    t = GetTickCount();
    for (vector<int>::iterator it = test.begin(); it < test.end(); it++)
    {
        a = *it;
    }
    cout << GetTickCount() - t << endl;



    cout << "Optimised iterator: ";
    t=GetTickCount();
    vector<int>::iterator endofv = test.end();
    for (vector<int>::iterator it = test.begin(); it < endofv; it++)
    {
        a = *it;
    }
    cout << GetTickCount() - t << endl;



    cout << "                At: ";
    t=GetTickCount();
    for (int i = 0; i < test.size(); i++)
    {
        a = test.at(i);
    }
    cout << GetTickCount() - t << endl;



    cout << "      Optimised at: ";
    t = GetTickCount();
    int endof = test.size();
    for (int i = 0; i < endof; i++)
    {
        a = test.at(i);
    }
    cout << GetTickCount() - t << endl;



    cout << "             Index: ";
    t=GetTickCount();
    for (int i = 0; i < test.size(); i++)
    {
        a = test[i];
    }
    cout << GetTickCount() - t << endl;



    cout << "   Optimised Index: ";
    t = GetTickCount();
    int endofvec = test.size();
    for (int i = 0; i < endofvec; i++)
    {
        a = test[i];
    }
    cout << GetTickCount() - t << endl;

    cin.ignore();
}

Based on this, I personally got that "optimised" versions are faster than "non-optimised" Iterators are slower than vector.at() which is slower than direct indices.

I suggest you compile and run the code for yourselves.

EDIT: This code was written back when I had less experience with C/C++. A further test case should be to use prefix increment operators instead of postfix. That should better the running time.

Upvotes: 0

adammonroe
adammonroe

Reputation: 86

It really depends on what you are doing, but if you have to keep re-declaring the iterator, Iterators become MARGINALLY SLOWER. In my tests, the fastest possible iteration would be to declare a simple * to your vectors array and Iterate through that.

for example:

Vector Iteration and pulling two functions per pass.

vector<MyTpe> avector(128);
vector<MyTpe>::iterator B=avector.begin();
vector<MyTpe>::iterator E=avector.end()-1;
for(int i=0; i<1024; ++i){
 B=avector.begin();
   while(B!=E)
   {
       float t=B->GetVal(Val1,12,Val2); float h=B->GetVal(Val1,12,Val2);
    ++B;
  }}

Vector Took 90 clicks (0.090000 seconds)

But if you did it with pointers...

for(int i=0; i<1024; ++i){
MyTpe *P=&(avector[0]);
   for(int i=0; i<avector.size(); ++i)
   {
   float t=P->GetVal(Val1,12,Val2); float h=P->GetVal(Val1,12,Val2);
   }}

Vector Took 18 clicks (0.018000 Seconds)

Which is roughly equivalent to...

MyTpe Array[128];
for(int i=0; i<1024; ++i)
{
   for(int p=0; p<128; ++p){
    float t=Array[p].GetVal(Val1, 12, Val2); float h=Array[p].GetVal(Val2,12,Val2);
    }}

Array Took 15 clicks (0.015000 seconds).

If you eliminate the call to avector.size(), the time becomes the same.

Finally, calling with [ ]

for(int i=0; i<1024; ++i){
   for(int i=0; i<avector.size(); ++i){
   float t=avector[i].GetVal(Val1,12,Val2); float h=avector[i].GetVal(Val1,12,Val2);
   }}

Vector Took 33 clicks (0.033000 seconds)

Timed with clock()

Upvotes: 3

anon
anon

Reputation:

Why not write a test and find out?

Edit: My bad - I thought I was timing the optimised version but wasn't. On my machine, compiled with g++ -O2, the iterator version is slightly slower than the operator[] version, but probably not significantly so.

#include <vector>
#include <iostream>
#include <ctime>
using namespace std;

int main() {
    const int BIG = 20000000;
    vector <int> v;
    for ( int i = 0; i < BIG; i++ ) {
        v.push_back( i );
    }

    int now = time(0);
    cout << "start" << endl;
    int n = 0;
    for(vector<int>::iterator it = v.begin(); it != v.end(); ++it) {
        n += *it;
    }

    cout << time(0) - now << endl;
    now = time(0);
    for(size_t i = 0; i < v.size(); ++i) {
        n += v[i];
    }
    cout << time(0) - now << endl;

    return n != 0;
}

Upvotes: 31

Karthik
Karthik

Reputation: 143

I found this thread now when trying to optimize my OpenGL code and wanted to share my results even though the thread is old.

Background: I have 4 vectors, sizes ranging from 6 to 12. Write happens only once at the beginning of the code and read occurs for each of the elements in the vectors every 0.1 milliseconds

The following is the stripped down version of the code used first:

for(vector<T>::iterator it = someVector.begin(); it < someVector.end(); it++)
{
    T a = *it;

    // Various other operations
}

The frame rate using this method was about 7 frames per second (fps).

However, when I changed the code to the following, the frame rate almost doubled to 15fps.

for(size_t index = 0; index < someVector.size(); ++index)
{
    T a = someVector[index];

    // Various other operations
}

Upvotes: 1

Mostaaf
Mostaaf

Reputation: 21

You can use this test code and compare results! Dio it!

#include <vector> 
#include <iostream> 
#include <ctime> 
using namespace std;; 


struct AAA{
    int n;
    string str;
};
int main() { 
    const int BIG = 5000000; 
    vector <AAA> v; 
    for ( int i = 0; i < BIG; i++ ) { 
        AAA a = {i, "aaa"};
        v.push_back( a ); 
    } 

    clock_t now;
    cout << "start" << endl; 
    int n = 0; 
    now = clock(); 
    for(vector<AAA>::iterator it = v.begin(); it != v.end(); ++it) { 
        n += it->n; 
    } 
   cout << clock() - now << endl; 

    n = 0;
    now = clock(); 
    for(size_t i = 0; i < v.size(); ++i) { 
        n += v[i].n; 
    } 
    cout << clock() - now << endl; 

    getchar();
    return n != 0; 
} 

Upvotes: 2

Tobias
Tobias

Reputation: 6507

The difference should be negligible. std::vector guarantees that its elements are laid out consecutively in memory. Therefore, most stl implementations implement iterators into std::vector as a plain pointer. With this is mind, the only difference between the two versions should be that the first one increments a pointer, and in the second increments an index which is then added to a pointer. So my guess would be the second one is maybe one extremly fast (in terms of cycles) machine instruction more.

Try and check the machine code your compiler produces.

In general, however, the advice would be to profile if it really matters. Thinking about this kind of question prematurely usually does not give you too much back. Usually, your code's hotspots will be elsewhere where you might not suspect it at first sight.

Upvotes: 0

bernhardrusch
bernhardrusch

Reputation: 11890

I think the only answer could be a test on your platform. Generally the only thing which is standardized in the STL is the type of iterators a collection offers and the complexity of algorithms.

I would say that there is no (not much of a difference) between those two versions- the only difference I could think of would be tjat the code has to iterate through the whole collection when it has to compute the length of an array (I'm not sure if the length is stored in a variable inside the vector, then the overhead wouldn't matter)

Accessing the elements with "at" should take a little bit longer than directly accessing it with [] because it checks if you are in the bounds of the vector and throws an exception if you are out of bounds (it seems [] is normally just using pointer arithmetic - so it should be faster)

Upvotes: 1

Mats Fredriksson
Mats Fredriksson

Reputation: 20101

As everyone else here is saying, do benchmarks.

Having said that, I would argue that the iterator is faster since at() does range checking as well, i.e. it throws an out_of_range exception if the index is out of bounds. That check itself propbably incurrs some overhead.

Upvotes: 6

Stephen Nutt
Stephen Nutt

Reputation: 3306

If you are using VisualStudio 2005 or 2008, to get the best performance out of the vector you'll need to define _SECURE_SCL=0

By default _SECURE_SCL is on which makes iterating over a contain significantly slower. That said leave it on in debug builds, it will make tracking down any errors much easier. One word of caution, since the macro changes the size of iterators and containers, you'll have to be consistent across all compilation units that share a stl container.

Upvotes: 1

Zorglub
Zorglub

Reputation: 2147

The first one will be faster in debug mode because index access creates iterators behind the scene, but in release mode where everything should be inlined, the difference should be negligible or null

Upvotes: 2

xtofl
xtofl

Reputation: 41509

If you don't need indexing, don't use it. The iterator concept is there for your best. Iterators are very easy to optimize, while direct access needs some extra knowledge.

Indexing is meant for direct access. The brackets and the at method do this. at will, unlike [], check for out of bounds indexing, so it will be slower.

The credo is: don't ask for what you don't need. Then the compiler won't charge you for what you don't use.

Upvotes: 18

Brian R. Bondy
Brian R. Bondy

Reputation: 347216

I would guess the first variant is faster.

But it's implementation dependent. To be sure you should profile your own code.

Why profile your own code?

Because these factors will all vary the results:

  • Which OS
  • Which compiler
  • Which implementation of STL was being used
  • Were optimizations turned on?
  • ... (other factors)

Upvotes: 5

James Hopkin
James Hopkin

Reputation: 13973

Since you're looking at efficiency, you should realise that the following variations are potentially more efficient:

//1. Using vector<string>::iterator:

vector<string> vs = GetVector();
for(vector<string>::iterator it = vs.begin(), end = vs.end(); it != end; ++it)
{
   //...
}

//2. Using size_t index:

vector<string> vs = GetVector();
for(size_t i = 0, size = vs.size(); i != size; ++i)
{
   //...
}

since the end/size function is only called once rather than every time through the loop. It's likely that the compiler will inline these functions anyway, but this way makes sure.

Upvotes: 19

Related Questions