Juan JuezSarmiento

Reputation: 277

Reading with vector::data()[] seems faster than vector[]

I did this small console test using a fully updated Visual Studio 2019, running it in release mode with Ctrl+F5.

    // 1. I generate a random vector
    std::vector<int> vectorVal;
    for (int i=0; i!=32; ++i)
    {
        vectorVal.push_back(i-3);
    }

    // 2. Then I perform some reads
    int uselessValue = 0;
    auto start = high_resolution_clock::now(); // I take initial time
    for (int i = 0; i < 10000; ++i)
    {
        for (int j=0; j!=32; ++j)
        {
            // uselessValue += (vectorVal.data()[j]? +1 : -1); // Faster!!!!
            uselessValue += (vectorVal[j]? +1 : -1); // Slower
        }
    }
    auto end = high_resolution_clock::now(); // and end time

    // 3. Last, I show the time it needed to perform the operations
    std::cout<<"Vector "<<uselessValue<<": \t"<<duration_cast<nanoseconds>(end-start).count()<<" nanoseconds.\n"<<std::endl;

If you execute this, the result is around 80000 nanoseconds on my computer. However, if you use the "Faster" line instead of the "Slower" one, the result is around 100 nanoseconds.

I tried to debug this to see what's happening. Both versions should do the same thing, access a plain C array, yet it seems much faster with vectorVal.data()[]. I'm quite sure I'm doing something wrong in my testing, but I can't figure out what.

Why does accessing the contents with vector::data()[] seem faster than with vector[]?

Here's the full source code:

    #include <chrono>
    #include <iostream>
    #include <vector>

    using namespace std::chrono;

    const int num = 10000;

    int main()
    {
        // First test: read through vector::data()[]
        {
            std::vector<int> vectorVal;
            for (int i = 0; i != 32; ++i)
            {
                vectorVal.push_back(i - 3);
            }
            int uselessValue = 0;
            auto start = high_resolution_clock::now();
            int* data = vectorVal.data(); // left over from testing; unused below
            for (int i = 0; i < num; ++i)
            {
                for (int j = 0; j != 32; ++j)
                {
                    uselessValue += (vectorVal.data()[j] ? +1 : -1);
                }
            }
            auto end = high_resolution_clock::now();
            std::cout << "Vector-data " << uselessValue << ": \t"
                      << duration_cast<nanoseconds>(end - start).count()
                      << " nanoseconds.\n" << std::endl;
        }

        // Second test: read through vector::operator[]
        {
            std::vector<int> vectorVal;
            for (int i = 0; i != 32; ++i)
            {
                vectorVal.push_back(i - 3);
            }
            int uselessValue = 0;
            auto start = high_resolution_clock::now();
            for (int i = 0; i < num; ++i)
            {
                for (int j = 0; j != 32; ++j)
                {
                    uselessValue += (vectorVal[j] ? +1 : -1);
                }
            }
            auto end = high_resolution_clock::now();
            std::cout << "Vector " << uselessValue << ": \t"
                      << duration_cast<nanoseconds>(end - start).count()
                      << " nanoseconds.\n" << std::endl;
        }
        return 0;
    }

I've uploaded the Visual Studio 2019 project solution here; you can download it and test it directly. Maybe something is wrong with my settings.

Upvotes: 4

Views: 173

Answers (1)

John Zwinck

Reputation: 249444

Your compiler must be running in debug mode rather than release/optimized mode. In a debug build, vector::data()[i] is faster than vector[i] because the latter will likely check i < vector::size() for safety, while a release build with optimization enabled will generate identical code for both.
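As a rough illustration (a conceptual sketch only, not MSVC's actual library code), the difference between the two access paths in a debug build looks something like this:

    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Conceptual sketch of what a debug-build operator[] may do: a bounds
    // check on every access, followed by the same raw pointer arithmetic.
    int checked_read(const std::vector<int>& v, std::size_t i)
    {
        assert(i < v.size() && "vector subscript out of range"); // extra work per access
        return v.data()[i];
    }

    // What data()[i] does in any build mode: raw pointer arithmetic only.
    int unchecked_read(const std::vector<int>& v, std::size_t i)
    {
        return v.data()[i];
    }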

Performance testing must always be done with optimized release builds, because that's the type of build you'd run whenever you care about performance in a real application.
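One way to guard against accidentally timing a debug build (assuming MSVC's default project settings, where the Debug configuration defines _DEBUG) is a compile-time check such as:

    // Refuse to compile the benchmark in a Debug configuration.
    #ifdef _DEBUG
    #error "Build this benchmark in Release (optimized) mode"
    #endif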

Upvotes: 1
