Mircode

Reputation: 746

Lookup time for std::unordered_set not constant

For figuring out the best container type for my specific purpose, I was comparing lookup times of std::vector, sorted std::vector with binary search, std::set and std::unordered_set. Here are the results:

Lookup Time Comparison

I skipped the std::vector lookup for more than 10'000 elements to keep the runtime manageable. The containers are filled with all integers from 0 to n-1, and the lookup is performed for 100'000 random integers.

No big surprises for the vectors and the ordered set, but what is happening with std::unordered_set? The lookup time is supposed to be constant there... Some fluctuation is probably to be expected due to rehashing at certain thresholds and such, but the rise after 300'000 elements looks very strange to me. Does anybody have an explanation?

The code was compiled and run on Linux with GCC 11.3.0 in release mode (with -O3, I think).

EDIT:

I have done more testing:

Lookup Comparison 2

Here I call rehash(n*4) after constructing the unordered_set, so the average number of collisions and the length of the per-bucket lists should stay constant. Caching could be an explanation, but then I would not expect the lookup time to keep increasing.

EDIT2:

I did more tests, now at a higher resolution. First I tested whether 10x as many buckets (but the same number of elements) make the bump happen at 1/10th the number of elements:

10x as many buckets

But then I figured if linked lists are involved, it's probably more about the actual amount of stored data. So I did another test:

Here I have unordered_sets of arrays of ints: one with a single int per element and one with ten ints per element. Hashing and operator== only use the first array element. If the slowdown is due to caching, my expectation would again be to see the bump for the ten-int set at 1/10th the number of elements compared to the one-int set. But maybe it's not that simple... though I think it should be.

lookup int1 and ints10

EDIT3:

Now I moved the rand call out of the time measurements. The speedup is much larger than I expected (I had actually thought pre-generating the random numbers would cause many more cache misses, due to alternating lookups in the vector of random numbers and the set, but that's maybe not how caches work).

(plot: unordered_set lookup time with the rand call moved out of the timed loop)

Here is the benchmark code that ChatGPT and I wrote:

#include <iostream>
#include <vector>
#include <set>
#include <unordered_set>
#include <algorithm>
#include <chrono>
#include <cstdlib>
#include <ctime>

// Custom binary search implementation
bool binarySearch(const std::vector<int>& vec, int target) {
    int left = 0, right = static_cast<int>(vec.size()) - 1;
    while (left <= right) {
        int mid = left + (right - left) / 2;
        if (vec[mid] == target) {
            return true;
        } else if (vec[mid] < target) {
            left = mid + 1;
        } else {
            right = mid - 1;
        }
    }
    return false;
}

void testLookupTime(int n, int k) {
    std::cout << "[" << n << ", ";

    // Generate test data
    std::vector<int> testData(n);
    for (int i = 0; i < n; ++i) {
        testData[i] = i;
    }

    // Seed for random number generation
    std::srand(std::time(0));

    if(n < 1e4){
        // Test std::vector lookup time
        auto start = std::chrono::high_resolution_clock::now();
        for (int i = 0; i < k; ++i) {
            int randomNum = std::rand() % (n + 1);
            volatile std::vector<int>::iterator result = std::find(testData.begin(), testData.end(), randomNum);
        }
        auto end = std::chrono::high_resolution_clock::now();
        std::cout << std::chrono::duration_cast<std::chrono::microseconds>(end - start).count() << ", ";
    }
    else{
        std::cout << "0, ";
    }

    // Test sorted std::vector with binary search lookup time
    std::sort(testData.begin(), testData.end());
    auto start = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < k; ++i) {
        int randomNum = std::rand() % (n + 1);
        volatile bool result = binarySearch(testData, randomNum);
    }
    auto end = std::chrono::high_resolution_clock::now();
    std::cout << std::chrono::duration_cast<std::chrono::microseconds>(end - start).count() << ", ";

    // Test std::set lookup time
    std::set<int> testSet(testData.begin(), testData.end());
    start = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < k; ++i) {
        int randomNum = std::rand() % (n + 1);
        volatile std::set<int>::iterator result = testSet.find(randomNum);
    }
    end = std::chrono::high_resolution_clock::now();
    std::cout << std::chrono::duration_cast<std::chrono::microseconds>(end - start).count() << ", ";

    // Test std::unordered_set lookup time
    std::unordered_set<int> testUnorderedSet(testData.begin(), testData.end());
    start = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < k; ++i) {
        int randomNum = std::rand() % (n + 1);
        volatile std::unordered_set<int>::iterator result = testUnorderedSet.find(randomNum);
    }
    end = std::chrono::high_resolution_clock::now();
    std::cout << std::chrono::duration_cast<std::chrono::microseconds>(end - start).count() << "],\n" << std::flush;
}

int main() {
    const int kTimes = 100000;

    for(float n=5; n<=1e8; n*=1.1){
        testLookupTime(n, kTimes);
    }

    return 0;
}

Upvotes: 0

Views: 215

Answers (2)

Miles Budnek

Reputation: 30639

You're actually measuring the performance of std::rand.

Here's a stripped-down version of your test that moves the random number generation out of the timed loop; it does not show the same growth in lookup time:

#include <iostream>
#include <vector>
#include <unordered_set>
#include <chrono>
#include <cstdlib>
#include <ctime>

void testLookupTime(int n, int k) {
    std::cout << "[" << n << ", ";

    // Generate test data
    std::vector<int> testData(n);
    for (int i = 0; i < n; ++i) {
        testData[i] = i;
    }

    std::vector<int> lookups(k);
    for (int i = 0; i < k; ++i) {
        lookups[i] = std::rand() % (n + 1);
    }

    // Time an empty loop first, as a baseline to check that the
    // unordered_set::find loop below isn't getting optimized out
    auto start = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < k; ++i) {
        // no-op
    }
    auto end = std::chrono::high_resolution_clock::now();
    std::cout << std::chrono::duration_cast<std::chrono::microseconds>(end - start).count() << ", ";

    // Test std::unordered_set lookup time
    std::unordered_set<int> testUnorderedSet(testData.begin(), testData.end());
    start = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < k; ++i) {
        volatile std::unordered_set<int>::iterator result = testUnorderedSet.find(lookups[i]);
    }
    end = std::chrono::high_resolution_clock::now();
    std::cout << std::chrono::duration_cast<std::chrono::microseconds>(end - start).count() << "],\n" << std::flush;
}

int main() {
    // Seed for random number generation
    std::srand(std::time(0));    
    
    const int kTimes = 100000;

    for(float n=1e6; n<=1e7; n*=1.15){
        testLookupTime(n, kTimes);
    }

    return 0;
}

Demo

Note: I had to tweak the range of container sizes to get the program to finish within the allotted time on coliru, but it shows constant lookup times over the whole original range of inputs on my local machine.

Upvotes: 1

Alan Birtles

Reputation: 36488

https://en.cppreference.com/w/cpp/container/unordered_set/find states:

Constant on average, worst case linear in the size of the container

Finding the bucket containing an object should always be constant time (subject to CPU effects like caching etc.) but finding an object inside a bucket depends on the number of objects in that bucket.

As the container fills up and there are multiple objects in each bucket, lookup time will increase.

You could decrease the maximum load factor if you want to maintain constant or near-constant lookup performance, though this will decrease insert performance due to more frequent rehashing: https://en.cppreference.com/w/cpp/container/unordered_set/max_load_factor

Upvotes: 0
