Reputation: 2779
Here is my code. My unordered_map and map are behaving the same and taking the same amount of time to execute. Am I missing something about these data structures?
Update: I've changed my code based on the answers and comments below. I've removed the string operations to reduce their impact on the profile, and I'm now only measuring find(), which takes almost 40% of the CPU time in my code. The profile shows that unordered_map is 3 times faster; however, is there any other way to make this code faster?
#include <map>
#include <unordered_map>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

struct Property {
    int a;
};

int main() {
    printf("Performance Summary:\n");
    static const unsigned long num_iter = 999999;

    std::unordered_map<int, Property> myumap;
    for (int i = 0; i < 10000; i++) {
        int ind = rand() % 1000;
        Property p;
        p.a = i;
        myumap.insert(std::pair<int, Property>(ind, p));
    }

    clock_t tStart = clock();
    for (int i = 0; i < num_iter; i++) {
        int ind = rand() % 1000;
        std::unordered_map<int, Property>::iterator itr = myumap.find(ind);
    }
    printf("Time taken unordered_map: %.2fs\n", (double)(clock() - tStart) / CLOCKS_PER_SEC);

    std::map<int, Property> mymap;
    for (int i = 0; i < 10000; i++) {
        int ind = rand() % 1000;
        Property p;
        p.a = i;
        mymap.insert(std::pair<int, Property>(ind, p));
    }

    tStart = clock();
    for (int i = 0; i < num_iter; i++) {
        int ind = rand() % 1000;
        std::map<int, Property>::iterator itr = mymap.find(ind);
    }
    printf("Time taken map: %.2fs\n", (double)(clock() - tStart) / CLOCKS_PER_SEC);
}
The output is:
Performance Summary:
Time taken unordered_map: 0.12s
Time taken map: 0.36s
Upvotes: 3
Views: 1165
Reputation: 69912
It's not just the lookup that's faster with an unordered_map. This slightly modified test also compares the fill times.
I have made a couple of modifications:
- The random keys are generated once, up front, so both containers are filled from identical data and rand() stays out of the timed sections.
- Each container is filled with a single range insert, using a small iterator adapter (make_property) that turns the key vector's iterators into iterators over std::pair<const int, Property>.
- The timing boilerplate is wrapped in a time() lambda, and the iteration count is larger.
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iterator>
#include <map>
#include <unordered_map>
#include <vector>
#include <stdio.h>

struct Property {
    int a;
};

struct make_property : std::vector<int>::const_iterator
{
    using base_class = std::vector<int>::const_iterator;
    using value_type = std::pair<const base_class::value_type, Property>;
    using base_class::base_class;

    decltype(auto) get() const {
        return base_class::operator*();
    }

    value_type operator*() const
    {
        return std::pair<const int, Property>(get(), Property());
    }
};

int main() {
    printf("Performance Summary:\n");
    static const unsigned long num_iter = 9999999;

    std::vector<int> keys;
    keys.reserve(num_iter);
    std::generate_n(std::back_inserter(keys), num_iter, [](){ return rand() / 10000; });

    auto time = [](const char* message, auto&& func)
    {
        clock_t tStart = clock();
        func();
        clock_t tEnd = clock();
        printf("%s: %.2gs\n", message, double(tEnd - tStart) / CLOCKS_PER_SEC);
    };

    std::unordered_map<int, Property> myumap;
    time("fill unordered map", [&]
    {
        myumap.insert(make_property(keys.cbegin()),
                      make_property(keys.cend()));
    });

    std::map<int, Property> mymap;
    time("fill ordered map", [&]
    {
        mymap.insert(make_property(keys.cbegin()),
                     make_property(keys.cend()));
    });

    time("find in unordered map", [&]
    {
        for (auto k : keys) { myumap.find(k); }
    });

    time("find in ordered map", [&]
    {
        for (auto k : keys) { mymap.find(k); }
    });
}
example output:
Performance Summary:
fill unordered map: 3.5s
fill ordered map: 7.1s
find in unordered map: 1.7s
find in ordered map: 5s
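If the iterator adapter feels heavyweight, here is a minimal alternative sketch of my own (not part of the answer above, and the names are illustrative): materialise the key/value pairs once and range-insert from that buffer. It gives the same single-call fill, at the cost of an extra temporary vector.

#include <cstdlib>
#include <unordered_map>
#include <utility>
#include <vector>

struct Property { int a; };

int main() {
    const std::size_t n = 1000000;
    std::vector<std::pair<int, Property>> pairs;
    pairs.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        pairs.push_back({std::rand() / 10000, Property()});  // same key scheme as above

    std::unordered_map<int, Property> myumap;
    myumap.insert(pairs.begin(), pairs.end());               // single range insert
}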
Upvotes: 1
Reputation: 85521
Whether a tree (std::map) or a hash map (std::unordered_map) is faster really depends on the number of entries and the characteristics of the key (the variability of the values, the compare and hashing functions, etc.).
But in theory, a tree is slower than a hash map because insertion and searching in a binary tree are O(log2(N)), while insertion and searching in a hash map are roughly O(1).
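To make the O(log2(N)) part visible, here is a minimal sketch of my own (not from the original answer; CountingLess and comparisons are illustrative names): a counting comparator shows roughly log2(N) key comparisons per std::map::find.

#include <cstdio>
#include <map>

static long comparisons = 0;

struct CountingLess {
    bool operator()(int a, int b) const { ++comparisons; return a < b; }
};

int main() {
    std::map<int, int, CountingLess> m;
    for (int i = 0; i < 100000; ++i) m[i] = i;      // fill with 100000 entries
    comparisons = 0;
    m.find(73421);                                  // a single lookup
    // prints roughly log2(100000) ~ 17-18 comparisons
    std::printf("comparisons per find: %ld\n", comparisons);
}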
Your test didn't show it because:
- You call rand() in a loop. That takes ages in comparison with the map insertion, and it generates different values for the two maps you're testing, skewing the results even further. Use a lighter-weight generator, e.g. a minstd LCG.
- You need a higher-resolution clock and more iterations, so that each test run takes at least a few hundred milliseconds.
- You need to make sure the compiler does not reorder your code, so that the timing calls happen where they should. This is not always easy. A memory fence around the timed test usually helps to solve this.
- Your find() calls have a high probability of being optimized away, since you're not using their value (I just happen to know that at least GCC in -O2 mode doesn't do that, so I leave it as is). See the sketch after this list for one way to keep them alive.
- String concatenation is also very slow in comparison.
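One way to keep the find() calls from being optimized away is a minimal sketch of my own (assuming a plain accumulator is enough for your compiler; `sink` is an illustrative name): sum the found values and print the sum, so the results stay observable.

#include <cstdio>
#include <map>

int main() {
    std::map<int, int> m{{1, 10}, {2, 20}, {3, 30}};
    long long sink = 0;                              // result accumulator
    for (int i = 0; i < 1000000; ++i) {
        auto it = m.find(i % 4);                     // key 0 always misses
        if (it != m.end()) sink += it->second;       // use the result of find()
    }
    std::printf("sink = %lld\n", sink);              // printing makes the value observable
}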
Here's my updated version:
#include <atomic>
#include <chrono>
#include <cstdio>
#include <iostream>
#include <map>
#include <random>
#include <string>
#include <unordered_map>

using namespace std;
using namespace std::chrono;

struct Property {
    string fileName;
};

const int nIter = 1000000;

template<typename MAP_TYPE>
long testMap() {
    std::minstd_rand rnd(12345);
    std::uniform_int_distribution<int> testDist(0, 1000);

    auto tm1 = high_resolution_clock::now();
    atomic_thread_fence(memory_order_seq_cst);

    MAP_TYPE mymap;
    for (int i = 0; i < nIter; i++) {
        int ind = testDist(rnd);
        Property p;
        p.fileName = "hello" + to_string(i) + "world!";
        mymap.insert(pair<int, Property>(ind, p));
    }

    atomic_thread_fence(memory_order_seq_cst);

    for (int i = 0; i < nIter; i++) {
        int ind = testDist(rnd);
        mymap.find(ind);
    }

    atomic_thread_fence(memory_order_seq_cst);
    auto tm2 = high_resolution_clock::now();

    return (long)duration_cast<milliseconds>(tm2 - tm1).count();
}

int main()
{
    printf("Performance Summary:\n");
    printf("Time taken unordered_map: %ldms\n", testMap<unordered_map<int, Property>>());
    printf("Time taken map: %ldms\n", testMap<map<int, Property>>());
}
Compiled with -O2, it gives the following results:
Performance Summary:
Time taken unordered_map: 348ms
Time taken map: 450ms
So using unordered_map in this particular case is faster by ~20-25%.
Upvotes: 2
Reputation: 8603
There is a reason we like getting minimal, complete and verifiable examples. Here's my code:
#include <map>
#include <unordered_map>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

struct Property {
    int a;
};

static const unsigned long num_iter = 100000;

int main() {
    printf("Performance Summary:\n");
    clock_t tStart = clock();

    std::unordered_map<int, Property> myumap;
    for (int i = 0; i < num_iter; i++) {
        int ind = rand() % 1000;
        Property p;
        //p.fileName = "hello" + to_string(i) + "world!";
        p.a = i;
        myumap.insert(std::pair<int, Property> (ind, p));
    }
    for (int i = 0; i < num_iter; i++) {
        int ind = rand() % 1000;
        myumap.find(ind);
    }
    printf("Time taken unordered_map: %.2fs\n", (double)(clock() - tStart)/CLOCKS_PER_SEC);

    tStart = clock();
    std::map<int, Property> mymap;
    for (int i = 0; i < num_iter; i++) {
        int ind = rand() % 1000;
        Property p;
        //p.fileName = "hello" + to_string(i) + "world!";
        p.a = i;
        mymap.insert(std::pair<int, Property> (ind, p));
    }
    for (int i = 0; i < num_iter; i++) {
        int ind = rand() % 1000;
        mymap.find(ind);
    }
    printf("Time taken map: %.2fs\n", (double)(clock() - tStart)/CLOCKS_PER_SEC);
}
Run time is:
Performance Summary:
Time taken unordered_map: 0.04s
Time taken map: 0.07s
Please note that I am running 10 times the number of iterations you were running.
I suspect there are two problems with your version. The first is that you are running too few iterations for the difference to show. The second is that you are doing expensive string operations inside the timed loop. The time spent on the string operations is greater than the time saved by using the unordered map, which is why you don't see the difference in performance.
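As a minimal sketch of that second point (my own illustration, not this answer's code; the names are hypothetical): build the expensive strings once, outside the timed region, so only the container work is measured.

#include <stdio.h>
#include <time.h>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct Property { std::string fileName; };

int main() {
    const int n = 100000;
    std::vector<std::pair<int, Property>> input;
    input.reserve(n);
    for (int i = 0; i < n; ++i)                      // string work happens here, untimed
        input.push_back({i % 1000, Property{"hello" + std::to_string(i) + "world!"}});

    std::unordered_map<int, Property> myumap;
    clock_t tStart = clock();
    for (const auto& kv : input) myumap.insert(kv);  // only the inserts are timed
    printf("Time taken to insert: %.2fs\n", (double)(clock() - tStart) / CLOCKS_PER_SEC);
}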
Upvotes: 5
Reputation: 3308
Without going into your code, I would make a few general comments.
Upvotes: 6