I which situation will std::map be faster than sorted std::vector?

Question

I was using a map in some code to store ordered data. I found out that for huge maps, destruction could take a while. In this code I had, replacing map by vector reduced processing time by 10000...

Finally, I was so surprised that I decided to compare map performances with sorted vector or pair.

And I'm surprised because I could not find a situation where map was faster than a sorted vector of pair (filled randomly and later sorted)...there must be some situations where map is faster....else what's the point in providing this class?

Here is what I tested:

Test one, compare map filling and destroying vs vector filling, sorting (because I want a sorted container) and destroying:

#include 
#include 
#include 
#include 
#include 
#include 

int main(void)
{

    clock_t tStart = clock();

    {
        std::map myMap;
        for ( int i = 0; i != 10000000; ++i )
        {
            myMap[ ((float)std::rand()) / RAND_MAX ] = i;
        }
    }

    std::cout << "Time taken by map: " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;

    tStart = clock();

    {
        std::vector< std::pair > myVect;
        for ( int i = 0; i != 10000000; ++i )
        {
            myVect.push_back( std::make_pair( ((float)std::rand()) / RAND_MAX, i ) );
        }

        // sort the vector, as we want a sorted container:
        std::sort( myVect.begin(), myVect.end() );
    }

    std::cout << "Time taken by vect: " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;

    return 0;
}

Compiled with g++ main.cpp -O3 -o main and got:

Time taken by map: 21.7142
Time taken by vect: 7.94725

map's 3 times slower...

Then, I said, "OK, vector is faster to fill and sort, but search will be faster with the map"....so I tested:

#include 
#include 
#include 
#include 
#include 
#include 

int main(void)
{
    clock_t tStart = clock();

    {
        std::map myMap;
        float middle = 0;
        float last;
        for ( int i = 0; i != 10000000; ++i )
        {
            last = ((float)std::rand()) / RAND_MAX;
            myMap[ last ] = i;
            if ( i == 5000000 )
                middle = last; // element we will later search
        }

        std::cout << "Map created after " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;

        float sum = 0;
        for ( int i = 0; i != 10; ++i )
            sum += myMap[ last ]; // search it

        std::cout << "Sum is " << sum << std::endl;
    }

    std::cout << "Time taken by map: " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;

    tStart = clock();

    {
        std::vector< std::pair > myVect;
        std::pair middle;
        std::pair last;
        for ( int i = 0; i != 10000000; ++i )
        {
            last = std::make_pair( ((float)std::rand()) / RAND_MAX, i );
            myVect.push_back( last );
            if ( i == 5000000 )
                middle = last; // element we will later search
        }

        std::sort( myVect.begin(), myVect.end() );

        std::cout << "Vector created after " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;

        float sum = 0;
        for ( int i = 0; i != 10; ++i )
            sum += (std::find( myVect.begin(), myVect.end(), last ))->second; // search it

        std::cout << "Sum is " << sum << std::endl;
    }

    std::cout << "Time taken by vect: " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;

    return 0;
}

Compiled with g++ main.cpp -O3 -o main and got:

Map created after 19.5357
Sum is 1e+08
Time taken by map: 21.41
Vector created after 7.96388
Sum is 1e+08
Time taken by vect: 8.31741

Even search is apparently faster with the vector (10 searchs with the map took almost 2sec and it took only half a second with the vector)....

So:

Did I miss something?
Is my tests not correct/accurate?
Is map simply a class to avoid or is there really situations where map offers good performances?

Mark Ransom · Accepted Answer

Generally a map will be better when you're doing a lot of insertions and deletions interspersed with your lookups. If you build the data structure once and then only do lookups, a sorted vector will almost certainly be faster, if only because of processor cache effects. Since insertions and deletions at arbitrary locations in a vector are O(n) instead of O(log n), there will come a point where those will become the limiting factor.

I which situation will std::map<A,B> be faster than sorted std::vector<std::pair<A,B>>?

Answers (2)

Related Questions

I which situation will std::map&lt;A,B&gt; be faster than sorted std::vector&lt;std::pair&lt;A,B&gt;&gt;?

Answers (2)

Related Questions

I which situation will std::map<A,B> be faster than sorted std::vector<std::pair<A,B>>?