Reputation: 21514
I was using a map
in some code to store ordered data. I found out that for huge maps, destruction could take a while. In this code I had, replacing map
by vector<pair>
reduced processing time by 10000...
Finally, I was so surprised that I decided to compare map
performances with sorted vector
or pair
.
And I'm surprised because I could not find a situation where map
was faster than a sorted vector
of pair
(filled randomly and later sorted)...there must be some situations where map
is faster....else what's the point in providing this class?
Here is what I tested:
Test one, compare map
filling and destroying vs vector
filling, sorting (because I want a sorted container) and destroying:
#include <iostream>
#include <time.h>
#include <cstdlib>
#include <map>
#include <vector>
#include <algorithm>
int main(void)
{
clock_t tStart = clock();
{
std::map<float,int> myMap;
for ( int i = 0; i != 10000000; ++i )
{
myMap[ ((float)std::rand()) / RAND_MAX ] = i;
}
}
std::cout << "Time taken by map: " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
tStart = clock();
{
std::vector< std::pair<float,int> > myVect;
for ( int i = 0; i != 10000000; ++i )
{
myVect.push_back( std::make_pair( ((float)std::rand()) / RAND_MAX, i ) );
}
// sort the vector, as we want a sorted container:
std::sort( myVect.begin(), myVect.end() );
}
std::cout << "Time taken by vect: " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
return 0;
}
Compiled with g++ main.cpp -O3 -o main
and got:
Time taken by map: 21.7142
Time taken by vect: 7.94725
map
's 3 times slower...
Then, I said, "OK, vector is faster to fill and sort, but search will be faster with the map"....so I tested:
#include <iostream>
#include <time.h>
#include <cstdlib>
#include <map>
#include <vector>
#include <algorithm>
int main(void)
{
clock_t tStart = clock();
{
std::map<float,int> myMap;
float middle = 0;
float last;
for ( int i = 0; i != 10000000; ++i )
{
last = ((float)std::rand()) / RAND_MAX;
myMap[ last ] = i;
if ( i == 5000000 )
middle = last; // element we will later search
}
std::cout << "Map created after " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
float sum = 0;
for ( int i = 0; i != 10; ++i )
sum += myMap[ last ]; // search it
std::cout << "Sum is " << sum << std::endl;
}
std::cout << "Time taken by map: " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
tStart = clock();
{
std::vector< std::pair<float,int> > myVect;
std::pair<float,int> middle;
std::pair<float,int> last;
for ( int i = 0; i != 10000000; ++i )
{
last = std::make_pair( ((float)std::rand()) / RAND_MAX, i );
myVect.push_back( last );
if ( i == 5000000 )
middle = last; // element we will later search
}
std::sort( myVect.begin(), myVect.end() );
std::cout << "Vector created after " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
float sum = 0;
for ( int i = 0; i != 10; ++i )
sum += (std::find( myVect.begin(), myVect.end(), last ))->second; // search it
std::cout << "Sum is " << sum << std::endl;
}
std::cout << "Time taken by vect: " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
return 0;
}
Compiled with g++ main.cpp -O3 -o main
and got:
Map created after 19.5357
Sum is 1e+08
Time taken by map: 21.41
Vector created after 7.96388
Sum is 1e+08
Time taken by vect: 8.31741
Even search is apparently faster with the vector
(10 searchs with the map
took almost 2sec and it took only half a second with the vector
)....
So:
map
simply a class to avoid or is there really situations where map
offers good performances?Upvotes: 5
Views: 1350
Reputation: 308186
Generally a map
will be better when you're doing a lot of insertions and deletions interspersed with your lookups. If you build the data structure once and then only do lookups, a sorted vector
will almost certainly be faster, if only because of processor cache effects. Since insertions and deletions at arbitrary locations in a vector are O(n) instead of O(log n), there will come a point where those will become the limiting factor.
Upvotes: 6
Reputation: 171178
std::find
has linear time complexity whereas a map
search has log N complexity.
When you find that one algorithm is 100000x faster than the other you should get suspicious! Your benchmark is invalid.
You need to compare realistic variants. Probably, you meant to compare map with a binary search. Run each of those variants for at least 1 second of CPU time so that you can realistically compare the results.
When a benchmark returns "0.00001 seconds" time spent you are well in the clock inaccuracy noise. This number means nothing.
Upvotes: 0