Reputation: 481
I have 3 solutions to the following problem from Project Euler.
If p is the perimeter of a right angle triangle with integral length sides, {a,b,c}, there are exactly three solutions for p = 120.
{20,48,52}, {24,45,51}, {30,40,50}
For which value of p ≤ 1000, is the number of solutions maximised?
My three solutions for each language are available below.
C++:
boost::chrono::steady_clock::time_point start_time = boost::chrono::steady_clock::now();
map<int, int> square_lookup;
for(int i=0; i<= 1500; i++) {
square_lookup[i*i] = i ;
}
auto end_time = boost::chrono::steady_clock::now();
Python2:
start = time.time()
res = range(1, 1501)
squares = {}
#square_lookups = dict(zip([x*x for x in res], res))
square_lookups = {}
for x in range(1, 1501):
square_lookups[x*x] = x
end = time.time()
Ruby:
start_time = Time.now
square_lookup = {}
(1 .. 1500).map {|x| x*x}.each_with_index do |square, root|
square_lookup[square] = root+1
end
end_time = Time.now
Timings on a quadcore i5:
> lookup gen time: 0.00141787528992
> Python Result: 840 Time:
> 0.282248973846
>
> Lookup gen time 4640960 nanoseconds
> C++: Result: 840 : Time: 695301578 nanoseconds
>
>
> Lookup gen time 0.000729416
> Ruby: Result: 840 Time: 0.149393345
The lookup gen time is the time it takes to construct a hash table with 1500 elements, with the keys being a perfect square and the values being their respective roots.
Even in this respect, C++ is still SLOWER than both Python and Ruby. I realize I may have the most efficient solution overall and for each language, but using the same types of operations still show C++ as being extremely slow.
IMPORTANT EDIT I changed map
to use unordered_map
for the C++ solution, however it is still slower!
Modified C++ file: http://pastebin.com/2YyB6Rfm
lookup gen time: 0.00134301185608
Python Result: 840 Time: 0.280808925629
Lookup gen time 2021697 nanoseconds
C++: Result: 840 : Time: 392731891 nanoseconds
Lookup gen time 0.000729313
Ruby: Result: 840 Time: 0.148183345
Upvotes: 3
Views: 331
Reputation: 490663
Your code has another serious problem--much more serious than map
vs. unordered_map
(at least IMO).
In particular, where you do:
int result = square_lookup[(i*i) + (j*j)];
if(result) {
int perimeter = i + j + result;
if(perimeter <= 1000) {
occurences[perimeter] += 1;
}
}
This code doesn't just look for the value i*i+j*j
in the existing map. Rather, if a key isn't present in the map, it inserts a node in the map with i*i+j*j
as the key, and 0
(or, more specifically, a value-initialized object of the map's value_type
, which is int
in this case) into the map.
Inserting nodes in the map for all those values you don't care about is quite slow. What you're trying to do here is really just check whether the value is already in the map. For that, you can use code like this:
auto result = square_lookup.find(i*i + j*j);
if (result!=square_lookup.end()) {
int perimeter = i + j + result->second;
if (perimeter <= 1000)
++occurences[perimeter];
}
This uses find
to find whether a key is in the map. Then if (and only if) the key is in the map, it looks up the value currently associated with that key.
This gives a substantial improvement in speed--to about 20-30 ms with either VC++ or g++.
With this change in place, the difference between map
and unordered_map
also shrinks away. Code using map
can still run in ~20-30 milliseconds. The code using unordered_map
may be just a tiny bit faster on average, but my system clock only has 10 ms granularity, so I'd really have to test with more data to say for sure.
For reference, here's the code as I ran it (note that I did some other general cleanup of the code, but nothing else should have any major effect on speed):
#include <iostream>
#include <unordered_map>
#include <chrono>
#include <iterator>
#include <algorithm>
#include <utility>
#include <map>
using namespace std;
int main() {
auto start_time = chrono::steady_clock::now();
map<int, int> square_lookup;
int ctr = 0;
generate_n(inserter(square_lookup, square_lookup.end()),
1500,
[&]() { ++ctr; return make_pair(ctr*ctr, ctr); });
auto end_time = chrono::steady_clock::now();
cout << "Lookup gen time "
<< chrono::duration_cast<chrono::milliseconds>(end_time - start_time).count() << "\n";
map<int, int> occurences;
typedef std::pair<int, int> const &map_t;
for (int i = 0; i <= 1000; i++) {
for (int j = i; j <= 1000; j++) {
auto result = square_lookup.find(i*i + j*j);
if (result != square_lookup.end()) {
int perimeter = i + j + result->second;
if (perimeter <= 1000)
++occurences[perimeter];
}
}
}
auto it = std::max_element(occurences.begin(), occurences.end(),
[](map_t a, map_t b) { return a.second < b.second; });
end_time = chrono::steady_clock::now();
cout << "C++: Result: " << it->first << " : Time: "
<< chrono::duration_cast<chrono::milliseconds>(end_time - start_time).count() << "\n";
}
Summary: in C++ the []
operator on map
will insert an item if it's not already present. That can be handy, but it's not always what you want. If you only want to retrieve a value if it's already present, it's not the right tool for the job--and .find
can be substantially faster.
Once you correct that problem, the difference between map
and unordered_map
(at least mostly) disappears.
Upvotes: 11
Reputation: 303937
You claim you're timing
The lookup gen time is the time it takes to construct a hash table with 1500 elements, with the keys being a perfect square and the values being their respective roots.
That is true for the Python and Ruby solutions, but in the C++ example you're constructing a std::map<int, int>
. That is not a hash table - it is a red black tree. Inserting and lookup is O(lg N)
, not O(1)
.
To get a fair comparison, you want to use std::unordered_map<int, int>
as your type. That is a real hash table.
Upvotes: 9