Reputation: 9632
Say you have an array of 2D points and are looking to find the min/max bounding box.
Which would be faster, the manual approach:
float min_x = std::numeric_limits<float>::max(), min_y = min_x;
float max_x = std::numeric_limits<float>::lowest(), max_y = max_x;
for (auto p : points) {
    max_x = std::max(max_x, p.x);
    max_y = std::max(max_y, p.y);
    min_x = std::min(min_x, p.x);
    min_y = std::min(min_y, p.y);
}
Or using C++ tools:
auto [min_x, max_x] =
    std::minmax_element(points.begin(), points.end(), [](auto p1, auto p2) { return p1.x < p2.x; });
auto [min_y, max_y] =
    std::minmax_element(points.begin(), points.end(), [](auto p1, auto p2) { return p1.y < p2.y; });
I am wondering which of these should, in theory, be faster. I don't care about the time in ms it takes to finish on a specific machine; I want to know which of the two I should expect to be faster, prior to benchmarking.
Upvotes: 3
Views: 109
Reputation: 20087
I would bet on the first option, since its four update statements of the form aN = min/max(aN, bN)
are independent of each other and can therefore be parallelized, combined, and auto-vectorized. It of course helps when a single instruction opcode for min/max exists, as in the x64/SIMD architecture. In this context, conditionally skipping the other comparison can be an example of premature optimization. Moreover, traversing the array only once should make a difference for large arrays, since memory access generally costs more than ALU operations.
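A sketch of what I mean, reduced to a flat array of floats (my own illustration, not the OP's code): the two updates per element have no dependency on each other, so an optimizing compiler at -O2/-O3 is free to combine and vectorize them.

```cpp
#include <algorithm>
#include <limits>
#include <utility>
#include <vector>

// One-pass min/max over a flat array. The two updates per iteration
// are independent, which makes the loop friendly to auto-vectorization.
std::pair<float, float> min_max(const std::vector<float>& v) {
    float lo = std::numeric_limits<float>::max();
    float hi = std::numeric_limits<float>::lowest();
    for (float x : v) {
        lo = std::min(lo, x);  // no branch needed: compiles to a min instruction
        hi = std::max(hi, x);
    }
    return {lo, hi};
}
```

For example, min_max on {3.5f, -1.25f, 7.0f, 0.0f} yields {-1.25f, 7.0f}.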
Upvotes: 2
Reputation: 373462
In a very cool twist of algorithmic fate, it’s possible to find the minimum and maximum elements of an array together faster than it is to find each one separately. One possible algorithm to do this is to pair the elements off against one another, compare them, and then move the bigger elements into one elimination tournament to find the biggest element and the smaller elements into another elimination tournament to find the smallest element. You can show that this will use about 3n/2 comparisons, as opposed to computing the max and min each independently, which will require 2n comparisons. So in that sense, computing the min and max together should require about 25% fewer comparisons.
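A sketch of the pairing idea (my own illustration, assuming a non-empty input): take elements two at a time, spend one comparison ordering the pair, then compare the smaller against the running min and the larger against the running max, for about 3 comparisons per 2 elements.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Pairwise min/max: roughly 3n/2 comparisons instead of 2n.
// Assumes v is non-empty.
std::pair<int, int> pairwise_minmax(const std::vector<int>& v) {
    std::size_t i;
    int lo, hi;
    if (v.size() % 2 == 0) {        // even length: seed from the first pair
        lo = std::min(v[0], v[1]);
        hi = std::max(v[0], v[1]);
        i = 2;
    } else {                        // odd length: seed from the first element
        lo = hi = v[0];
        i = 1;
    }
    for (; i + 1 < v.size(); i += 2) {
        int a = v[i], b = v[i + 1];
        if (a > b) std::swap(a, b); // 1 comparison within the pair
        if (a < lo) lo = a;         // 1 comparison against the running min
        if (b > hi) hi = b;         // 1 comparison against the running max
    }
    return {lo, hi};
}
```

For example, pairwise_minmax on {5, 1, 9, 3} yields {1, 9}.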
As to which is faster in practice, that’ll depend on your hardware and how good of an optimizing compiler you’ve got. On the one hand, minmax_element with a good optimizer should generate code that makes fewer comparisons, which could make it faster than the other approach. On the other hand, the other code is so simple that the optimizer might be able to unroll it to some depth and then go nuts in speeding it up. Or maybe comparisons aren’t that expensive and other factors will end up being more important to the efficiency.
Realistically, though, unless this code gets called in a tight loop and you’ve profiled the code to see that it’s the bottleneck in your program, worrying about things at this level of detail is likely not worth the investment (look up Amdahl’s Law - you can quantify how much or how little performance improvement you’ll get from focusing on something). Aim for clarity in your code and optimize it when you need to.
Upvotes: 2
Reputation: 234875
Always plump for the function from the C++ Standard Library:
Because it's extremely well specified and ships with a particular compiler, that compiler can recognise the C++ Standard Library function and make optimisations. Perhaps the function is even hardcoded into the compiler?
minmax_element was introduced precisely because its result can be found in one traversal of the data. (In fact the standard's complexity requirement, at most max(⌊3/2(N−1)⌋, 0) comparisons, does not allow it to make two separate traversals.)
Upvotes: 5