miki
miki

Reputation:

What is the quickest way to find the maximum of two floats in C++?

What is the quickest way to find the maximum of two floats:

a)

y = std::max(x1, x2);

b)

if (x1 > x2)
    y = x1;
else
    y = x2;

c)

y = x1 > x2 ? x1 : x2;

Thanks

Upvotes: 3

Views: 9730

Answers (13)

peterchen
peterchen

Reputation: 41096

In presence of a decent optimizer, they are equivalent.

Upvotes: 0

Igor
Igor

Reputation: 27250

You can check it yourself on your system.

I did it for you on gcc - redhat. The results on my system, for 100,000 executions with x1 = 432943.5 and x2 = 434232.9

a) ~1200 uSec b) ~600 uSec c) ~600 uSec

EDIT: With -O2 optimization I got the same results in all 3 cases: ~110 uSec.

Of course, the actual result depends on many factors in your specific problem and system.

Upvotes: 5

Thomas L Holaday
Thomas L Holaday

Reputation: 13814

Intel x86 has instructions (FCOMI/FCOMIP/ FUCOMI/FUCOMIP) that provide rapid comparison of floating point values. Your CPU may have such instructions, too. The trick is figuring out what C++ to write in order to maximize the chances of your compiler using those instructions instead of doing something slower but more generic.

The optimistic suggestion is to use std::max(float, float) in the hopes that someone else ignored those who taunt "microbenchmark" and "premature optimization" and has done the research necessary to provide a specialization for std::max(float, float) that will use the your hardware's specialized instructions.

Upvotes: 3

JaredPar
JaredPar

Reputation: 754715

Here's a different question

Why do you think such a small optimization will matter in the greater context of your program?

I find it highly unlikely that a micro-optimization such as this will have an perceivable impact on your program. You should never micro-optimize like this unless a profiler has specifically shown this to be a problem.

EDIT Adding some clarification buried in the comments

The reason there is no great answer to this question is that the performance of that code is highly dependent upon ...

  1. The way in which it's used in your program
  2. The particular compiler you're using
  3. The optimization flags passed to your compiler
  4. The particular architecture you are running the code on
  5. Many other very tiny things that weren't included in the question

Even if all of this information was included, our answers would be guesses at best. The only way to answer this question is to whip out a profiler and find out which is faster.

However this is almost certainly not worth the effort. Micro-optimizing such a small piece of your program will almost certainly not add any noticable preformance benefits to your code. In general it's a really bad idea to optimize code like this unless the profiler specifically tells you it's a problem. Otherwise you'll spend a lot of time optimizing something for no percievable benefit.

Yes there are cases where such an optimization could be important. But that would be only in a very special circumstances where the code was a part of a very tight highly called loop. However the only way to identify such code is to use a profiler.

Upvotes: 24

Adam Rosenfield
Adam Rosenfield

Reputation: 400274

First, as others have said, profile your code and make absolute sure that this is something worth optimizing. If it is, read on: you can do it without branching. See Down with fcmp: Conditional Moves For Branchless Math for more details.

Upvotes: 2

user75832
user75832

Reputation:

-O3 Dual core Macbook pro 2.4ghz

std::max(x1, x2) Time: 4.19488 RMAAx's : 4.19613 if Time: 4.18775 ? Time: 4.18831

std::max(x1, x2) Time: 4.1836 RMAAx's : 4.18274 if Time: 4.18603 ? Time: 4.18857

std::max(x1, x2) Time: 4.18714 RMAAx's : 4.18759 if Time: 4.19752 ? Time: 4.19797

std::max(x1, x2) Time: 4.1926 RMAAx's : 4.19293 if Time: 4.19334 ? Time: 4.19626

std::max(x1, x2) Time: 4.18963 RMAAx's : 4.19628 if Time: 4.19253 ? Time: 4.19107

#include <iostream>

using namespace std;

int main (int argc, char * const argv[]) {

    uint64_t iterations = 10000000000;
    float x1 = 3455.232;
    float x2 = 7456.856;
    float y = 0;

    for (int count = 0; count < 5; ++count)
    {       
        clock_t begin_time = clock();
        for (uint64_t ii = 0; ii < iterations; ++ii)
        {
            y = std::max(x1, x2);
        }

        std::cout << "std::max(x1, x2) Time: " << float( clock () - begin_time ) /  CLOCKS_PER_SEC << endl;


        begin_time = clock();
        for (uint64_t ii = 0; ii < iterations; ++ii)
        {
            y = x1;
            if (y < x2)
                y = x2;
        }

        std::cout << "RMAAx's : " << float( clock () - begin_time ) /  CLOCKS_PER_SEC << endl;


        begin_time = clock();
        for (uint64_t ii = 0; ii < iterations; ++ii)
        {
            if (x1 > x2)
                y = x1;
            else
                y = x2;
        }

        std::cout << "if Time: " << float( clock () - begin_time ) /  CLOCKS_PER_SEC << endl;


        begin_time = clock();
        for (uint64_t ii = 0; ii < iterations; ++ii)
        {
            y = x1 > x2 ? x1 : x2;
        }

        std::cout << "? Time: " << float( clock () - begin_time ) /  CLOCKS_PER_SEC << endl;
    }

    return 0;
}

Upvotes: 4

Matt J
Matt J

Reputation: 45193

It is common to #define either b or c as a macro, and use that throughout your code. Note that this only works in most cases, and will die if you pass in an x or y argument modified by a non-idempotent operator or function call with side effects. For example:

#define MAX(x,y) (((x) < (y)) ? (y) : (x))
...
MAX(i++, ++j); //won't work properly, the ++'s will get executed twice.
MAX(changeKSomehow(k), changeLSomehow(L)); //won't work, the functions will get called twice.

It turns out that std::max, at least for GNU libstdc++, is implemented nearly identically, and uses the inline hint. The compiler should be able to take the hint when appropriate (when the < operator takes few enough instructions not to cause a huge amount of I$ pressure if inlined).

Upvotes: 0

RMAAlmeida
RMAAlmeida

Reputation: 304

And this way may be better?

y = x1;
if (y < x2)
    y = x2;

Removing the else condition may be better interpreted by the compiler.

Edit1: If benchmarking don't forget to do half the test with x1 greater then x2 and the other half with x2 bigger. Otherwise the results would not reflect real cases.

Edit2: Microoptimazions are somehow usefull if you work in embedded systems with 1 or 2k of memory. And also, its an intersting problem to think why the timings are different in each case.

Upvotes: 2

pauljwilliams
pauljwilliams

Reputation: 19225

Benchmark them and find out.

Upvotes: 2

17 of 26
17 of 26

Reputation: 27382

The only way to really know is to measure them. They might vary from compiler to compiler or platform to platform.

Run a loop of each one for 100,000 or 500,000 iterations and compare the total execution times.

Upvotes: 2

Cătălin Pitiș
Cătălin Pitiș

Reputation: 14341

Same observation as usual, when it comes to "quickest". Did you measure the execution time of you maximum calculation, reported to the rest of execution time of the process? Does this "optimization" decision have a significant impact on the execution time of your application?

I'm 99% sure that the difference between your proposals doesn't worth considering.

Upvotes: 2

stefaanv
stefaanv

Reputation: 14392

It is compiler specific, but I would suspect that the result is the same. Have a look at the assembly output of the compiler.

Upvotes: 4

Daniel A. White
Daniel A. White

Reputation: 190945

B and C would compile the same at least in theory. I pick those because unless std::max is not a function call (e.g. a macro), those would be the fastest.

Edit Apparently, std::max is a template function call like your form C. http://www.cplusplus.com/reference/algorithm/max/

Upvotes: 2

Related Questions