manatttta

Reputation: 3124

Function with template bool argument: guaranteed to be optimized?

In the following templated function, is the inner if inside the for loop guaranteed to be optimized out, leaving only the instructions that are actually used?

If this is not guaranteed to be optimized (in GCC 4, MSVC 2013 and llvm 8.0), what are the alternatives, using C++11 at most?

NOTE that this function does nothing useful, and I know that this specific function can be optimized in several ways. All I want to focus on is how the bool template argument affects the generated code.

#include <limits>

template <bool IsMin>
float IterateOverArray(float* vals, int arraySize) {

    float ret = (IsMin ? std::numeric_limits<float>::max() : -std::numeric_limits<float>::max());

    for (int x = 0; x < arraySize; x++) {
        // Is this code optimized by the compiler to skip the unnecessary if?
        if (IsMin) {
            if (ret > vals[x]) ret = vals[x];
        } else {
            if (ret < vals[x]) ret = vals[x];
        }
    }

    return ret;

}

Upvotes: 4

Views: 311

Answers (3)

Yakk - Adam Nevraumont

Reputation: 275730

In theory, no. The C++ standard permits compilers to be not just dumb, but downright hostile. A compiler could inject code doing useless stuff for no reason, so long as the abstract machine behaviour remains the same. [1]

In practice, yes. Dead code elimination and constant branch detection are easy, and every single compiler I have ever checked eliminates that if branch.

Note that both branches are compiled before one is eliminated, so they both must be fully valid code. The output assembly behaves "as if" both branches exist, but the branch instruction (and unreachable code) is not an observable feature of the abstract machine behaviour.

Naturally if you do not optimize, the branch and dead code may be left in, so you can move the instruction pointer into the "dead code" with your debugger.


[1] As an example, nothing prevents a compiler from implementing a+b as a loop calling inc in assembly, or a*b as a loop adding a repeatedly. This is a hostile act by the compiler on almost all platforms, but not banned by the standard.

Upvotes: 6

Telokis

Reputation: 3389

Since you ask for an alternative in C++11, here is one:

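#include <limits>
#include <type_traits>

// Overload selected when IsMin is false: finds the maximum.
float IterateOverArrayImpl(float* vals, int arraySize, std::false_type)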
{
    float ret = -std::numeric_limits<float>::max();
    for (int x = 0; x < arraySize; x++) {
        if (ret < vals[x])
            ret = vals[x];
    }
    return ret;
}

// Overload selected when IsMin is true: finds the minimum.
float IterateOverArrayImpl(float* vals, int arraySize, std::true_type)
{
    float ret = std::numeric_limits<float>::max();
    for (int x = 0; x < arraySize; x++) {
        if (ret > vals[x])
            ret = vals[x];
    }
    return ret;
}

template <bool IsMin>
float IterateOverArray(float* vals, int arraySize) {

    return IterateOverArrayImpl(vals, arraySize, std::integral_constant<bool, IsMin>());

}

You can see it live here.

The idea is to use function overloading to handle the test.

Upvotes: 1

NathanOliver

Reputation: 180935

There is no guarantee that it will be optimized away. There is a pretty good chance that it will be, though, since the condition is a compile-time constant.

That said, C++17 gives us if constexpr, which, inside a template, discards the branch that fails the check so that it is never instantiated. If you want a guarantee, I would suggest using this feature instead.
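
As a rough sketch of what that could look like for the function in the question (this needs a C++17 compiler, so it does not meet the asker's C++11 limit, and it is only one possible way to write it):

```cpp
#include <limits>

// Sketch only: same signature as the function in the question,
// with the inner test turned into `if constexpr` (C++17).
template <bool IsMin>
float IterateOverArray(float* vals, int arraySize) {
    float ret = IsMin ? std::numeric_limits<float>::max()
                      : -std::numeric_limits<float>::max();

    for (int x = 0; x < arraySize; x++) {
        // Only the branch matching IsMin is instantiated for a given
        // specialization; the other one is a discarded statement.
        if constexpr (IsMin) {
            if (ret > vals[x]) ret = vals[x];
        } else {
            if (ret < vals[x]) ret = vals[x];
        }
    }

    return ret;
}
```

Calling IterateOverArray<true>(data, n) then instantiates only the minimum search, and IterateOverArray<false>(data, n) only the maximum search.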

Before C++17, if you want only one part of the code to be compiled, you would need to specialize the function and write only the code that pertains to each specialization.
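
A minimal sketch of that specialization approach, assuming the same function name and signature as in the question (valid in C++11; the primary template is only declared, so each value of IsMin gets its own hand-written body):

```cpp
#include <limits>

// Primary template: declaration only. The two explicit
// specializations below provide the actual definitions.
template <bool IsMin>
float IterateOverArray(float* vals, int arraySize);

// Specialization for IsMin == true: contains only the minimum search.
template <>
float IterateOverArray<true>(float* vals, int arraySize) {
    float ret = std::numeric_limits<float>::max();
    for (int x = 0; x < arraySize; x++) {
        if (ret > vals[x]) ret = vals[x];
    }
    return ret;
}

// Specialization for IsMin == false: contains only the maximum search.
template <>
float IterateOverArray<false>(float* vals, int arraySize) {
    float ret = -std::numeric_limits<float>::max();
    for (int x = 0; x < arraySize; x++) {
        if (ret < vals[x]) ret = vals[x];
    }
    return ret;
}
```

Here no branch on IsMin exists in the source at all, so there is nothing left for the optimizer to remove. The cost is that the loop is written twice. If these definitions live in a header, the specializations would also need to be marked inline to avoid multiple-definition errors.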

Upvotes: 1
