Reputation: 2691
// If I know that in_x will never be bigger than Max
template <unsigned Max>
void foo(unsigned in_x)
{
    unsigned cap = Max;
    // I can tell the compiler this loop will never run more than floor(log2(Max)) + 1 times
    for (; cap != 0 && in_x != 0; cap >>= 1, in_x >>= 1)
    {
    }
}
As the code above shows, my guess is that if I just write
for (; in_x != 0; in_x >>= 1)
the compiler won't unroll the loop, because it cannot know the maximum possible value of in_x.
I'd like to know whether I'm right, and whether there are better ways to deal with this.
Or maybe the problem can be generalized: can one write code that tells the compiler the range of some run-time value, without that code necessarily being compiled into the run-time binary?
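To make the generalized question concrete, here is a minimal sketch of a range hint built on compiler-specific intrinsics: `__assume` on MSVC, `__builtin_assume` on Clang, and `__builtin_unreachable()` on GCC. The macro name `ASSUME` and the function `bit_length_hinted` are hypothetical, chosen only for illustration. The hint emits no run-time check in release builds, but whether a given optimizer actually uses it to bound or unroll the loop is not guaranteed, and passing a value that violates the hint is undefined behavior.

```cpp
#include <cassert>

// Hypothetical portability macro: promises the optimizer that cond holds.
#if defined(_MSC_VER)
#  define ASSUME(cond) __assume(cond)
#elif defined(__clang__)
#  define ASSUME(cond) __builtin_assume(cond)
#elif defined(__GNUC__)
#  define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#  define ASSUME(cond) ((void)0)
#endif

// Counts how many right shifts it takes for in_x to reach 0.
// Assumption: the caller guarantees in_x <= Max.
template <unsigned Max>
unsigned bit_length_hinted(unsigned in_x)
{
    assert(in_x <= Max);   // checked in debug builds only
    ASSUME(in_x <= Max);   // optimizer hint, no code of its own
    unsigned steps = 0;
    for (; in_x != 0; in_x >>= 1)
        ++steps;
    return steps;
}
```

Pairing the hint with `assert` keeps a debug-time check for the promise that release builds rely on.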
Truly, fighting with the compiler XD
// with MSC
// if no __forceinline here, unrolling is ok, but the function will not be inlined
// if I add __forceinline here, lol, the entire loop is unrolled (or should I say the tree is expanded)...
// compiler freezes when Max is something like 1024
template <int Max>
__forceinline void find(int **in_a, int in_size, int in_key)
{
    if (in_size == 0)
    {
        return;
    }
    if (Max == 0)
    {
        return;
    }
    {
        int m = in_size / 2;
        if ((*in_a)[m] >= in_key)
        {
            find<Max / 2>(in_a, m, in_key);
        }
        else
        {
            *in_a = *in_a + m + 1;
            find<Max - Max / 2 - 1>(in_a, in_size - (m + 1), in_key);
        }
    }
}
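The freeze is plausibly caused by each level of `find` containing two call sites, so forced inlining expands a tree with roughly Max nodes. One possible way around it, sketched below, is a search in which every level makes a single further call, so the inlined code grows with log(Max) instead. This is a different formulation from the code above (a power-of-two step search returning an index), and the names `advance_by` and `lower_bound_index` are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: skip Step elements if they are all < key,
// then continue with half the step. Exactly one call per level.
template <unsigned Step>
inline unsigned advance_by(const int* a, unsigned size, int key, unsigned pos)
{
    if (pos + Step <= size && a[pos + Step - 1] < key)
        pos += Step;
    return advance_by<Step / 2>(a, size, key, pos);
}

// Step == 0 ends the chain of instantiations.
template <>
inline unsigned advance_by<0>(const int*, unsigned, int, unsigned pos)
{
    return pos;
}

// Index of the first element >= key in sorted a[0..size), or size if none.
// Assumption: TopStep is a power of two and size < 2 * TopStep.
template <unsigned TopStep>
inline unsigned lower_bound_index(const int* a, unsigned size, int key)
{
    assert(size < 2 * TopStep);
    return advance_by<TopStep>(a, size, key, 0);
}
```

For example, `lower_bound_index<512>(a, n, key)` would cover any `n` below 1024 with ten inlined steps, assuming the compiler agrees to inline them.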
Upvotes: 5
Views: 331
Reputation: 8431
The proper way to achieve this kind of behavior is to unroll the loop yourself using TMP (template metaprogramming). Even then, you'll be relying on the compiler's cooperation for massive inlining, which is not guaranteed. Have a look at the following code to see if it helps:
template <unsigned char MaxRec>
inline void foo(unsigned in_x)
{
    if (in_x == 0) {
        // TODO: end recursion
        return;
    }
    // TODO: process iteration number MaxRec
    // Note: this is NOT true recursion (which the compiler could not inline);
    // each call targets a different instantiation, foo<MaxRec-1>
    foo<MaxRec - 1>(in_x >> 1);
}
// Stops the pseudo recursion at compile time. A plain `if (MaxRec == 0) return;`
// is not enough: foo<0> would still have to instantiate foo<MaxRec-1>.
template <>
inline void foo<0>(unsigned)
{
}
// Usage:
foo<5>(in_x); // 5 levels cover in_x < 32; a full 32-bit value needs foo<32>, which the compiler may not fully inline
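As a concrete instance of the pattern above, here is a minimal sketch that fills in the TODOs with a bit-length count. The names `bit_length_unrolled` and `self_check` are hypothetical, and an explicit specialization for 0 is used to end the chain of instantiations. Whether the calls really get inlined into straight-line code still depends on the compiler.

```cpp
#include <cassert>

// Counts the shifts needed for in_x to reach 0, using at most Levels steps.
// Each level calls a different instantiation, so this is not true recursion.
template <unsigned Levels>
inline unsigned bit_length_unrolled(unsigned in_x)
{
    if (in_x == 0)
        return 0; // run-time early exit
    return 1 + bit_length_unrolled<Levels - 1>(in_x >> 1);
}

// Compile-time end of the chain: no further instantiation happens here.
template <>
inline unsigned bit_length_unrolled<0>(unsigned)
{
    return 0;
}

// Small self-check (hypothetical helper, for illustration only).
inline void self_check()
{
    assert(bit_length_unrolled<5>(0u) == 0);
    assert(bit_length_unrolled<5>(31u) == 5);  // 31 fits in 5 bits
    assert(bit_length_unrolled<5>(255u) == 5); // capped at 5 levels
}
```

Note that values wider than `Levels` bits are silently capped, so `Levels` has to be chosen from the known upper bound of the input.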
Upvotes: 3