Reputation: 279385
Here's some code (full program follows later in the question):
template <typename T>
T fizzbuzz(T n) {
T count(0);
#if CONST
const T div(3);
#else
T div(3);
#endif
for (T i(0); i <= n; ++i) {
if (i % div == T(0)) count += i;
}
return count;
}
Now, if I call this template function with int
, then I get a factor of 6 performance difference according to whether I define CONST or not:
$ gcc --version
gcc (GCC) 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)
$ make -B wrappedint CPPFLAGS="-O3 -Wall -Werror -DWRAP=0 -DCONST=0" &&
time ./wrappedint
g++ -O3 -Wall -Werror -DWRAP=0 -DCONST=0 wrappedint.cpp -o wrappedi
nt
484573652
real 0m2.543s
user 0m2.059s
sys 0m0.046s
$ make -B wrappedint CPPFLAGS="-O3 -Wall -Werror -DWRAP=0 -DCONST=1" &&
time ./wrappedint
g++ -O3 -Wall -Werror -DWRAP=0 -DCONST=1 wrappedint.cpp -o wrappedi
nt
484573652
real 0m0.655s
user 0m0.327s
sys 0m0.046s
Examining the disassembly shows that in the fast (const) case, the modulo has been turned into a multiplication and shift type thing, whereas in the slow (non-const) case it's using idivl
.
Even worse, if I try to wrap my integer in a class, then the optimisation doesn't happen whether I use const or not. The code always uses idivl
and runs slow:
#include <iostream>
struct WrappedInt {
int v;
explicit WrappedInt(const int &val) : v(val) {}
bool operator<=(const WrappedInt &rhs) const { return v <= rhs.v; }
bool operator==(const WrappedInt &rhs) const { return v == rhs.v; }
WrappedInt &operator++() { ++v; return *this; }
WrappedInt &operator+=(const WrappedInt &rhs) { v += rhs.v; return *this; }
WrappedInt operator%(const WrappedInt &rhs) const
{ return WrappedInt(v%rhs.v); }
};
std::ostream &operator<<(std::ostream &s, WrappedInt w) {
return s << w.v;
}
template <typename T>
T fizzbuzz(T n) {
T count(0);
#if CONST
const T div(3);
#else
T div(3);
#endif
for (T i(0); i <= n; ++i) {
if (i % div == T(0)) count += i;
}
return count;
}
int main() {
#if WRAP
WrappedInt w(123456789);
std::cout << fizzbuzz(w) << "\n";
#else
std::cout << fizzbuzz<int>(123456789) << "\n";
#endif
}
My questions are:
1) Is there a simple principle of C++ itself, or gcc's optimisation, which explains why this happens, or is it just a case of "various heuristics run, this is the code you get"?
2) Is there any way to make the compiler realise that my locally-declared and never-referenced const WrappedInt can be treated as a compile-time const value? I want this thing to be a straight replacement for int in templates.
3) Is there a known way of wrapping an int such that the compiler can discard the wrapping when optimising? The goal is that WrappedInt will be a policy-based template. But if a "do-nothing" policy results in essentially arbitrary 6x speed penalties over int, I'm better off special-casing that situation and using int directly.
Upvotes: 9
Views: 3735
Reputation: 554
The difference in speed is caused by the compiler not knowing if "div" will change value. When it is non-const, it is treating it like a variable being passed in. It could be anything, and so the compiler will use an instruction that divides two variables - idivl. When you say that it's const, the compiler is free to treat it exactly as if you'd typed:
if (i % 3 == 0)
I'm kind of surprised that it didn't use bitwise AND(&).
The WrappedInt isn't being optimized because, well, its not an int. Its a class.
Something that you could do is incorporate fizzbuzz into WrappedInt.
Upvotes: 0
Reputation: 36459
I'm guessing its just the severely old GCC version you are running. The oldest compiler I have on my machine - gcc-4.1.2, performs the fast way with both the non-const and the wrap versions (and does so at only -O1).
Upvotes: 7
Reputation: 25313
Is there a known way of wrapping an int such that the compiler can discard the wrapping when optimising?
Try passing WrappedInt
by value. Then WrappedInt
s can be passed in registers. Pass-by-const-reference sometimes forces gcc to fall back to the stack for argument passing.
About the int
vs const int
slowdown, I can only speculate that GCC is trying to play it safe in the face of aliasing. Basically, if it cannot prove that div
is not an alias for another, more accessible variable, it cannot turn it into a constant. If you declare it const, GCC assumes it's not aliased and performs the conversion into a constant. Apart from the idivl
, you should also see a memory fetch, once, when entering the function, as opposed to immediate values being used for the const
case.
Upvotes: 0
Reputation: 308520
Try combining const int v
in your WrappedInt class with const T
in your fizzbuzz function and see if the compiler can optimize that.
By declaring const int
you've created a special case - a compile time constant. The compiler knows what the value is, and can optimize it more heavily than a value that could possibly change during the run of the program.
Upvotes: 1