Reputation: 481
I want to be able to define a custom operator that works like this:
struct Matrix {
    int inner[5];
    static Matrix tensor_op(Matrix const& a, Matrix const& b);
};

int main()
{
    Matrix a;
    Matrix b;
    Matrix c = a tensor c;
    return 1;
}
The following code works perfectly, except that it does not optimize the intermediate object away:
template<char op> struct extendedop { };

template<typename T, char op>
struct intermediate {
    T const& a;
    intermediate(T const& a_) : a(a_) {}
};

template<typename T, char op>
intermediate<T, op> operator+(T const& a, extendedop<op> exop) {
    return intermediate<T, op>(a);
}

template<typename T>
T operator+(intermediate<T, '*'> const& a, T const& b) {
    return T::tensor_op(a.a, b);
}

#define tensor + extendedop<'*'>() +
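To make the macro's expansion explicit, here is a minimal self-contained sketch of the same technique. The element-wise product inside `tensor_op` is only a placeholder assumption (the declaration above has no body), and `via_macro` and `expanded_by_hand` are hypothetical helper names added for illustration.

```cpp
#include <cassert>

template<char op> struct extendedop { };

template<typename T, char op>
struct intermediate {
    T const& a;  // only a reference to the left operand is stored
    intermediate(T const& a_) : a(a_) {}
};

// a + extendedop<op>()  ->  wraps the left operand in an intermediate
template<typename T, char op>
intermediate<T, op> operator+(T const& a, extendedop<op>) {
    return intermediate<T, op>(a);
}

// intermediate<T, '*'> + b  ->  forwards both operands to T::tensor_op
template<typename T>
T operator+(intermediate<T, '*'> const& a, T const& b) {
    return T::tensor_op(a.a, b);
}

#define tensor + extendedop<'*'>() +

struct Matrix {
    int inner[5];
    // Placeholder body (assumption): element-wise product.
    static Matrix tensor_op(Matrix const& a, Matrix const& b) {
        Matrix r{};
        for (int i = 0; i < 5; ++i) r.inner[i] = a.inner[i] * b.inner[i];
        return r;
    }
};

// What the user writes.
Matrix via_macro(Matrix const& a, Matrix const& b) {
    return a tensor b;
}

// What the preprocessor and overload resolution turn it into.
Matrix expanded_by_hand(Matrix const& a, Matrix const& b) {
    return operator+(operator+(a, extendedop<'*'>()), b);
}
```

Both helpers should end up calling `Matrix::tensor_op(a, b)`; the only extra object in between is the `intermediate`, which holds a single reference.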
As you can see in the generated assembly for GCC and MSVC, only GCC optimizes the intermediate object away.
How can I make MSVC optimize away the unnecessary code?
Upvotes: 1
Views: 94
Reputation: 1300
It is optimized by MSVC. Look at the generated code for main() (using /GS- to remove the security check, just to make it clearer):
$LN10:
sub rsp, 120 ; 00000078H
lea r8, QWORD PTR c$[rsp]
lea rdx, QWORD PTR a$[rsp]
lea rcx, QWORD PTR $T1[rsp]
call static Matrix Matrix::tensor_op(Matrix const &,Matrix const &) ; Matrix::tensor_op
mov eax, 1
add rsp, 120 ; 00000078H
ret 0
main ENDP
That's the same number of instructions as GCC. The construction of the intermediate object is elided.
The extra assembly shown on Compiler Explorer is just MSVC emitting code for the operators in case they're used independently; it isn't cleaned up when they turn out to be unused. It's probably just a side effect of MSVC not having direct assembly output the way GCC and Clang do.
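If you want a source-level sanity check that doesn't rely on reading assembly, here is a rough sketch. Since `intermediate` stores only a `T const&`, the operator chain should not copy a matrix any more often than a direct call to `tensor_op` does. `CountedMatrix`, its `copies` counter, the placeholder `tensor_op` body, and `extra_copies_from_operator_chain` are all assumptions made up for this illustration, and the check assumes C++17 (or a compiler that elides the returned temporary, as they do by default). It says nothing about the instructions a particular compiler emits.

```cpp
#include <cassert>

template<char op> struct extendedop { };

template<typename T, char op>
struct intermediate {
    T const& a;  // reference only, so no T is copied here
    intermediate(T const& a_) : a(a_) {}
};

template<typename T, char op>
intermediate<T, op> operator+(T const& a, extendedop<op>) {
    return intermediate<T, op>(a);
}

template<typename T>
T operator+(intermediate<T, '*'> const& a, T const& b) {
    return T::tensor_op(a.a, b);
}

#define tensor + extendedop<'*'>() +

// Hypothetical stand-in for Matrix that counts copy constructions.
struct CountedMatrix {
    static int copies;
    int inner[5];

    CountedMatrix() : inner{} {}
    CountedMatrix(CountedMatrix const& o) {
        for (int i = 0; i < 5; ++i) inner[i] = o.inner[i];
        ++copies;
    }

    // Placeholder body (assumption): element-wise product.
    static CountedMatrix tensor_op(CountedMatrix const& a, CountedMatrix const& b) {
        CountedMatrix r;
        for (int i = 0; i < 5; ++i) r.inner[i] = a.inner[i] * b.inner[i];
        return r;
    }
};
int CountedMatrix::copies = 0;

// Copies made by `a tensor b` minus copies made by calling tensor_op directly.
int extra_copies_from_operator_chain() {
    CountedMatrix a, b;

    CountedMatrix::copies = 0;
    CountedMatrix direct = CountedMatrix::tensor_op(a, b);
    int const direct_copies = CountedMatrix::copies;

    CountedMatrix::copies = 0;
    CountedMatrix chained = a tensor b;
    int const chained_copies = CountedMatrix::copies;

    (void)direct;
    (void)chained;
    return chained_copies - direct_copies;
}
```

A result of zero means the wrapper adds no matrix copies of its own; all that is left for the optimizer to remove is the reference-holding `intermediate`.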
Upvotes: 1