Reputation: 1124
I have an example of code where a straightforward optimization is not working when structured as class variables, yet works as local variables; I want to know: why is the optimization not happening on the class variables formulation?
The intent of my example code is to have a class that is either enabled or disabled at construction and possibly changed during its lifetime. I expect that, when the object is disabled for its whole lifetime, the compiler would optimize away all code that conditionally executes when the object is enabled.
Specifically, I have a std::ofstream that I only want to write to when "enabled". When disabled, I want all formatted output to be skipped. (My real class does its own, non-trivial message formatting.)
I discovered that when I formulate this as a class, I don't get the optimizations I expect. However, if I replicate the code all as local variables, I do see the expected behavior.
Additionally, I discovered that if I don't make std::ofstream calls like 'open', 'exceptions', or 'clear' anywhere in the body of the example class's methods, I also get the expected optimizations. (However, my design requires making such calls on std::ofstream, so for me it's a moot point.) The code below uses the macro DISABLE_OPEN_OFSTREAM_AFTER_CONSTRUCTOR to allow one to try this case.
My example code uses 'asm' expressions to insert comments into the generated assembly-code. If one inspects the output of the compiler in assembly, I expect there to be no assembly between the 'disabled-test' comments. I'm observing assembly between the 'class disabled-test' comments, yet no assembly between the 'locals disabled-test' comments.
The input C++ code:
#include <fstream> // ofstream

#define DISABLE_OPEN_OFSTREAM_AFTER_CONSTRUCTOR 0

class Test_Ofstream
{
public:
    Test_Ofstream( const char a_filename[],
                   bool a_b_enabled )
#if DISABLE_OPEN_OFSTREAM_AFTER_CONSTRUCTOR
        : m_ofstream( a_filename ),
          m_b_enabled( a_b_enabled )
    {
    }
#else
        : m_ofstream(),
          m_b_enabled( a_b_enabled )
    {
        m_ofstream.open( a_filename );
    }
#endif

    void write_test()
    {
        if( m_b_enabled )
        {
            m_ofstream << "Some text.\n";
        }
    }

private:
    std::ofstream m_ofstream;
    bool m_b_enabled;
};

int main( int argc, char* argv[] )
{
    {
        Test_Ofstream test_ofstream( "test.txt", true );
        asm( "# BEGIN class enabled-test" );
        test_ofstream.write_test();
        asm( "# END class enabled-test" );
    }
    {
        Test_Ofstream test_ofstream( "test.txt", false );
        asm( "# BEGIN class disabled-test" );
        test_ofstream.write_test();
        asm( "# END class disabled-test" );
    }
    {
        bool b_enabled = true;
#if DISABLE_OPEN_OFSTREAM_AFTER_CONSTRUCTOR
        std::ofstream test_ofstream( "test.txt" );
#else
        std::ofstream test_ofstream;
        test_ofstream.open( "test.txt" );
#endif
        asm( "# BEGIN locals enabled-test" );
        if( b_enabled )
        {
            test_ofstream << "Some text.\n";
        }
        asm( "# END locals enabled-test" );
    }
    {
        bool b_enabled = false;
#if DISABLE_OPEN_OFSTREAM_AFTER_CONSTRUCTOR
        std::ofstream test_ofstream( "test.txt" );
#else
        std::ofstream test_ofstream;
        test_ofstream.open( "test.txt" );
#endif
        asm( "# BEGIN locals disabled-test" );
        if( b_enabled )
        {
            test_ofstream << "Some text.\n";
        }
        asm( "# END locals disabled-test" );
    }
    return 0;
}
The output assembly code:
##### Cut here. #####
#APP
# 53 "test_ofstream_optimization.cpp" 1
# BEGIN class disabled-test
# 0 "" 2
#NO_APP
cmpb $0, 596(%esp)
je .L22
movl $.LC1, 4(%esp)
movl %ebx, (%esp)
.LEHB9:
call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
.LEHE9:
.L22:
#APP
# 55 "test_ofstream_optimization.cpp" 1
# END class disabled-test
# 0 "" 2
#NO_APP
##### Cut here. #####
#APP
# 116 "test_ofstream_optimization.cpp" 1
# BEGIN locals disabled-test
# 0 "" 2
# 121 "test_ofstream_optimization.cpp" 1
# END locals disabled-test
# 0 "" 2
#NO_APP
##### Cut here. #####
I realize that this is possibly tied to the compiler I'm using, which is: g++-4.6 (Debian 4.6.1-4) 4.6.1; compiler flags: -Wall -S -O2. However, this seems like such a simple optimization I find it hard to believe it could be the compiler messing up.
Any help, insight or guidance is greatly appreciated.
Upvotes: 0
Views: 482
Reputation: 72336
As you say, this will depend on the compiler. But my guess:
The optimizer can prove that no other code can ever modify the object bool b_enabled, since it's local and you never take its address or bind a reference to it. The local version is easily optimized.
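To make the local case concrete, here is a minimal sketch with hypothetical stand-ins: Inner plays the role of std::ofstream and opaque_open plays the role of a non-inline library call such as open. None of these names come from the question. Assume opaque_open is really defined in another translation unit, so the optimizer sees only its declaration at the call site.

```cpp
// Hypothetical stand-in for std::ofstream.
struct Inner { int state; };

// Pretend this lives in another translation unit, so the optimizer
// cannot look inside it when compiling the caller.
void opaque_open( Inner* p )
{
    p->state = 1;
}

int count_writes_local()
{
    bool enabled = false;   // address never taken, no reference bound
    Inner inner = { 0 };
    opaque_open( &inner );  // receives only &inner, has no path to 'enabled'

    int writes = 0;
    if( enabled )           // still provably false after the opaque call
    {
        ++writes;
    }
    return writes;
}
```

Because nothing outside count_writes_local can name the local flag, a compiler is free to treat the branch as dead code. Whether a given compiler actually removes it is still up to that compiler.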
When DISABLE_OPEN_OFSTREAM_AFTER_CONSTRUCTOR is true, the Test_Ofstream constructor:
1. calls the ofstream(const char*) constructor
2. initializes m_b_enabled
Since there are no operations between initializing test_ofstream.m_b_enabled and testing it, this optimization is only a bit trickier, but it sounds like g++ still manages it.
When DISABLE_OPEN_OFSTREAM_AFTER_CONSTRUCTOR is false, the Test_Ofstream constructor:
1. calls the ofstream default constructor
2. initializes m_b_enabled
3. calls m_ofstream.open(const char*)
The optimizer is not allowed to assume that ofstream::open will not change test_ofstream.m_b_enabled. We know it shouldn't, but in theory that non-inline library function could figure out the complete object test_ofstream which contains its 'this' argument, and modify it that way.
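As a rough illustration of that "in theory" scenario, here is a simplified sketch using hypothetical stand-ins (Inner for the stream, Outer for Test_Ofstream, opaque_open for a non-inline library function). It relies on Outer being a standard-layout struct, where a pointer to the first member may be converted to a pointer to the enclosing object. The real Test_Ofstream is not standard-layout, so this only shows the kind of thing the optimizer has to rule out, not something ofstream::open actually does.

```cpp
// Hypothetical stand-ins: Inner for std::ofstream, Outer for Test_Ofstream.
struct Inner { int state; };

struct Outer
{
    Inner inner;    // first member of a standard-layout struct
    bool enabled;
};

// Pretend this lives in another translation unit. From the caller's side
// it only takes an Inner*, much like a member function taking 'this'.
void opaque_open( Inner* p )
{
    // For a standard-layout struct, a pointer to the first member can be
    // converted to a pointer to the enclosing object.
    Outer* whole = reinterpret_cast<Outer*>( p );
    whole->enabled = true;
}

bool opaque_call_changes_flag()
{
    Outer o = { { 0 }, false };
    opaque_open( &o.inner );   // flag is modified behind the caller's back
    return o.enabled;
}
```

If the optimizer cannot see the body of opaque_open, it has to assume something like this might happen, so it must reload the flag after the call instead of folding the test away.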
Upvotes: 0
Reputation: 146930
Pretty simple. When you write the code directly with local variables, the code is inlined and the compiler performs constant folding. When you're in class scope, the code is not inlined and the value of m_b_enabled is unknown, so the compiler has to perform the call. To prove that the code is semantically equal and perform this optimization, not just that call would have to be inlined, but every access to the class. The compiler may well decide that inlining the class would not yield sufficient benefit. Compilers can also choose not to inline code because they don't know how, and inline asm expressions are exactly the kind of thing that could cause that, as the compiler does not know how to handle your assembly code.
Usually, you would place a breakpoint and inspect the disassembly. That's what I'd do in Visual Studio, anyway. Inline assembler of any kind can be so damaging to the optimizer.
When I removed the assembler expressions, Visual Studio inlined the code, and promptly didn't perform the optimization anyway. The problem with stacking optimization passes is that you can never get the right order to find all potential optimizations.
Upvotes: 5