Charles L Wilcox

Reputation: 1124

Local Variables vs. Class Variables Compiler Optimization; Works vs. Doesn't Work

I have an example of code where a straightforward optimization is not working when structured as class variables, yet works as local variables; I want to know: why is the optimization not happening on the class variables formulation?

The intent of my example code is to have a class that is either enabled or disabled at construction, and possibly changed during its lifetime. I expect that, when the object is disabled for its whole lifetime, the compiler would optimize away all code that conditionally executes when the object is enabled.

Specifically, I have a std::ofstream that I only want to write to when "enabled". When disabled, I want all formatted output to be skipped. ( My real class does its own, non-trivial message formatting. )

I discovered that when I formulate this as a class, I don't get the optimizations I expect. However, if I replicate the code all as local variables, I do see the expected behavior.

Additionally, I discovered that if I don't make std::ofstream calls like 'open', 'exceptions', or 'clear' anywhere in the body of the example class's methods, I also get the expected optimizations. ( However, my design requires making such calls on std::ofstream, so for me it's a moot point. ) The code below uses the macro DISABLE_OPEN_OFSTREAM_AFTER_CONSTRUCTOR for this; setting it to 1 lets one try this case.

My example code uses 'asm' expressions to insert comments into the generated assembly-code. If one inspects the output of the compiler in assembly, I expect there to be no assembly between the 'disabled-test' comments. I'm observing assembly between the 'class disabled-test' comments, yet no assembly between the 'locals disabled-test' comments.

The input C++ code:

#include <fstream> // ofstream

#define DISABLE_OPEN_OFSTREAM_AFTER_CONSTRUCTOR 0

class Test_Ofstream
{
public:
    Test_Ofstream( const char a_filename[],
                   bool a_b_enabled )
    #if DISABLE_OPEN_OFSTREAM_AFTER_CONSTRUCTOR
        : m_ofstream( a_filename ),
          m_b_enabled( a_b_enabled )
    {
    }
    #else
        : m_ofstream(),
          m_b_enabled( a_b_enabled )
    {
        m_ofstream.open( a_filename );
    }
    #endif

    void write_test()
    {
        if( m_b_enabled )
        {
            m_ofstream << "Some text.\n";
        }
    }

private:
    std::ofstream m_ofstream;
    bool m_b_enabled;
};

int main( int argc, char* argv[] )
{
    {
        Test_Ofstream test_ofstream( "test.txt", true );
        asm( "# BEGIN class enabled-test" );
        test_ofstream.write_test();
        asm( "# END class enabled-test" );
    }

    {
        Test_Ofstream test_ofstream( "test.txt", false );
        asm( "# BEGIN class disabled-test" );
        test_ofstream.write_test();
        asm( "# END class disabled-test" );
    }

    {
        bool b_enabled = true;
        #if DISABLE_OPEN_OFSTREAM_AFTER_CONSTRUCTOR
        std::ofstream test_ofstream( "test.txt" );
        #else
        std::ofstream test_ofstream;
        test_ofstream.open( "test.txt" );
        #endif
        asm( "# BEGIN locals enabled-test" );
        if( b_enabled )
        {
            test_ofstream << "Some text.\n";
        }
        asm( "# END locals enabled-test" );
    }

    {
        bool b_enabled = false;
        #if DISABLE_OPEN_OFSTREAM_AFTER_CONSTRUCTOR
        std::ofstream test_ofstream( "test.txt" );
        #else
        std::ofstream test_ofstream;
        test_ofstream.open( "test.txt" );
        #endif
        asm( "# BEGIN locals disabled-test" );
        if( b_enabled )
        {
            test_ofstream << "Some text.\n";
        }
        asm( "# END locals disabled-test" );
    }

    return 0;
}

The output assembly code:

##### Cut here. #####
#APP
# 53 "test_ofstream_optimization.cpp" 1
        # BEGIN class disabled-test
# 0 "" 2
#NO_APP
        cmpb        $0, 596(%esp)
        je  .L22
        movl        $.LC1, 4(%esp)
        movl        %ebx, (%esp)
.LEHB9:
        call        _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
.LEHE9:
.L22:
#APP
# 55 "test_ofstream_optimization.cpp" 1
        # END class disabled-test
# 0 "" 2
#NO_APP
##### Cut here. #####
#APP
# 116 "test_ofstream_optimization.cpp" 1
        # BEGIN locals disabled-test
# 0 "" 2
# 121 "test_ofstream_optimization.cpp" 1
        # END locals disabled-test
# 0 "" 2
#NO_APP
##### Cut here. #####

I realize that this is possibly tied to the compiler I'm using, which is g++-4.6 (Debian 4.6.1-4) 4.6.1, with compiler flags -Wall -S -O2. However, this seems like such a simple optimization that I find it hard to believe the compiler is messing up.

Any help, insight or guidance is greatly appreciated.

Upvotes: 0

Views: 482

Answers (2)

aschepler

Reputation: 72336

As you say, this will depend on compiler. But my guess:

The optimizer can prove that no other code can ever modify the local bool b_enabled, since you never take its address or bind a reference to it. The local version is easily optimized.
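A minimal sketch of that difference. The names opaque_touch, local_only and address_escapes are made up for illustration and are not from the question; in a real build the opaque function would sit in another translation unit, where the optimizer cannot see its body.

```cpp
// Hypothetical stand-in for a library routine the optimizer cannot see into.
void opaque_touch(bool* flag) { *flag = true; }

// The flag's address never leaves this function, so the compiler may treat
// the flag as the constant false and drop the branch entirely.
int local_only()
{
    bool enabled = false;
    int writes = 0;
    if (enabled) { ++writes; }
    return writes;
}

// The flag's address is handed to another function before the test, so the
// compiler must reload the flag unless it can see through the call.
int address_escapes()
{
    bool enabled = false;
    opaque_touch(&enabled);
    int writes = 0;
    if (enabled) { ++writes; }
    return writes;
}
```

Here the second function really does observe a changed flag, which is the case the optimizer has to allow for whenever an address escapes.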

When DISABLE_OPEN_OFSTREAM_AFTER_CONSTRUCTOR is true, the Test_Ofstream constructor:

  • Calls the constructor ofstream(const char*)
  • Initializes member m_b_enabled

Since there are no operations between initializing test_ofstream.m_b_enabled and testing it, this optimization is only a bit trickier, but it sounds like g++ still manages it.

When DISABLE_OPEN_OFSTREAM_AFTER_CONSTRUCTOR is false, the Test_Ofstream constructor:

  • Calls the ofstream default constructor
  • Initializes member m_b_enabled
  • Calls m_ofstream.open(const char*)

The optimizer is not allowed to assume that ofstream::open will not change test_ofstream.m_b_enabled. We know it shouldn't, but in theory that non-inline library function could figure out the complete object test_ofstream which contains its 'this' argument, and modify it that way.
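To make that "in theory" concrete, here is a rough sketch with a simplified stand-in type. Holder, opaque_open and flag_after_open are hypothetical names, not part of the question's code. The sketch relies on Holder being standard-layout, which the real Test_Ofstream is not, so this exact cast would not be valid there; it only illustrates why the compiler stays conservative once a pointer into the object has been passed to code it cannot see.

```cpp
#include <type_traits>

// Hypothetical stand-in for Test_Ofstream: a "stream" member followed by the flag.
struct Holder
{
    int  stream;   // plays the role of m_ofstream
    bool enabled;  // plays the role of m_b_enabled
};

static_assert(std::is_standard_layout<Holder>::value,
              "the cast below relies on Holder being standard-layout");

// Hypothetical stand-in for a non-inline call such as open(). It only
// receives a pointer to the first member, yet it can reach the enclosing
// object, because a standard-layout object and its first member share an
// address.
void opaque_open(int* stream)
{
    Holder* owner = reinterpret_cast<Holder*>(stream);
    owner->enabled = true;
}

// Initializes the flag to false, makes the "opaque" call, then reads it back.
bool flag_after_open()
{
    Holder holder = { 0, false };
    opaque_open(&holder.stream);
    return holder.enabled;
}
```

If the compiler had folded holder.enabled to false across the call, flag_after_open would return the wrong value, so it has to reload the member after the call.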

Upvotes: 0

Puppy

Reputation: 146930

Pretty simple. When you write the code directly with local variables, the code is inlined and the compiler performs constant folding. In the class version, the code is not inlined and the value of m_b_enabled is unknown, so the compiler has to perform the call. To prove that the two versions are semantically equal and perform this optimization, not just that call but every access to the class would have to be inlined. The compiler may well decide that inlining the class would not yield sufficient benefit. Compilers can also decline to inline code because they don't know how, and inline asm expressions are exactly the kind of thing that could cause that, since the compiler does not know how to handle your assembly code.

Usually, you would place a breakpoint and inspect the disassembly instead. That's what I'd do in Visual Studio, anyway. Inline assembler of any kind can be very damaging to the optimizer.

When I removed the assembler expressions, Visual Studio inlined the code, and then didn't perform the optimization anyway. The problem with stacking optimization passes is that you can never get the right order to find all potential optimizations.

Upvotes: 5
