user3735658

How do production compilers implement destructor handling on flow control

Long story short: I am writing a compiler, and on reaching the OOP features I am faced with a dilemma involving the handling of destructors. Basically I have two options:

And I know the usual response to such questions is "do both, profile, and decide", and that's what I would do if this were a trivial task. But writing a full-featured compiler has proven somewhat arduous, so I would prefer to get some expert input rather than build two bridges, see which one does better, and burn the other one.

I put C++ in the tags because that's the language I am using; I am somewhat familiar with it and with the RAII paradigm, which my compiler is modeled around as well.

Upvotes: 8

Views: 432

Answers (4)

Dwayne Robinson

Reputation: 2409

Compilers use a mix of both approaches. MSVC uses inline destructor calls for normal code flow, and cleanup code blocks, run in reverse order of construction, for early returns and exceptions. During normal flow it uses a single hidden local integer to track constructor progress so far, so it knows where to jump on an early return. A single integer is sufficient because scopes always form a tree (rather than, say, a bitmask with one bit per object that has or has not been constructed successfully). For example, the following fairly short code, using a class with a non-trivial destructor and some random returns sprinkled throughout...

    ...
    if (randomBool()) return;
    Foo a;
    if (randomBool()) return;
    Foo b;
    if (randomBool()) return;

    {
        Foo c;
        if (randomBool()) return;
    }

    {
        Foo d;
        if (randomBool()) return;
    }
    ...

...can expand to pseudocode like the below on x86, where the constructor-progress value is incremented immediately after each constructor call (sometimes by more than one, to the next unique value) and decremented (or 'popped' to an earlier value) immediately before each destructor call. Note that objects with trivial destructors do not affect this value.

    ...
    save previous exception handler // for x86, not 64-bit table based handling
    preallocate stack space for locals
    set new exception handler address to ExceptionCleanup
    set constructor progress = 0
    if randomBool(), goto Cleanup0
    Foo a;
    set constructor progress = 1 // Advance 1
    if randomBool(), goto Cleanup1
    Foo b;
    set constructor progress = 2 // And once more
    if randomBool(), goto Cleanup2

    {
        Foo c;
        set constructor progress = 3
        if randomBool(), goto Cleanup3
        set constructor progress = 2 // Pop to 2 again
        c.~Foo();
    }

    {
        Foo d;
        set constructor progress = 4 // Increment 2 to 4, not 3 again
        if randomBool(), goto Cleanup4
        set constructor progress = 2 // Pop to 2 again
        d.~Foo();
    }

// alternate Cleanup2
    set constructor progress = 1
    b.~Foo();
// alternate Cleanup1
    set constructor progress = 0
    a.~Foo();

Cleanup0:
    restore previous exception handler
    wipe stack space for locals
    return;

ExceptionCleanup:
    switch (constructor progress)
    {
    case 0: goto Cleanup0; // nothing to destroy
    case 1: goto Cleanup1;
    case 2: goto Cleanup2;
    case 3: goto Cleanup3;
    case 4: goto Cleanup4;
    }
    // admitting ignorance here, as I don't know how the exception
    // is propagated upward, and whether the exact same cleanup
    // blocks are shared for both early returns and exceptions.

Cleanup4:
    set constructor progress = 2
    d.~Foo();
    goto Cleanup2;
Cleanup3:
    set constructor progress = 2
    c.~Foo();
    // fall through to Cleanup2;
Cleanup2:
    set constructor progress = 1
    b.~Foo();
Cleanup1:
    set constructor progress = 0
    a.~Foo();
    goto Cleanup0;
    // or it may instead return directly here

The compiler may of course rearrange these blocks any way it thinks is more efficient, rather than putting all the cleanup at the end. Early returns could jump instead to the alternate Cleanup1/2 blocks at the end of the function. In 64-bit MSVC code, exceptions are handled via tables that map the instruction pointer at the point the exception happened to the respective cleanup blocks.

Upvotes: 3

An optimizing compiler transforms the internal representations of the compiled source code.

It usually builds a directed (usually cyclic) graph of basic blocks. While building this control-flow graph, it adds the calls to the destructors.

For GCC (it is a free-software compiler, and so is Clang/LLVM, so you could study its source code), you could try to compile some simple C++ test case with -fdump-tree-all and then see that this is done at gimplification time. BTW, you could customize g++ with MELT to explore its internal representations.

BTW, I don't think that how you deal with destructors is that important (notice that in C++ they are implicitly called at syntactically defined places, like the } of their defining scope). Most of the work of such a compiler is in optimizing; there, dealing with destructors is not very relevant, as they are nearly routines like any others.

Upvotes: 1

InsertMemeNameHere

Reputation: 2433

For the most part, a destructor call can be treated in the same manner as an ordinary function call.

The lesser part is dealing with EH (exception handling). I've noticed MSC generates a mix of inlined destructor calls in "ordinary" code and, for x86-64, creates separate cleanup code that itself may or may not contain copies of destructor logic.

IMO, the simplest solution would be to always call nontrivial destructors as ordinary functions.

If optimization seems to be on the horizon, treat the aforementioned calls like anything else: Will they fit in the cache with everything else? Will doing this take up too much space in the image? Etc.

A frontend may insert "calls" to nontrivial destructors at the end of each actionable block in its output AST.

A backend may treat such things as ordinary function calls, wire them together, make a big block-o-destructor call logic somewhere and jump to that, etc...

Linking functions to the same logic seems quite common. For example, MSC tends to link all trivial functions to the same implementation, destructor or otherwise, optimizing or not.

This is primarily from experience. As usual, YMMV.

One more thing:

EH cleanup logic tends to work like a jump table: For a given function, you can just jump into a single list of destructor calls, depending on where an exception was thrown (if applicable).

Upvotes: 4

Mats Petersson

Reputation: 129374

I don't know how commercial compilers come up with the code, but assuming we ignore exceptions at this point [1], the approach I would take is to make a call to the destructor, not inline it. Each destructor call would execute the complete destructor for that object. Use a loop to deal with the destructors of arrays.

Inlining the calls is an optimisation, and you shouldn't do that unless you know it pays off (code size vs. speed).

You will need to deal with "destruction in the enclosing block", but assuming you don't have jumps out of the block, that should be easy. Jumps out of the block (e.g. return, break, etc.) mean that you have to jump to a piece of code that cleans up the block you are in.

[1] Commercial compilers have special tables based on "where was the exception thrown", plus generated code to do that cleanup, typically reusing the same cleanup code for many exception points by having multiple jump labels into each chunk of cleanup.

Upvotes: 2
