Reputation: 60097

Special treatment of setjmp/longjmp by compilers

In Why volatile works for setjmp/longjmp, user greggo comments:

Actually modern C compilers do need to know that setjmp is a special case, since there are, in general, optimizations where the change of flow caused by setjmp could badly corrupt things, and these need to be avoided. Back in K&R days, setjmp did not need special handling, and didn't get any, and so the caveat about locals applied. Since that caveat is already there and (should be!) understood - and of course, setjmp use is pretty rare - there is no incentive for modern compilers to go to any extra lengths to fix the 'clobber' issue -- it would still be in the language.

Are there any references that elaborate on this? And if it is true, can custom-made implementations of setjmp/longjmp that are named something different exist safely, with behavior no more error-prone than that of the standard setjmp/longjmp (e.g., maybe I'd like to save some extra (thread-local) context)? Is there any way to tell compilers "this function is effectively setjmp/longjmp"?

Upvotes: 3

Answers (3)

amonakov

Reputation: 2409

The C language defines setjmp to be a macro and places strict limitations on the contexts in which it may appear without invoking undefined behavior. It is not a normal function: you cannot take its address and expect a call via the resulting pointer to behave as a proper setjmp invocation.

In particular, it is not true in general that assembly code invoked by setjmp obeys the same calling conventions as normal functions. SPARC on Linux and Solaris provides a counterexample: its setjmp does not restore all call-preserved registers (nor does vfork). It took GCC by surprise as recently as 2018 (gcc-patches thread, bugzilla entry).

But even considering "compiler-friendly" platforms where setjmp entrypoint obeys the usual conventions, it is still necessary to recognize it as a function that "returns twice". GCC recognizes setjmp-like functions (including vfork) by name, and offers __attribute__((returns_twice)) for annotating such functions in custom code.

The reason for that is longjmp'ing back to setjmp can transfer control from a point where some variable or temporary appears dead (and the compiler reused its storage for something unrelated) back to where it was live (but its storage is "clobbered" now, oops).

Constructing an example that demonstrates how this happens is a bit tricky: the clobbered storage cannot be a register, because if it's call-clobbered it wouldn't be in use at the point of setjmp, and if it's call-saved longjmp would restore it (SPARC exception aside). So it needs to be forced onto the stack, without exposing the addresses of both variables in a way that makes their lifetimes overlap (which would prevent reuse of stack slots), and without making one of them go out of scope before longjmp.

With a bit of luck I managed to arrive at the following testcase, which when compiled with -O2 -mtune-ctrl=^inter_unit_moves_from_vec (view on Compiler Explorer):

//__attribute__((returns_twice))
int my_setjmp(void);

__attribute__((noreturn))
void my_longjmp(int);

static inline
int float_as_int(float x)
{
    return (union{float f; int i;}){x}.i;
}

float f(void);

int g(void)
{
    int ret = float_as_int(f());

    if (__builtin_expect(my_setjmp(), 1)) {
        int tmp = float_as_int(f());
        my_longjmp(tmp);
    }
    return ret;
}

produces the following assembly:

g:
        sub     rsp, 24
        call    f
        movss   DWORD PTR [rsp+12], xmm0
        call    my_setjmp
        test    eax, eax
        je      .L2
        call    f
        movss   DWORD PTR [rsp+12], xmm0
        mov     edi, DWORD PTR [rsp+12]
        call    my_longjmp
.L2:
        mov     eax, DWORD PTR [rsp+12]
        add     rsp, 24
        ret

The -mtune-ctrl=^inter_unit_moves_from_vec flag causes GCC to implement SSE-to-gpr moves via the stack, and both moves use the same stack slot, because as far as the compiler can tell, there's no conflict (computing 'tmp' leads to a noreturn function, so the temporary used for computing 'ret' is no longer needed). However, if my_longjmp transfers control back to my_setjmp, then after branching to label .L2 we try to read the value of 'ret' from the overwritten slot.
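Relating this back to the question's use case (saving some extra thread-local context): one conservative option is to avoid a custom setjmp entry point altogether, keep the standard setjmp at the call site, and put the extra save/restore into ordinary code around it. The following is only a sketch under that assumption; struct ctx_jmp_buf, ctx_save, ctx_longjmp, nesting_depth, deep_work and depth_after_recovery are hypothetical names made up for illustration, and the code needs C11 for _Thread_local and _Noreturn.

```c
#include <setjmp.h>

/* Hypothetical extra per-thread context that should travel with a jump buffer. */
static _Thread_local int nesting_depth;

struct ctx_jmp_buf {
    jmp_buf env;      /* standard jump buffer */
    int saved_depth;  /* extra context captured next to it */
};

/* Ordinary function: records the extra context. Call it right before setjmp. */
static void ctx_save(struct ctx_jmp_buf *b)
{
    b->saved_depth = nesting_depth;
}

/* Restores the extra context, then performs the standard longjmp. */
static _Noreturn void ctx_longjmp(struct ctx_jmp_buf *b, int val)
{
    nesting_depth = b->saved_depth;
    longjmp(b->env, val);
}

/* Changes the context, then bails out through the wrapper. */
static void deep_work(struct ctx_jmp_buf *b)
{
    nesting_depth += 3;
    ctx_longjmp(b, 1);
}

int depth_after_recovery(void)
{
    struct ctx_jmp_buf b;

    nesting_depth = 1;
    ctx_save(&b);
    /* setjmp itself stays the standard macro, used in a context ISO C permits. */
    if (setjmp(b.env) == 0)
        deep_work(&b);
    return nesting_depth;  /* restored by ctx_longjmp before the jump */
}
```

Because the compiler still sees a call to setjmp by name, it applies its usual returns-twice handling, and no attribute is needed. __attribute__((returns_twice)) only becomes relevant when the setjmp-like entry point itself is custom.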

Upvotes: 3

Peter Cordes

Reputation: 365207

GCC does do a bit of special handling for setjmp, matching it by name (after stripping leading _) along with sigsetjmp, vfork, getcontext, and savectx. On a match it sets the internal flag ECF_RETURNS_TWICE. I think this is equivalent to an implicit __attribute__((returns_twice)) (which you can use for your own functions). The glibc headers don't use that attribute; they just rely on the name matching. (An earlier version of this answer was fooled by that into thinking they weren't special-cased at all.)

longjmp doesn't need much special handling; it just looks like any other __attribute__((noreturn)) function call. Glibc declares longjmp that way, which should make side-effects on locals happen before a call to it, and for example avoids warnings about execution falling off the end of a non-void function in something like int foo(){ if(x) return y; longjmp(jmpbuf, 1); }
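To make the inline example concrete, here is a minimal sketch that compiles on its own. The globals x, y and jmpbuf and the driver try_foo are assumptions added for illustration; only the shape of foo comes from the text above.

```c
#include <setjmp.h>

static jmp_buf jmpbuf;
static int x, y;

/* No return statement after the longjmp call: because longjmp is declared
   noreturn, the compiler knows control never falls off the end here. */
int foo(void)
{
    if (x)
        return y;
    longjmp(jmpbuf, 1);
}

/* Hypothetical driver: returns foo's result, or -1 if foo bailed out
   through longjmp. */
int try_foo(int flag, int value)
{
    x = flag;
    y = value;
    if (setjmp(jmpbuf) != 0)
        return -1;
    return foo();
}
```

With this sketch, try_foo(1, 42) takes the normal return path, while try_foo(0, 42) comes back through the second return of setjmp.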

setjmp / longjmp don't guarantee much more than what any opaque function (not inlinable) would look like for the optimizer. (But one key difference involves not reusing stack space for separate locals when one could come back into scope when setjmp returns again, see @amonakov's answer.)

Side effects on non-volatile locals might have been re-ordered at compile time wrt. setjmp (or longjmp) if escape analysis can show that no global variable could have their address.

Optimization is still allowed to keep locals in registers instead of memory during a call to setjmp. That means side-effects on non-volatile variables done after setjmp, before longjmp, might or might not get rolled back when longjmp restores the call-preserved registers to the saved state in the jmp_buf.

The Linux man page for setjmp(3) lays out the rules:

The compiler may optimize variables into registers, and longjmp() may restore the values of other registers in addition to the stack pointer and program counter. Consequently, the values of automatic variables are unspecified after a call to longjmp() if they meet all the following criteria:

they are local to the function that made the corresponding setjmp() call;

their values are changed between the calls to setjmp() and longjmp(); and

they are not declared as volatile.
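A minimal sketch of a local that meets the first two criteria and therefore has to be volatile to keep a well-defined value. The names run_with_retries and fail are hypothetical, chosen only for this illustration.

```c
#include <setjmp.h>

static jmp_buf env;

/* Hypothetical helper: jumps back to the setjmp in run_with_retries. */
static void fail(void)
{
    longjmp(env, 1);
}

int run_with_retries(int max_attempts)
{
    /* Local to the function calling setjmp and modified between setjmp and
       longjmp, so it is declared volatile to keep its value well-defined. */
    volatile int attempts = 0;

    if (setjmp(env) != 0) {
        /* Arrived here via longjmp; 'attempts' holds the latest stored value. */
    }

    if (attempts < max_attempts) {
        attempts = attempts + 1;
        fail();  /* does not return; control resumes at the setjmp above */
    }
    return attempts;
}
```

Without the volatile qualifier, the value of attempts after the jump would be unspecified: it might be the updated value from memory or a stale one restored from a call-preserved register. max_attempts is never modified, so it needs no qualifier.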

From glibc's /usr/include/setjmp.h

// earlier CPP macros to define __THROWNL as __attribute__ ((__nothrow__)) in C++ mode

extern int setjmp (jmp_buf __env) __THROWNL;
extern void longjmp (struct __jmp_buf_tag __env[1], int __val)
     __THROWNL __attribute__ ((__noreturn__));
extern void siglongjmp (sigjmp_buf __env, int __val)
     __THROWNL __attribute__ ((__noreturn__));

There's a bunch of C preprocessor stuff to define an underscore-prefixed version (_setjmp, which doesn't save the signal mask) and so on.

BTW, there is a __builtin_setjmp. But it works somewhat differently: the GCC manual recommends against using it in user code, and the ISO C setjmp/longjmp library functions can't be defined in terms of it.

Upvotes: 3

Lundin

Reputation: 214275

First of all, the correct answer to why volatile works in the linked posts is "because the C standard explicitly says so." I don't think the quoted part is true, because C explicitly lists a lot of poorly-defined behavior associated with setjmp/longjmp. The relevant part can be found in C17 7.13.2.1:

All accessible objects have values, and all other components of the abstract machine have state, as of the time the longjmp function was called, except that the values of objects of automatic storage duration that are local to the function containing the invocation of the corresponding setjmp macro that do not have volatile-qualified type and have been changed between the setjmp invocation and longjmp call are indeterminate.

Even C90 says more or less the same as the above. So the reason compilers, modern or not, don't need to "fix this" is that C has never required them to. In the example where the quoted comment was posted, the second time that if ( foo != 5 ) is executed, the value of foo is indeterminate (and foo never has its address taken), so strictly speaking that line simply invokes undefined behavior and the compiler can do as it pleases from there. It's a bug created by the application programmer, not the optimizer.

Generally, any application programmer using setjmp.h will get what is coming to them. It is the worst possible form of spaghetti programming.

Upvotes: 2
