Datadaddy

Reputation: 21

Signal Handling on Linux when using Java/JNI

I work on an embedded system running on Wind River Linux.

It is a mix of Java and C++ with some JNI for communication between technologies.

We have built custom error handling so that, in the event of any unexpected error, we generate backtraces and other information to help us determine the problem.

This error handling is always done by a C++ component that all other components must register with (so that the appropriate signal handlers can be installed).

So in the case of a Java component, we use JNI to communicate with the C++ error handler.

Our test program uses 35 different scenarios to test all the various types of errors (out of memory, unhandled exceptions, access violations, stack overflows, etc.). This is done both on a single main thread and on background threads.

All tests work properly except for a stack overflow caused in JNI code, on both the main thread and a background thread.

On Linux, a stack overflow should generate a SIGSEGV, and the handler installed with sigaction should be invoked. Instead, the process simply terminates, i.e. the handler does not get called.

If, instead of generating a stack overflow, we directly cause a SIGSEGV (signal 11), our signal handler does get invoked properly.

Note that we also LD_PRELOAD the Oracle (Java) provided libjsig.so. This is supposedly required to correctly install custom signal handlers when using JNI (and if we do not do it, other test cases fail).

Strangely, if I run the test without the LD_PRELOAD, the signal handler does get invoked for this case.

I am looking for ideas on how to debug or solve this problem.

Upvotes: 2

Views: 2625

Answers (1)

Andrew Henle

Reputation: 1

When I had to write JNI code to handle SIGSEGV et al - I had code that had to clean up some file state on abnormal termination - I found it easier to just manually chain a SIGABRT handler and not use libjsig.so at all. The JVM always seemed to terminate abnormally with a SIGABRT - I'd cause a fatal SIGSEGV, which the JVM would handle and translate to a SIGABRT. It didn't seem to matter what I did.

I can't find it in the Oracle documentation right now, but IBM documents JVM signal handling thus:

Errors

The JVM raises a SIGABRT if it detects a condition from which it cannot recover.

A version of my code (abbreviated in order to eliminate the scroll bar):

#include <signal.h>   /* sigaction, siginfo_t, SIGABRT */
#include <string.h>   /* memset */

typedef void ( *sigaction_handler_t )( int, siginfo_t *, void * );
static sigaction_handler_t original_sigabort_handler = NULL;

static void handler( int sig, siginfo_t *info, void *arg )
{
    switch ( sig )
    {
    case SIGABRT:
        //do stuff - stack trace, setrlimit() to generate core file, etc.
        if ( NULL != original_sigabort_handler )
        {
            original_sigabort_handler( sig, info, arg );
        }
        break;
    default:
        break;
    }
}

__attribute__(( constructor )) void library_init_code( void )
{
    struct sigaction new_act, old_act;
    memset( &new_act, 0, sizeof( new_act ) );
    memset( &old_act, 0, sizeof( old_act ) );
    sigemptyset( &( new_act.sa_mask ) );
    new_act.sa_sigaction = handler;
    new_act.sa_flags = SA_SIGINFO;

    sigaction( SIGABRT, &new_act, &old_act );
    if ( ( old_act.sa_sigaction != ( sigaction_handler_t ) SIG_IGN ) &&
         ( old_act.sa_sigaction != ( sigaction_handler_t ) SIG_DFL ) )
    {
        original_sigabort_handler = old_act.sa_sigaction;
    }
}
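
One refinement that may be worth considering when chaining by hand: the previously installed handler is not necessarily a three-argument SA_SIGINFO handler. The sketch below is not the code I actually ran. It keeps the whole previous struct sigaction and checks its sa_flags before deciding how to call the old handler. All names here (chained_abrt_handler, install_chained_abrt_handler, earlier_handler, demo_chain) are hypothetical and exist only for illustration, and the demo uses raise() rather than a real JVM abort.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Previous SIGABRT disposition, saved when our handler is installed. */
static struct sigaction previous_abrt_action;

/* Counters used only by the demo below (hypothetical names). */
static volatile sig_atomic_t chained_calls = 0;
static volatile sig_atomic_t earlier_calls = 0;

static void chained_abrt_handler(int sig, siginfo_t *info, void *context)
{
    chained_calls++;
    /* Own work goes here; restrict it to async-signal-safe calls. */

    if (previous_abrt_action.sa_flags & SA_SIGINFO) {
        /* Old handler was registered with the three-argument form. */
        if (previous_abrt_action.sa_sigaction != NULL) {
            previous_abrt_action.sa_sigaction(sig, info, context);
        }
    } else if (previous_abrt_action.sa_handler != SIG_DFL &&
               previous_abrt_action.sa_handler != SIG_IGN) {
        /* Old handler was registered with the one-argument form. */
        previous_abrt_action.sa_handler(sig);
    }
}

/* Installs the chaining handler; returns 0 on success, -1 on failure. */
int install_chained_abrt_handler(void)
{
    struct sigaction new_action;
    memset(&new_action, 0, sizeof(new_action));
    sigemptyset(&new_action.sa_mask);
    new_action.sa_sigaction = chained_abrt_handler;
    new_action.sa_flags = SA_SIGINFO;
    return sigaction(SIGABRT, &new_action, &previous_abrt_action);
}

/* Stand-in for a handler some other component installed earlier. */
static void earlier_handler(int sig)
{
    (void)sig;
    earlier_calls++;
}

/* Demo: returns 0 if both handlers ran exactly once, -1 otherwise. */
int demo_chain(void)
{
    struct sigaction earlier;
    chained_calls = 0;
    earlier_calls = 0;

    memset(&earlier, 0, sizeof(earlier));
    sigemptyset(&earlier.sa_mask);
    earlier.sa_handler = earlier_handler;   /* one-argument form, no SA_SIGINFO */
    if (sigaction(SIGABRT, &earlier, NULL) != 0) return -1;

    if (install_chained_abrt_handler() != 0) return -1;

    /* Both handlers return, so raise() returns instead of terminating. */
    if (raise(SIGABRT) != 0) return -1;

    return (chained_calls == 1 && earlier_calls == 1) ? 0 : -1;
}
```

Calling demo_chain() from a small test program should show both handlers firing for a single SIGABRT. Whether this ordering plays well with the JVM's own handlers is something you would have to verify on your platform.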

Upvotes: 4
