Jean-François Fabre

Reputation: 140186

Why am I getting "storage error" on my Ada shared library when running under a JVM?

We have an Ada shared library compiled with GNAT Pro 19.2 that we call from Java through JNA.

Our application runs fine under Windows. After porting it to Linux, the application crashes randomly with an Ada exception:

storage error or erroneous memory access.

Debugging with gdb (attaching to the process) doesn't help much. We get various SIGSEGV signals, we continue, and after a while we get the storage error with no usable call stack.

Our shared library can be used through a Python native call without any issues whatsoever, so the issue is probably on the Java side.

We tried switching JVMs (OpenJDK and the official JDK) without luck.

Why is this? Is there a way to work around it?

Upvotes: 3

Views: 1171

Answers (1)

Jean-François Fabre

Reputation: 140186

The first hint is that we get a bunch of SIGSEGV signals when a debugger is attached to the application, yet the program resumes normally each time we continue.

It means that the SIGSEGV signal is handled on the Java side, as confirmed in "Why does java app crash in gdb but runs normally in real life?":

Java uses speculative loads. If a pointer points to addressable memory, the load succeeds. Rarely the pointer does not point to addressable memory, and the attempted load generates SIGSEGV ... which java runtime intercepts, makes the memory addressable again, and restarts the load instruction.
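The "intercept, make addressable, restart" mechanism described above can be sketched in plain C on Linux. This is only an illustration of the general technique, not the JVM's actual code: the names `guarded_page`, `on_segv` and `demo_recoverable_fault` are hypothetical, and calling `mprotect` from a signal handler is assumed to be acceptable here because it works in practice on Linux.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static void *guarded_page;                 /* hypothetical page, mapped inaccessible at first */
static size_t page_size;
static volatile sig_atomic_t fault_count;  /* number of SIGSEGV signals we recovered from */

/* Make the faulting page accessible. Returning from the handler
   re-executes the instruction that faulted, which now succeeds. */
static void on_segv(int signo, siginfo_t *info, void *context)
{
    char *addr = (char *)info->si_addr;
    char *base = (char *)guarded_page;
    (void)signo;
    (void)context;
    if (addr < base || addr >= base + page_size)
        _exit(1);                          /* unexpected address: a genuine crash */
    fault_count++;
    if (mprotect(guarded_page, page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(2);
}

/* Returns the number of faults taken (expected: 1), or -1 on failure. */
int demo_recoverable_fault(void)
{
    struct sigaction sa, old;
    volatile char *p;
    char value;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    guarded_page = mmap(NULL, page_size, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guarded_page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    fault_count = 0;
    p = guarded_page;
    value = p[0];                          /* faults once, then the load is restarted */

    sigaction(SIGSEGV, &old, NULL);        /* restore the previous handler */
    munmap(guarded_page, page_size);
    return value == 0 ? (int)fault_count : -1;
}
```

Calling `demo_recoverable_fault()` from a small test program should take exactly one fault and then read the zero-filled page normally, which is why such a process looks like it "crashes" under gdb yet keeps running when continued.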

By default, the GNAT run-time installs its own signal handler to catch SIGSEGV and turn it into a clean Ada exception. One interesting feature of Ada exceptions is that they can print the stack trace, even without a debugger, and this SIGSEGV handler redirection is what makes that possible for memory errors.

But since Java uses speculative loads, SIGSEGV signals are expected from time to time on the Java side. Once the Ada shared library has been loaded and initialized, the Ada SIGSEGV handler is installed in place of the Java one, catches those "normal" SIGSEGV signals, and aborts immediately with a storage error.

Note that it doesn't happen under Windows. The Java runtime probably cannot use this speculative load mechanism there because of Windows limitations when handling memory access violations.

The signal handling is done in s-intman.adb:

      --  Check that treatment of exception propagation here is consistent with
      --  treatment of the abort signal in System.Task_Primitives.Operations.

      case signo is
         when SIGFPE  => raise Constraint_Error;
         when SIGILL  => raise Program_Error;
      --   when SIGSEGV => raise Storage_Error;  -- commenting this line should fix it
         when SIGBUS  => raise Storage_Error;
         when others  => null;
      end case;
   end Notify_Exception;

With that approach, we'd have to rebuild a new native run-time and use it instead of the default one, which is pretty tedious and error prone. That file is part of the gnarl library, so we'd have to rebuild the gnarl library dynamically with the proper options (-gnatp -nostdinc -O2 -fPIC) to create a substitute gnarl library, and do that again each time we upgrade the compiler.

Fortunately, an alternate solution was provided by AdaCore:

First create a pragmas file in the .gpr project directory (let's call it no_sigsegv.adc) containing:

pragma Interrupt_State (SIGSEGV, SYSTEM); 

This instructs the run-time not to install its own SIGSEGV handler.

Then add this to the Compiler package of the .gpr file:

   package Compiler is
      ...
      for Local_Configuration_Pragmas use Project'Project_Dir & "/no_sigsegv.adc";
   end Compiler;

Then rebuild everything from scratch. In our testing afterwards: not a single crash whatsoever.

Upvotes: 7
