ThomasVoss

Reputation: 1

Using DL4J in web application - RuntimeException: Op [matmul] execution failed

I have a MultiLayerNetwork that is loaded by an application deployed on a Payara server. The model loads fine, but when I request an output for an INDArray I receive a RuntimeException.

The system I'm working on appears to initialize correctly:

org.nd4j.linalg.factory.Nd4jBackend - Loaded [CpuBackend] backend
org.nd4j.nativeblas.NativeOpsHolder - Number of threads used for linear algebra: 4
org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory - Binary level Generic x86 optimization level AVX512
org.nd4j.nativeblas.Nd4jBlas - Number of threads used for OpenMP BLAS: 8
org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Backend used: [CPU]; OS: [Linux]
org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Cores: [16]; Memory: [7,1GB]
org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Blas vendor: [OPENBLAS]

org.nd4j.linalg.cpu.nativecpu.CpuBackend - Backend build information:
 GCC: "7.5.0"
STD version: 201103L
DEFAULT_ENGINE: samediff::ENGINE_CPU
HAVE_FLATBUFFERS
HAVE_OPENBLAS

org.deeplearning4j.nn.multilayer.MultiLayerNetwork - Starting MultiLayerNetwork with WorkspaceModes set to [training: ENABLED; inference: ENABLED], cacheMode set to [NONE]

Requesting a prediction results in:

ERROR org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner - Failed to execute op matmul. Attempted to execute with 2 inputs, 1 outputs, 2 targs,0 bargs and 3 iargs. Inputs: [(FLOAT,[1,9],c), (FLOAT,[9,400],f)]. Outputs: [(FLOAT,[1,400],f)]. tArgs: [1.0, 0.0]. iArgs: [0, 0, 0]. bArgs: -. Op own name: "81f000d9-6011-4d1f-adc4-af57fb7d11e6" - Please see above message (printed out from c++) for a possible cause of error.

with the following stack trace:

java.lang.RuntimeException: Op [matmul] execution failed
        at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1561) ~[nd4j-native-1.0.0-M2.jar:?]
        at org.nd4j.linalg.factory.Nd4j.exec(Nd4j.java:6522) ~[nd4j-api-1.0.0-M2.jar:?]
        at org.nd4j.linalg.api.blas.impl.BaseLevel3.gemm(BaseLevel3.java:62) ~[nd4j-api-1.0.0-M2.jar:?]
        at org.nd4j.linalg.api.ndarray.BaseNDArray.mmuli(BaseNDArray.java:3194) ~[nd4j-api-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.layers.BaseLayer.preOutputWithPreNorm(BaseLayer.java:322) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.layers.BaseLayer.preOutput(BaseLayer.java:295) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.layers.BaseLayer.activate(BaseLayer.java:343) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.layers.AbstractLayer.activate(AbstractLayer.java:262) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.outputOfLayerDetached(MultiLayerNetwork.java:1341) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2453) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2416) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2407) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2394) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2490) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at com.[****].lambda$nnRegression$1(myCustomJavaClass.java:264)

Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
        at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.getCustomOperations(NativeOpExecutioner.java:1365) ~[nd4j-native-1.0.0-M2.jar:?]
        at org.nd4j.linalg.api.ops.DynamicCustomOp.opHash(DynamicCustomOp.java:392) ~[nd4j-api-1.0.0-M2.jar:?]
        at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1900) ~[nd4j-native-1.0.0-M2.jar:?]
        at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1540) ~[nd4j-native-1.0.0-M2.jar:?]
        ... 139 more

I have created an MVP that runs the DL4J functions in a standalone application, and there both loading the model and running inference work fine. I would expect the same behavior on the application server.

MultiLayerNetwork net = MultiLayerNetwork.load(nn, false);

// Build a 1 x 9 observation row as the model input
List<Integer> obsList = new ArrayList<>();

obsList.add(1); obsList.add(18);
obsList.add(1); obsList.add(24);
obsList.add(1); obsList.add(15);
obsList.add(1); obsList.add(13);
obsList.add(2);

int[] obsArray = obsList.stream().mapToInt(Integer::intValue).toArray();
int[][] flat = new int[][] { obsArray };

INDArray test = Nd4j.create(flat);
INDArray y = net.output(test); // <---- THIS IS WHERE THE ERROR OCCURS
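
For comparison, the standalone MVP mentioned above is essentially the same calls wrapped in a plain main method. The sketch below is only illustrative; the class name and the model file path are placeholders, everything else mirrors the code shown:

import java.io.File;
import java.util.ArrayList;
import java.util.List;

import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class StandaloneMvp {
    public static void main(String[] args) throws Exception {
        File nn = new File("model.zip"); // placeholder path to the serialized model

        // Same load + inference pattern as in the server code above
        MultiLayerNetwork net = MultiLayerNetwork.load(nn, false);

        List<Integer> obsList = new ArrayList<>();
        obsList.add(1); obsList.add(18);
        obsList.add(1); obsList.add(24);
        obsList.add(1); obsList.add(15);
        obsList.add(1); obsList.add(13);
        obsList.add(2);

        int[] obsArray = obsList.stream().mapToInt(Integer::intValue).toArray();
        INDArray test = Nd4j.create(new int[][] { obsArray });

        INDArray y = net.output(test); // runs without errors standalone
        System.out.println(y);
    }
}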

The application is packaged in an EAR file, and the DL4J dependencies are declared with provided scope in the POM.

        <dependency>
            <groupId>org.deeplearning4j</groupId>
            <artifactId>deeplearning4j-core</artifactId>
            <version>1.0.0-M2</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>org.nd4j</groupId>
            <artifactId>nd4j-native-platform</artifactId>
            <version>1.0.0-M2</version>
            <scope>provided</scope>
        </dependency>
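
Because the artifacts are provided-scope, the server has to supply the DL4J/ND4J jars at runtime. As a diagnostic sketch (not part of the original setup; the class and method names are hypothetical), something like the following could log which backend and which jar location the application actually sees on Payara, for comparison with the standalone MVP:

import org.nd4j.linalg.factory.Nd4j;

public class Nd4jDiagnostics {
    // Prints which ND4J backend is active and where the Nd4j class was loaded from,
    // so the server classpath can be compared with the standalone MVP.
    public static void logBackendInfo() {
        System.out.println("ND4J backend: " + Nd4j.getBackend().getClass().getName());
        System.out.println("Nd4j loaded from: "
                + Nd4j.class.getProtectionDomain().getCodeSource().getLocation());
        System.out.println("ClassLoader: " + Nd4j.class.getClassLoader());
    }
}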

The server running this application is CentOS 8 with Payara Server 5.193.1 #badassfish (build 275).

Using the same pattern on the application server produces the stack trace above. I'm wondering what might cause this error.

Upvotes: 0

Views: 103

Answers (0)
