Reputation: 4611
I want to export a roberta-base language model to ONNX format. The model uses RoBERTa embeddings and performs a text classification task.
import os
from typing import List

from torch import nn
import torch
import torch.onnx
import onnx
import onnxruntime
from onnxruntime import InferenceSession
import transformers
From the logs:
pytorch: 1.10.2+cu113
CUDA: False
device: cpu
onnxruntime: 1.10.0
onnx: 1.11.0
PyTorch export:
batch_size = 3
# dummy inputs for tracing: random token ids, attention mask and sequence lengths
model_input = {
    'input_ids': torch.empty(batch_size, 256, dtype=torch.int).random_(32000),
    'attention_mask': torch.empty(batch_size, 256, dtype=torch.int).random_(2),
    'seq_len': torch.empty(batch_size, 1, dtype=torch.int).random_(256)
}
model_file_path = os.path.join("checkpoints", 'model.onnx')
torch.onnx.export(da_inference.model,       # model being run
                  model_input,              # model input (or a tuple for multiple inputs)
                  model_file_path,          # where to save the model (can be a file or file-like object)
                  export_params=True,       # store the trained parameter weights inside the model file
                  opset_version=11,         # the ONNX version to export the model to
                  operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK,
                  do_constant_folding=True, # whether to execute constant folding for optimization
                  input_names=['input_ids', 'attention_mask', 'seq_len'],  # the model's input names
                  output_names=['output'],  # the model's output names
                  dynamic_axes={'input_ids': {0: 'batch_size'},
                                'attention_mask': {0: 'batch_size'},
                                'seq_len': {0: 'batch_size'},
                                'output': {0: 'batch_size'}},
                  verbose=True)
I know there may be problems converting some ATen (A Tensor Library for C++11) operators if they are included in the model architecture; see PyTorch Model Export to ONNX Failed Due to ATen.
The export succeeds if I set the parameter operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK, which means 'leave ATen operators as-is if they are not supported in ONNX'.
The PyTorch export function gives me the following warnings:
Warning: Unsupported operator ATen. No schema registered for this operator.
Warning: Shape inference does not support models with experimental operators: ATen
It looks like the only ATen operators in the model that are not converted to ONNX are the layer_norm ops that consume the LayerNorm.weight and LayerNorm.bias parameters (I have several layers like that):
%1266 : Float(3, 256, 768, strides=[196608, 768, 1], requires_grad=0, device=cpu) =
onnx::ATen[cudnn_enable=1, eps=1.0000000000000001e-05, normalized_shape=[768], operator="layer_norm"]
(%1265, %model.utterance_rnn.base.encoder.layer.11.output.LayerNorm.weight,
%model.utterance_rnn.base.encoder.layer.11.output.LayerNorm.bias)
# /opt/conda/lib/python3.9/site-packages/torch/nn/functional.py:2347:0
Then the model check passes OK:
model = onnx.load(model_file_path)
# Check that the model is well formed
onnx.checker.check_model(model)
# Print a human readable representation of the graph
print(onnx.helper.printable_graph(model.graph))
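A quick way to confirm which nodes were left as ATen fallbacks is to scan the graph of the loaded model directly. A minimal sketch (op_type, name and attribute are standard fields of ONNX graph nodes):
# collect every node that was exported as the experimental ATen op
aten_nodes = [node for node in model.graph.node if node.op_type == "ATen"]
for node in aten_nodes:
    print(node.name, [attr.name for attr in node.attribute])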
I can also visualize the computation graph using Netron.
But when I try to perform inference using the exported ONNX model, it stalls with no logs or stdout. So this code will hang the system:
model_file_path = os.path.join("checkpoints", "model.onnx")
sess_options = onnxruntime.SessionOptions()
sess_options.log_severity_level = 0  # most verbose logging
use_gpu = False  # CUDA is not available here (see logs above)
ort_providers: List[str] = ["CUDAExecutionProvider"] if use_gpu else ['CPUExecutionProvider']
session = InferenceSession(model_file_path, providers=ort_providers, sess_options=sess_options)
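For completeness, if the session did load, the run call would look roughly like this (a sketch assuming the same dummy inputs as in the export snippet, converted to numpy; 'output' matches the exported output name):
ort_inputs = {name: tensor.numpy() for name, tensor in model_input.items()}
ort_outputs = session.run(['output'], ort_inputs)
print(ort_outputs[0].shape)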
Are there any suggestions to overcome this problem? From the official documentation I see that torch.onnx models exported this way are probably runnable only by Caffe2.
These layers are not inside the frozen base roberta model; they are additional layers that I added myself. Is it possible to substitute the offending layers with similar ones and retrain the model?
Or is Caffe2 the best choice here, and onnxruntime will not do the inference?
Update: I retrained the model on the basis of BERT cased embeddings, but the problem persists. The same ATen operators are not converted to ONNX. It looks like the layers with LayerNorm.weight and LayerNorm.bias exist only in the model above BERT. So, what are your suggestions for changing these layers and enabling ONNX export?
Upvotes: 3
Views: 7819
Reputation: 98
The best way to go will be to rewrite the place in the model that uses these operators in a way that will convert; look at this for reference. If, for example, the issue is layer norm, then you can write it yourself, as sketched below. Another thing that sometimes helps is not setting the axes as dynamic, since some ops don't support that yet.
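A minimal sketch of such a hand-written layer norm (the class name is illustrative; it mirrors nn.LayerNorm over the last dimension, so the exporter traces primitive ops instead of aten::layer_norm):
import torch
from torch import nn

class ExportableLayerNorm(nn.Module):
    """Layer norm over the last dimension, built from primitive tensor ops."""
    def __init__(self, hidden_size, eps=1e-5):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.bias = nn.Parameter(torch.zeros(hidden_size))
        self.eps = eps

    def forward(self, x):
        mean = x.mean(-1, keepdim=True)
        # biased variance, matching nn.LayerNorm
        var = (x - mean).pow(2).mean(-1, keepdim=True)
        x = (x - mean) / torch.sqrt(var + self.eps)
        return self.weight * x + self.bias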
Upvotes: 1
Reputation: 903
Have you tried to export after defining the operator for ONNX? Something along the lines of the following code by Huawei.
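As an illustration (not the linked Huawei code), a symbolic function for aten::layer_norm can be registered so the exporter decomposes it into primitive ONNX ops instead of emitting an experimental ATen node. A sketch using torch.onnx.register_custom_op_symbolic, registered for opset 11 to match the export call in the question:
import torch
from torch.onnx import register_custom_op_symbolic
from torch.onnx.symbolic_helper import parse_args

@parse_args('v', 'is', 'v', 'v', 'f', 'b')
def layer_norm_symbolic(g, input, normalized_shape, weight, bias, eps, cudnn_enable):
    # normalize over the trailing dims given by normalized_shape
    axes = [-i for i in range(len(normalized_shape), 0, -1)]
    two = g.op("Constant", value_t=torch.tensor(2.0))
    eps_const = g.op("Constant", value_t=torch.tensor(eps))
    mean = g.op("ReduceMean", input, axes_i=axes)
    numerator = g.op("Sub", input, mean)
    variance = g.op("ReduceMean", g.op("Pow", numerator, two), axes_i=axes)
    normalized = g.op("Div", numerator, g.op("Sqrt", g.op("Add", variance, eps_const)))
    return g.op("Add", g.op("Mul", normalized, weight), bias)

register_custom_op_symbolic('aten::layer_norm', layer_norm_symbolic, 11)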
On another note, when loading a model, you can technically override anything you want. Setting a specific layer to an instance of your modified class that inherits from the original keeps the same behavior (inputs and outputs), but its execution can be changed. You can try to use this to save the model with the problematic operators replaced, transform it to ONNX, and fine-tune it in that form (or even in PyTorch); see the sketch below.
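A rough sketch of that swap, assuming the hand-written ExportableLayerNorm from the earlier answer (the helper name is illustrative); it walks the module tree and replaces every nn.LayerNorm while reusing the trained parameters:
from torch import nn

def replace_layer_norms(model):
    """Swap every nn.LayerNorm for the hand-written equivalent, keeping weights."""
    for module in model.modules():
        for child_name, child in module.named_children():
            if isinstance(child, nn.LayerNorm):
                replacement = ExportableLayerNorm(child.normalized_shape[-1], eps=child.eps)
                replacement.weight = child.weight  # reuse trained parameters
                replacement.bias = child.bias
                setattr(module, child_name, replacement)
    return model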
This generally seems best solved by the ONNX team, so a long-term solution might be to post a request for that specific operator on the GitHub issues page (but that will probably be slow).
Upvotes: 1