Reputation: 4611
I want to export a roberta-base language model to ONNX format. The model uses RoBERTa embeddings and performs a text classification task.
import os
from typing import List

from torch import nn
import torch
import torch.onnx
import onnx
import onnxruntime
from onnxruntime import InferenceSession
import transformers
From the logs:
pytorch: 1.10.2+cu113
CUDA: False
device: cpu
onnxruntime: 1.10.0
onnx: 1.11.0
PyTorch export:
batch_size = 3
# dummy inputs for tracing: random token ids, attention mask and sequence lengths
model_input = {
    'input_ids': torch.empty(batch_size, 256, dtype=torch.int).random_(32000),
    'attention_mask': torch.empty(batch_size, 256, dtype=torch.int).random_(2),
    'seq_len': torch.empty(batch_size, 1, dtype=torch.int).random_(256)
}
model_file_path = os.path.join("checkpoints", 'model.onnx')
torch.onnx.export(da_inference.model,       # model being run
                  model_input,              # model input (or a tuple for multiple inputs)
                  model_file_path,          # where to save the model (can be a file or file-like object)
                  export_params=True,       # store the trained parameter weights inside the model file
                  opset_version=11,         # the ONNX version to export the model to
                  operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK,
                  do_constant_folding=True, # whether to execute constant folding for optimization
                  input_names=['input_ids', 'attention_mask', 'seq_len'],  # the model's input names
                  output_names=['output'],  # the model's output names
                  dynamic_axes={'input_ids': {0: 'batch_size'},
                                'attention_mask': {0: 'batch_size'},
                                'seq_len': {0: 'batch_size'},
                                'output': {0: 'batch_size'}},
                  verbose=True)
I know there may be problems converting some ATen (A Tensor Library for C++11) operators if they are included in the model architecture; see PyTorch Model Export to ONNX Failed Due to ATen.
The export succeeds if I set the parameter operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK, which means 'leave ATen operators as-is if they are not supported in ONNX'.
The PyTorch export function gives me the following warnings:
Warning: Unsupported operator ATen. No schema registered for this operator.
Warning: Shape inference does not support models with experimental operators: ATen
It looks like the only ATen operators in the model that are not converted to ONNX are the layer_norm ops that consume the LayerNorm.weight and LayerNorm.bias parameters (I have several layers like that):
%1266 : Float(3, 256, 768, strides=[196608, 768, 1], requires_grad=0, device=cpu) =
onnx::ATen[cudnn_enable=1, eps=1.0000000000000001e-05, normalized_shape=[768], operator="layer_norm"]
(%1265, %model.utterance_rnn.base.encoder.layer.11.output.LayerNorm.weight,
%model.utterance_rnn.base.encoder.layer.11.output.LayerNorm.bias)
# /opt/conda/lib/python3.9/site-packages/torch/nn/functional.py:2347:0
Then the model check passes OK:
model = onnx.load(model_file_path)
# Check that the model is well formed
onnx.checker.check_model(model)
# Print a human readable representation of the graph
print(onnx.helper.printable_graph(model.graph))
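A quick way to confirm which nodes were left as ATen fallbacks is to scan the graph of the loaded model directly. A minimal sketch (op_type, name and attribute are standard fields of ONNX graph nodes):
# collect every node that was exported as the experimental ATen op
aten_nodes = [node for node in model.graph.node if node.op_type == "ATen"]
for node in aten_nodes:
    print(node.name, [attr.name for attr in node.attribute])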
I can also visualize the computation graph using Netron.
But when I try to perform inference using the exported ONNX model, it stalls with no logs or stdout. So this code will hang the system:
model_file_path = os.path.join("checkpoints", "model.onnx")
sess_options = onnxruntime.SessionOptions()
sess_options.log_severity_level = 0  # most verbose logging
use_gpu = False  # CUDA is not available here (see logs above)
ort_providers: List[str] = ["CUDAExecutionProvider"] if use_gpu else ['CPUExecutionProvider']
session = InferenceSession(model_file_path, providers=ort_providers, sess_options=sess_options)
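For completeness, if the session did load, the run call would look roughly like this (a sketch assuming the same dummy inputs as in the export snippet, converted to numpy; 'output' matches the exported output name):
ort_inputs = {name: tensor.numpy() for name, tensor in model_input.items()}
ort_outputs = session.run(['output'], ort_inputs)
print(ort_outputs[0].shape)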
Are there any suggestions to overcome this problem? From the official documentation I see that torch.onnx models exported this way are probably runnable only by Caffe2.
These layers are not inside the frozen base roberta model; they are additional layers that I added myself. Is it possible to substitute the offending layers with similar ones and retrain the model?
Or is Caffe2 the best choice here, and onnxruntime will not do the inference?
Update: I retrained the model on the basis of BERT cased embeddings, but the problem persists. The same ATen operators are not converted to ONNX. It looks like the layers with LayerNorm.weight and LayerNorm.bias exist only in the model above BERT. So, what are your suggestions for changing these layers and enabling ONNX export?
Upvotes: 3
Views: 7819
Reputation: 98
The best way to go will be to rewrite the place in the model that uses these operators in a way that will convert; look at this for reference. If, for example, the issue is layer norm, then you can write it yourself, as sketched below. Another thing that sometimes helps is not setting the axes as dynamic, since some ops don't support that yet.
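A minimal sketch of such a hand-written layer norm (the class name is illustrative; it mirrors nn.LayerNorm over the last dimension, so the exporter traces primitive ops instead of aten::layer_norm):
import torch
from torch import nn

class ExportableLayerNorm(nn.Module):
    """Layer norm over the last dimension, built from primitive tensor ops."""
    def __init__(self, hidden_size, eps=1e-5):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.bias = nn.Parameter(torch.zeros(hidden_size))
        self.eps = eps

    def forward(self, x):
        mean = x.mean(-1, keepdim=True)
        # biased variance, matching nn.LayerNorm
        var = (x - mean).pow(2).mean(-1, keepdim=True)
        x = (x - mean) / torch.sqrt(var + self.eps)
        return self.weight * x + self.bias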
Upvotes: 1
Reputation: 903
Have you tried to export after defining the operator for ONNX? Something along the lines of the following code by Huawei.
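As an illustration (not the linked Huawei code), a symbolic function for aten::layer_norm can be registered so the exporter decomposes it into primitive ONNX ops instead of emitting an experimental ATen node. A sketch using torch.onnx.register_custom_op_symbolic, registered for opset 11 to match the export call in the question:
import torch
from torch.onnx import register_custom_op_symbolic
from torch.onnx.symbolic_helper import parse_args

@parse_args('v', 'is', 'v', 'v', 'f', 'b')
def layer_norm_symbolic(g, input, normalized_shape, weight, bias, eps, cudnn_enable):
    # normalize over the trailing dims given by normalized_shape
    axes = [-i for i in range(len(normalized_shape), 0, -1)]
    two = g.op("Constant", value_t=torch.tensor(2.0))
    eps_const = g.op("Constant", value_t=torch.tensor(eps))
    mean = g.op("ReduceMean", input, axes_i=axes)
    numerator = g.op("Sub", input, mean)
    variance = g.op("ReduceMean", g.op("Pow", numerator, two), axes_i=axes)
    normalized = g.op("Div", numerator, g.op("Sqrt", g.op("Add", variance, eps_const)))
    return g.op("Add", g.op("Mul", normalized, weight), bias)

register_custom_op_symbolic('aten::layer_norm', layer_norm_symbolic, 11)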
On another note, when loading a model, you can technically override anything you want. Setting a specific layer to an instance of your modified class that inherits from the original keeps the same behavior (inputs and outputs), but its execution can be changed. You can try to use this to save the model with the problematic operators replaced, transform it to ONNX, and fine-tune it in that form (or even in PyTorch); see the sketch below.
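A rough sketch of that swap, assuming the hand-written ExportableLayerNorm from the earlier answer (the helper name is illustrative); it walks the module tree and replaces every nn.LayerNorm while reusing the trained parameters:
from torch import nn

def replace_layer_norms(model):
    """Swap every nn.LayerNorm for the hand-written equivalent, keeping weights."""
    for module in model.modules():
        for child_name, child in module.named_children():
            if isinstance(child, nn.LayerNorm):
                replacement = ExportableLayerNorm(child.normalized_shape[-1], eps=child.eps)
                replacement.weight = child.weight  # reuse trained parameters
                replacement.bias = child.bias
                setattr(module, child_name, replacement)
    return model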
This generally seems best solved by the ONNX team, so a long-term solution might be to post a request for that specific operator on the GitHub issues page (but that will probably be slow).
Upvotes: 1