Hossam Benhammou

Reputation: 1

forward(__torch__.transformers.models.bert.modeling_bert.BertModel self, Tensor input_ids) -> ((Tensor, Tensor))

I am working with OpenSearch and want to deploy a custom model from Hugging Face. I followed their steps for saving and zipping the model, and everything works fine until I try to deploy the model, at which point I get this error:

opensearch-node1       | Caused by: ai.djl.engine.EngineException: forward() Expected a value of type 'Tensor' for argument 'input_ids' but instead found type 'Dict[str, Tensor]'.
opensearch-node1       | Position: 1
opensearch-node1       | Declaration: forward(__torch__.transformers.models.bert.modeling_bert.BertModel self, Tensor input_ids) -> ((Tensor, Tensor))
opensearch-node1       |    at ai.djl.pytorch.jni.PyTorchLibrary.moduleRunMethod(Native Method) ~[?:?]
opensearch-node1       |    at ai.djl.pytorch.jni.IValueUtils.forward(IValueUtils.java:53) ~[?:?]
opensearch-node1       |    at ai.djl.pytorch.engine.PtSymbolBlock.forwardInternal(PtSymbolBlock.java:145) ~[?:?]
opensearch-node1       |    at ai.djl.nn.AbstractBaseBlock.forward(AbstractBaseBlock.java:79) ~[?:?]
opensearch-node1       |    at ai.djl.nn.Block.forward(Block.java:127) ~[?:?]
opensearch-node1       |    at ai.djl.inference.Predictor.predictInternal(Predictor.java:140) ~[?:?]
opensearch-node1       |    at ai.djl.inference.Predictor.batchPredict(Predictor.java:180) ~[?:?]
opensearch-node1       |    at ai.djl.inference.Predictor.predict(Predictor.java:126) ~[?:?]
opensearch-node1       |    at org.opensearch.ml.engine.algorithms.TextEmbeddingModel.warmUp(TextEmbeddingModel.java:53) ~[?:?]
opensearch-node1       |    at org.opensearch.ml.engine.algorithms.DLModel.doLoadModel(DLModel.java:218) ~[?:?]
opensearch-node1       |    at org.opensearch.ml.engine.algorithms.DLModel.lambda$loadModel$1(DLModel.java:275) ~[?:?]
opensearch-node1       |    ... 14 more
opensearch-node1       | [2024-04-26T14:12:13,789][INFO ][o.o.m.a.d.TransportDeployModelOnNodeAction] [opensearch-node1] deploy model task done VAm-Go8BP7uFuwEn3GFP
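
Reading the log: the Declaration line shows a traced forward that takes a single Tensor named input_ids, while the warm-up call in TextEmbeddingModel passes a Dict[str, Tensor]. Below is a minimal sketch of that mismatch, not a confirmed fix. TinyEncoder and DictInputWrapper are hypothetical names, and the small embedding layer only stands in for BertModel so the snippet runs without downloading weights. Which output format OpenSearch expects from the traced model is not covered here.

```python
from typing import Dict

import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for BertModel (assumption, not the real model)."""

    def __init__(self) -> None:
        super().__init__()
        self.embed = nn.Embedding(100, 8)

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        # Zero out padded positions; a real encoder would run attention here.
        return self.embed(input_ids) * attention_mask.unsqueeze(-1).to(torch.float32)


class DictInputWrapper(nn.Module):
    """Hypothetical adapter: takes one dict and unpacks it for the encoder."""

    def __init__(self, encoder: nn.Module) -> None:
        super().__init__()
        self.encoder = encoder

    def forward(self, features: Dict[str, torch.Tensor]) -> torch.Tensor:
        return self.encoder(features["input_ids"], features["attention_mask"])


example = {
    "input_ids": torch.tensor([[1, 2, 3, 0]]),
    "attention_mask": torch.tensor([[1, 1, 1, 0]]),
}

encoder = TinyEncoder().eval()

# Traced with plain tensors: the resulting forward expects Tensor arguments.
tensor_traced = torch.jit.trace(
    encoder, (example["input_ids"], example["attention_mask"])
)

# Traced through the adapter: the resulting forward expects one dict.
dict_traced = torch.jit.trace(DictInputWrapper(encoder).eval(), (example,), strict=False)

# Feeding a dict to the tensor-traced module fails, as in the log above.
try:
    tensor_traced(example)
    tensor_rejects_dict = False
except Exception:
    tensor_rejects_dict = True

# The dict-traced module accepts the same dict.
output = dict_traced(example)
print(tensor_rejects_dict, tuple(output.shape))
```

If this reading is right, the trace would need to go through a dict-taking wrapper around the real BertModel rather than being called with tokens_tensor alone.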

Here is the script I use to save the model. OpenSearch requires the model in TorchScript format, zipped together with the model's tokenizer.

from transformers import BertModel, BertTokenizer, BertConfig
import torch
import transformers

enc = BertTokenizer.from_pretrained("sentence-transformers/LaBSE")

# Tokenizing input text
text = "[CLS] Who was Jim Henson ? [SEP] Jim Henson was a puppeteer [SEP]"
tokenized_text = enc.tokenize(text)
tokenizer = transformers.AutoTokenizer.from_pretrained("sentence-transformers/LaBSE")
tokenizer.save_pretrained("./second_attemot/tokenizer")
# Masking one of the input tokens
masked_index = 8
tokenized_text[masked_index] = "[MASK]"
indexed_tokens = enc.convert_tokens_to_ids(tokenized_text)
segments_ids = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]

# Creating a dummy input
tokens_tensor = torch.tensor([indexed_tokens])
segments_tensors = torch.tensor([segments_ids])
dummy_input = [tokens_tensor, segments_tensors]

# Initializing the model with the torchscript flag
# Flag set to True even though it is not necessary as this model does not have an LM Head.
# Note: this randomly initialized model is replaced by the from_pretrained call further down.
config = BertConfig(
    vocab_size=32000,
    hidden_size=768,
    num_hidden_layers=12,
    num_attention_heads=12,
    intermediate_size=3072,
    torchscript=True,
)

# Instantiating the model
model = BertModel(config)

# The model needs to be in evaluation mode
model.eval()
print(tokens_tensor.size())
print(segments_tensors.size())

# Assign segment 0 to the first num_sep_tokens positions and segment 1 to the rest
# (segments_tensors is built here but is not passed to the trace below)
num_sep_tokens = indexed_tokens.count(enc.sep_token_id)
segments_ids = [0] * num_sep_tokens + [1] * (len(indexed_tokens) - num_sep_tokens)

# Creating a dummy input
tokens_tensor = torch.tensor([indexed_tokens])
segments_tensors = torch.tensor([segments_ids])
dummy_input = [tokens_tensor, segments_tensors]

# If you are instantiating the model with *from_pretrained* you can also easily set the TorchScript flag
model = BertModel.from_pretrained("sentence-transformers/LaBSE", torchscript=True)

# Creating the trace (only tokens_tensor, i.e. input_ids, is given as the example input)
traced_model = torch.jit.trace(model, tokens_tensor)
torch.jit.save(traced_model, "./second_attemot/LaBSE.pt")
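
The zipping step mentioned above is not part of the script, so here is a minimal sketch of it using only the standard library. It assumes the archive should hold the traced .pt file and the tokenizer's tokenizer.json side by side at the top level; that layout is my assumption about what OpenSearch expects, and package_model is a hypothetical helper name.

```python
import zipfile
from pathlib import Path
from typing import List


def package_model(model_file: Path, tokenizer_file: Path, archive: Path) -> List[str]:
    """Write both files at the top level of a zip and return the member names."""
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # arcname drops the directory part so the files sit at the archive root.
        zf.write(model_file, arcname=model_file.name)
        zf.write(tokenizer_file, arcname=tokenizer_file.name)
    # Re-open the archive to report what was actually stored.
    with zipfile.ZipFile(archive) as zf:
        return zf.namelist()
```

With the paths from the script it would be called as package_model(Path("./second_attemot/LaBSE.pt"), Path("./second_attemot/tokenizer/tokenizer.json"), Path("./second_attemot/LaBSE.zip")), assuming save_pretrained wrote a tokenizer.json into that folder.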

