Mathijs Geelen

Reputation: 61

dropout(): argument 'input' (position 1) must be Tensor, not tuple when using XLNet with HuggingFace

I get an error saying that the input should be of type Tensor, not tuple. I do not know how to work around this, since I am already passing return_dict=False as described in the migration guide.

My model is as follows:

class XLNetClassifier(torch.nn.Module):
    def __init__(self, dropout_rate=0.1):
        super(XLNetClassifier, self).__init__()
        self.XLNet = XLNetModel.from_pretrained('xlnet-base-cased', return_dict=False)
        self.d1 = torch.nn.Dropout(dropout_rate)
        self.l1 = torch.nn.Linear(768, 64)
        self.bn1 = torch.nn.LayerNorm(64)
        self.d2 = torch.nn.Dropout(dropout_rate)
        self.l2 = torch.nn.Linear(64, 3)
        
    def forward(self, input_ids, attention_mask):
        x = self.XLNet(input_ids=input_ids, attention_masks = attention_mask)
        x = self.d1(x)
        x = self.l1(x)
        x = self.bn1(x)
        x = torch.nn.Tanh()(x)
        x = self.d2(x)
        x = self.l2(x)
        
        return x

The error occurs at the first dropout call (self.d1).

Upvotes: 6

Views: 12671

Answers (1)

cronoik

Reputation: 19385

The XLNetModel returns two output values:

  • last_hidden_state
  • mems

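That means the model returns a tuple rather than a single tensor, which is exactly what the error message is complaining about. Pass only the first element of the tuple (last_hidden_state) to the dropout layer. Your class definition should therefore be: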

from transformers import XLNetModel, XLNetTokenizerFast
import torch

class XLNetClassifier(torch.nn.Module):
    def __init__(self, dropout_rate=0.1):
        super(XLNetClassifier, self).__init__()
        self.XLNet = XLNetModel.from_pretrained('xlnet-base-cased', return_dict=False)
        self.d1 = torch.nn.Dropout(dropout_rate)
        self.l1 = torch.nn.Linear(768, 64)
        self.bn1 = torch.nn.LayerNorm(64)
        self.d2 = torch.nn.Dropout(dropout_rate)
        self.l2 = torch.nn.Linear(64, 3)
        
    def forward(self, input_ids, attention_mask):
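        # The keyword argument is attention_mask (singular)
        x = self.XLNet(input_ids=input_ids, attention_mask=attention_mask)
        # With return_dict=False the output is a tuple; element 0 is last_hidden_state
        x = self.d1(x[0])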
        x = self.l1(x)
        x = self.bn1(x)
        x = torch.nn.Tanh()(x)
        x = self.d2(x)
        x = self.l2(x)
        
        return x


tokenizer = XLNetTokenizerFast.from_pretrained('xlnet-base-cased')
model = XLNetClassifier()
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt", return_token_type_ids=False)
outputs = model(**inputs)
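
If you want to see the failure and the fix in isolation, here is a minimal sketch that needs only torch and no model download. The tuple fake_output is a hypothetical stand-in for what the encoder returns with return_dict=False; it is not real XLNet output.

```python
import torch

dropout = torch.nn.Dropout(0.1)

# Hypothetical stand-in for the encoder output with return_dict=False:
# a tuple whose first element plays the role of last_hidden_state.
hidden = torch.randn(2, 5, 768)  # (batch, seq_len, hidden_size)
fake_output = (hidden, None)

try:
    dropout(fake_output)  # same mistake as self.d1(x) in the question
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Indexing the tuple first hands dropout an actual tensor.
fixed = dropout(fake_output[0])
print(fixed.shape)
```

Dropout does not change the shape, so fixed has the same shape as hidden.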

Or, even better, define the class without return_dict=False and read last_hidden_state by name:

class XLNetClassifier(torch.nn.Module):
    def __init__(self, dropout_rate=0.1):
        super(XLNetClassifier, self).__init__()
        self.XLNet = XLNetModel.from_pretrained('xlnet-base-cased')
        self.d1 = torch.nn.Dropout(dropout_rate)
        self.l1 = torch.nn.Linear(768, 64)
        self.bn1 = torch.nn.LayerNorm(64)
        self.d2 = torch.nn.Dropout(dropout_rate)
        self.l2 = torch.nn.Linear(64, 3)
        
    def forward(self, input_ids, attention_mask):
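        # The keyword argument is attention_mask (singular)
        x = self.XLNet(input_ids=input_ids, attention_mask=attention_mask)
        # The default output is an object with named fields
        x = self.d1(x.last_hidden_state)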
        x = self.l1(x)
        x = self.bn1(x)
        x = torch.nn.Tanh()(x)
        x = self.d2(x)
        x = self.l2(x)
        
        return x
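
One thing to check once the error is gone: last_hidden_state has one vector per token, so the head above produces one set of logits per token. The sketch below runs the same head on a random tensor (a stand-in, not real XLNet output) to show the shapes. Picking the last position to get one prediction per sequence is my assumption about what you want; as far as I know it mirrors what XLNet's own classification head does by default, but check it against your task.

```python
import torch

# Same layers as the classifier head above, without the encoder.
head = torch.nn.Sequential(
    torch.nn.Dropout(0.1),
    torch.nn.Linear(768, 64),
    torch.nn.LayerNorm(64),
    torch.nn.Tanh(),
    torch.nn.Dropout(0.1),
    torch.nn.Linear(64, 3),
)
head.eval()  # disable dropout for a deterministic shape check

# Random stand-in for last_hidden_state: (batch, seq_len, hidden_size)
last_hidden_state = torch.randn(2, 7, 768)

with torch.no_grad():
    # Feeding the full tensor gives logits for every token.
    per_token_logits = head(last_hidden_state)
    # Assumption: use the last position as the sequence summary.
    per_sequence_logits = head(last_hidden_state[:, -1])

print(per_token_logits.shape)     # torch.Size([2, 7, 3])
print(per_sequence_logits.shape)  # torch.Size([2, 3])
```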

Upvotes: 3
