I am building an LSTM-based seq2seq model that maps an input sentence to a sequence of slot tags.
For instance:
Input sentence: My name is James Bond
Output slots: O O O B-name I-name
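For context, I map words and slot labels to integer ids before training, roughly like this (a sketch only; my real vocab2sent / vocab2slot are built over the whole training corpus):

# Illustrative encoding step; the real dicts cover the full corpus.
sentence = "My name is James Bond".split()
slots = ["O", "O", "O", "B-name", "I-name"]

vocab2sent = {w: i for i, w in enumerate(sorted(set(sentence)))}
vocab2slot = {t: i for i, t in enumerate(sorted(set(slots)))}

word_ids = [vocab2sent[w] for w in sentence]  # one integer id per word
slot_ids = [vocab2slot[t] for t in slots]     # one integer id per slot tag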
I am unable to figure out the cause of the following error:

IndexError: index out of range in self
> <ipython-input-37-19283c592e18>(12)<module>()
     10     set_trace()
     11     inputs = torch.tensor(training_data[0][0])
---> 12     tag_scores = model(inputs)
     13     print(tag_scores)
The error is raised when I execute the following code:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class LSTMTagger(nn.Module):
    def __init__(self, embedding_dim, hidden_dim, vocab_size, tagset_size):
        super(LSTMTagger, self).__init__()
        self.hidden_dim = hidden_dim
        self.word_embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim)
        self.hidden2tag = nn.Linear(hidden_dim, tagset_size)

    def forward(self, sentence):
        embeds = self.word_embeddings(sentence)
        # nn.LSTM expects (seq_len, batch, input_size); the batch size is 1 here
        lstm_out, _ = self.lstm(embeds.view(len(sentence), 1, -1))
        tag_space = self.hidden2tag(lstm_out.view(len(sentence), -1))
        tag_scores = F.log_softmax(tag_space, dim=1)
        return tag_scores

model = LSTMTagger(EMBEDDING_DIM, HIDDEN_DIM, len(vocab2sent), len(vocab2slot))
loss_function = nn.NLLLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Score the first training sentence before training
with torch.no_grad():
    inputs = torch.tensor(training_data[0][0])
    tag_scores = model(inputs)
    print(tag_scores)

for epoch in range(300):
    for sentence, tags in training_data:
        model.zero_grad()
        sentence_in = torch.tensor(sentence, dtype=torch.long)
        targets = torch.tensor(tags, dtype=torch.long)
        tag_scores = model(sentence_in)
        # NLLLoss must be computed on the model's output, not the input ids
        loss = loss_function(tag_scores, targets)
        loss.backward()
        optimizer.step()

# Score the same sentence after training
with torch.no_grad():
    inputs = prepare_sequence(training_data[0][0], vocab2sent)
    tag_scores = model(inputs)
    print(tag_scores)
My variable values:

vocab2sent - dict mapping each word in the input vocabulary to a unique number
vocab2slot - dict mapping each slot label in the output vocabulary to a unique number
inputs - tensor([ 229, 1056,  701,  330, 1093,   37,  166,  517, 1150, 1150, 1150, 1150,
                 1150, 1150, 1150, 1150, 1150, 1150, 1150, 1150, 1150])
The model at runtime:

LSTMTagger(
  (word_embeddings): Embedding(1148, 560)
  (lstm): LSTM(560, 560)
  (hidden2tag): Linear(in_features=560, out_features=28, bias=True)
)
The vocabulary size of your Embedding layer is 1148 (Embedding(1148, 560)), so the only valid input indices are 0 through 1147, yet your inputs tensor contains the index 1150. Looking up an embedding for an out-of-range index is exactly what raises IndexError: index out of range in self. The repeated trailing 1150s look like a padding id that is not counted in len(vocab2sent), which would explain the mismatch.
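A minimal sketch that reproduces the error, using an embedding with your dimensions (the sizes are copied from your model printout; everything else is illustrative):

import torch
import torch.nn as nn

emb = nn.Embedding(1148, 560)              # valid indices: 0 .. 1147
print(emb(torch.tensor([0, 1147])).shape)  # works: torch.Size([2, 560])
emb(torch.tensor([1150]))                  # IndexError: index out of range in self

The fix is to guarantee that every id you feed the model is smaller than the embedding size: either register the padding token in vocab2sent before taking len(vocab2sent), or compute the vocabulary size from the largest id that actually appears in your encoded data (e.g. inputs.max().item() + 1).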