Reputation: 101
I understand why this error usually occurs: an input index is >= num_embeddings (the vocabulary size of the Embedding).
However, in my case torch.max(inputs) = num_embeddings - 1.
print('inputs: ', src_seq)
print('input_shape: ', src_seq.shape)
print(self.src_word_emb)

Output:

inputs: tensor([[10, 6, 2, 4, 9, 14, 6, 2, 5, 0],
        [12, 6, 3, 8, 13, 2, 0, 1, 1, 1],
        [13, 8, 12, 7, 2, 4, 0, 1, 1, 1]])
input_shape: [3, 10]
Embedding(15, 512, padding_idx=1)

The embedding call that raises the error:

emb = self.src_word_emb(src_seq)
I am trying to get a Transformer model to work, and for some reason the encoder embedding only accepts indices smaller than the decoder embedding's num_embeddings, which does not make sense, right?
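For reference, here is a minimal standalone sketch (not the model code, just a plain nn.Embedding with the same sizes) showing that a lookup only fails once an index reaches num_embeddings, which is why a maximum index of 14 should be fine for Embedding(15, 512):

import torch
import torch.nn as nn

emb = nn.Embedding(15, 512, padding_idx=1)   # num_embeddings=15, embedding_dim=512
out = emb(torch.tensor([[14, 1, 0]]))        # 14 == num_embeddings - 1, works
print(out.shape)                             # torch.Size([1, 3, 512])
# emb(torch.tensor([[15]]))                  # raises IndexError: index out of range in self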
Upvotes: 1
Views: 75
Reputation: 101
Found the source of the error! In the Transformer model the encoder and decoder can be set up to share the same embedding weights. However, I have a translation task with one embedding for the encoder and one for the decoder, and their vocabulary sizes differ. In the code the weights are tied via:
if emb_src_trg_weight_sharing:
self.encoder.src_word_emb.weight = self.decoder.trg_word_emb.weight
Setting emb_src_trg_weight_sharing to False solved the issue!
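To illustrate what the sharing does (with made-up vocabulary sizes, not the real ones): once the two embeddings share one weight matrix, the encoder is limited to the decoder's vocabulary size.

import torch
import torch.nn as nn

src_vocab, trg_vocab, d_model = 15, 12, 512   # hypothetical sizes for illustration

enc_emb = nn.Embedding(src_vocab, d_model, padding_idx=1)
dec_emb = nn.Embedding(trg_vocab, d_model, padding_idx=1)

# This is effectively what emb_src_trg_weight_sharing does: both modules
# now use the decoder's (12, 512) weight matrix.
enc_emb.weight = dec_emb.weight

enc_emb(torch.tensor([[11]]))    # fine: 11 < 12
# enc_emb(torch.tensor([[14]]))  # IndexError, even though 14 < src_vocab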
Upvotes: 1