karthikeyan

Reputation: 235

A Variation on Neural Machine Translation

I have been turning this thought over in my head for a long time now. In NMT, we pass the source-language text into the encoder of the seq2seq model and the target-language text into the decoder, and the system learns the conditional probability of each target word given the words before it, e.g. P(word x | previous n words). We train this with teacher forcing.
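For concreteness, here is roughly how I picture the teacher-forced training step (a minimal sketch assuming PyTorch; the Seq2Seq class, sizes, and variable names are made up for illustration, not taken from any particular NMT system):

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Toy encoder-decoder; the decoder is teacher-forced with the gold target prefix."""
    def __init__(self, src_vocab, tgt_vocab, d=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, d)
        self.tgt_emb = nn.Embedding(tgt_vocab, d)
        self.encoder = nn.GRU(d, d, batch_first=True)
        self.decoder = nn.GRU(d, d, batch_first=True)
        self.out = nn.Linear(d, tgt_vocab)

    def forward(self, src, tgt_in):
        _, h = self.encoder(self.src_emb(src))            # encode the source sentence
        dec_out, _ = self.decoder(self.tgt_emb(tgt_in), h)
        return self.out(dec_out)                          # logits for P(y_t | y_<t, source)

model = Seq2Seq(src_vocab=1000, tgt_vocab=1000)
src = torch.randint(0, 1000, (8, 12))        # batch of source sentences
tgt = torch.randint(0, 1000, (8, 10))        # batch of target sentences
tgt_in, tgt_out = tgt[:, :-1], tgt[:, 1:]    # teacher forcing: decoder sees the gold prefix

logits = model(src, tgt_in)
loss = nn.functional.cross_entropy(logits.reshape(-1, 1000), tgt_out.reshape(-1))
loss.backward()
```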

But what if I pass the input sentence in again as the decoder input instead of the target sentence? What would it learn in that case? I'm guessing it would learn to predict the most probable next word in the sentence given the previous text, right? What are your thoughts?
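In code, the variation I mean is just changing what is fed to the decoder (reusing the hypothetical sketch above, and assuming source and target share one vocabulary):

```python
# Feed the source prefix to the decoder instead of the target prefix, so the model
# is trained to predict src[:, 1:] from src[:, :-1] plus the encoded source.
src_in, src_out = src[:, :-1], src[:, 1:]
logits = model(src, src_in)
loss = nn.functional.cross_entropy(logits.reshape(-1, 1000), src_out.reshape(-1))
```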

Thanks in advance

Upvotes: 0

Views: 42

Answers (1)

Jindřich

Reputation: 11250

In that case, you would be learning a model that copies the input symbol to the output. It is trivial for the attention mechanism to learn the identity correspondence between the encoder and decoder states. Moreover, RNNs can easily implement a counter. It thus won't provide any realistic estimate of the probability; it will assign most of the probability mass to the corresponding word in the source sentence.
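A toy illustration of why attention makes this copy trivial (a sketch in PyTorch, not code from an actual NMT system): if the decoder state at step t lines up with the encoder state for source token t, dot-product attention puts essentially all of its weight on that position, so the model simply copies.

```python
import torch
import torch.nn.functional as F

d, src_len, t = 16, 5, 2
enc_states = torch.randn(src_len, d) * 3.0   # encoder states for 5 source tokens
dec_state = enc_states[t].clone()            # decoder state ~ identical to encoder state t

scores = enc_states @ dec_state              # dot-product attention scores
attn = F.softmax(scores, dim=0)
print(attn)  # nearly all attention mass lands on position t -> copy behaviour
```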

Upvotes: 1
