Reputation: 11
I implemented the following tutorial in Keras:
In the intro the author says the setup is good for mapping input sequences of varying lengths to output sequences of varying lengths. I am confused because I do not see how to generate output sentences that are a different length than the input sentence.
Let's assume that the inputs are English sentences and the outputs are French sentences, as in the tutorial.
My current understanding is as follows:
The encoder input is the English sentence as a sequence of integers to be embedded. The decoder input is the French sentence as a sequence of integers delayed by one time step, with the first integer in the sequence representing a null (start) value. This layer is also embedded.
The target is the French sentence as a sequence of integers, not delayed. I seem to need to add an integer at the end to represent end-of-sequence, otherwise the length does not match up with the embedded decoder input and Keras throws an error.
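The shift-and-append arrangement described above can be sketched in plain Python. The token ids here (`START`, `EOS`) are hypothetical placeholders, not values from the tutorial:

```python
# Sketch of teacher-forcing data preparation for the decoder.
# START and EOS are hypothetical token ids, not values from any tutorial.
START, EOS = 0, 2

def make_decoder_arrays(target_ids):
    """Shift the target sequence right by one step for the decoder input
    and append EOS to the target, so both sequences have equal length."""
    decoder_input = [START] + target_ids   # delayed by one time step
    decoder_target = target_ids + [EOS]    # ends with end-of-sequence
    assert len(decoder_input) == len(decoder_target)
    return decoder_input, decoder_target

inp, tgt = make_decoder_arrays([5, 9, 4])
# inp = [0, 5, 9, 4], tgt = [5, 9, 4, 2]
```

This is why the extra integer at the end is needed: without it the target is one step shorter than the shifted decoder input.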
When making predictions, what exactly do you feed it? It doesn't seem possible to get an output of a different length than the input. Is that the case?
Upvotes: 0
Views: 1097
Reputation: 836
As far as I understand this paper https://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf, the idea is that your decoder predicts tokens (words) one at a time until it predicts a specific token (e.g. "EOS", which is the abbreviation of end of sequence). To my understanding, that is the reason why the output length is not fixed. Of course, your training data has to be prepared accordingly, with the target sequences annotated with the "EOS" tag.
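That stopping rule can be illustrated with a minimal greedy decoding loop. `predict_next` below is a toy stand-in for a trained decoder step (which in Keras would call the decoder model on the previous token and its state), so the loop structure is the point, not the prediction:

```python
# Sketch of inference: start from a start token and keep predicting
# until the model emits EOS or a length cap is reached.
# START, EOS, and predict_next are hypothetical stand-ins.
START, EOS, MAX_LEN = 0, 2, 20

def predict_next(prev_token, step):
    # Toy stand-in for a trained decoder: emits 5, 6, then EOS.
    return [5, 6, EOS][min(step, 2)]

def greedy_decode():
    output, token = [], START
    for step in range(MAX_LEN):
        token = predict_next(token, step)
        if token == EOS:
            break          # model decided the sentence is finished
        output.append(token)
    return output
```

Because the loop runs until EOS (with `MAX_LEN` as a safety cap), the output length is decoupled from the input length: the encoder's input only sets the initial state, and the decoder decides when to stop.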
Upvotes: 0