Reputation: 2807
I am interested to know whether long sentences are good for tensor2tensor model training, and why or why not?
Upvotes: 0
Views: 124
Reputation: 2670
Ideally, the training data should have the same distribution of sentence lengths as the target test data. In machine translation, for example, if the final model is expected to translate long sentences, similarly long sentences should also be included in the training data. The Transformer model does not seem to generalize to sentences longer than those seen during training, but limiting the maximum sentence length in training allows larger batch sizes, which is helpful (Popel and Bojar, 2018). A rough sketch of how such a length limit can be configured is shown below.
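For illustration, here is a minimal sketch of capping sentence length in tensor2tensor by registering a custom hparams set on top of `transformer_base`. The specific values (a 70-token cap and a 6000-token batch) are assumptions chosen for the example, not recommendations, and exact defaults may differ between tensor2tensor versions:

```python
# Sketch: register a custom hparams set that limits training sentence length.
# Assumes the standard tensor2tensor registry API; values are illustrative.
from tensor2tensor.models import transformer
from tensor2tensor.utils import registry


@registry.register_hparams
def transformer_base_maxlen70():
    hparams = transformer.transformer_base()
    # Skip training examples longer than 70 subword tokens.
    hparams.max_length = 70
    # Token-based batch size: with shorter sentences, more of them fit per batch.
    hparams.batch_size = 6000
    return hparams
```

This hparams set can then be selected when launching training (e.g. via the `--hparams_set` flag of `t2t-trainer`), or the same two values can be overridden directly on an existing set with the `--hparams` flag.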
Upvotes: 1