Reputation: 61
I use HuggingFace's Transformers library to build a sequence-to-sequence model based on BART and T5. I have carefully read the documentation and the research papers, but I can't find what the input to the decoder (decoder_input_ids) should be for sequence-to-sequence tasks.
Should the decoder input for both models (BART and T5) be the same as lm_labels (the output of the LM head), or should it be the same as input_ids (the input to the encoder)?
Upvotes: 6
Views: 8771
Reputation: 61
The decoder_input_ids argument is optional and corresponds to labels; passing labels is the preferred way to provide it. See https://huggingface.co/transformers/glossary.html#decoder-input-ids
This is because, internally, if decoder_input_ids is None it is derived by shifting labels one position to the right, so you don't have to do the shifting yourself.
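As a minimal sketch of what this looks like in practice (using T5 here, but the same pattern applies to BART; the t5-small checkpoint and the example sentences are just placeholders):

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Encoder input: the source sequence.
input_ids = tokenizer(
    "translate English to German: The house is wonderful.",
    return_tensors="pt",
).input_ids

# Target sequence is passed as labels. decoder_input_ids is omitted,
# so the model builds it internally by shifting labels to the right.
labels = tokenizer("Das Haus ist wunderbar.", return_tensors="pt").input_ids

outputs = model(input_ids=input_ids, labels=labels)
print(outputs.loss)  # training loss computed against labels
```

So the decoder input is neither the encoder's input_ids nor literally the labels: it is the target sequence shifted right (with the appropriate start token prepended), which the model handles for you when you supply labels.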
Upvotes: 1