duynguyen236

Reputation: 45

How do we generate the first target words in machine translation?

I am learning about machine translation with Transformer models. To my knowledge, the model predicts the next word of the target sentence based on the source sentence and the previously generated target words. However, in the MarianMT model (or T5), I find that the tokenizer does not have a start-of-sentence token (`<cls>` or `<s>`). I think such a token is needed to start predicting the first word of the target sentence.

Can anyone explain to me how the MarianMT model will predict the first word in the target sentence?

Thank you.

Upvotes: 1

Views: 146

Answers (1)

Bram Vanroy

Reputation: 28505

From the documentation:

the model starts generating with pad_token_id (which has 0 as a token_embedding) as the prefix (Bart uses <s/>)

So it does not need an SOS token: the padding token serves as the decoder's start token during both training and generation.

Upvotes: 1

Related Questions