Oscar
Oscar

Reputation: 344

Pytorch build seq2seq MT model ,But how to get the translation results from the output tensor?

I am try to implement my own MT engine, i am following the steps in https://github.com/bentrevett/pytorch-seq2seq/blob/master/1%20-%20Sequence%20to%20Sequence%20Learning%20with%20Neural%20Networks.ipynb

SRC = Field(tokenize=tokenize_en,
            init_token='<sos>',
            eos_token='<eos>',
            lower=True)

TRG = Field(tokenize=tokenize_de,
            init_token='<sos>',
            eos_token='<eos>',
            lower=True)

After training the model,the link only share a way to batch evaluate but i want to try single string and get the translation results. for example i want my model to translate the input "Boys" and get the German translations.

savedfilemodelpath='./pretrained_model/2020-09-27en-de.pth'
model.load_state_dict(torch.load(savedfilemodelpath))
model.eval()
inputstring = 'Boys'
processed=SRC.process([SRC.preprocess(inputstring)]).to(device)
output=model(processed,processed)
output_dim = output.shape[-1]
outputs = output[1:].view(-1, output_dim)
for item in outputs:
    print('item shape is {} and item.argmax is {}, and words is {}'.format(item.shape,item.argmax(),TRG.vocab.itos[item.argmax()]))

So my question is that it it right to get the translation results by:
First: convert the string to tensor

inputstring = 'Boys'
processed=SRC.process([SRC.preprocess(inputstring)]).to(device)

Second: send the tensor to the model. As the model have a TRG param.I have to give the tensor,am i able not given the TRG tensor?

output=model(processed,processed)
output_dim = output.shape[-1]
outputs = output[1:].view(-1, output_dim)

Third:through the return tensor, i use the argmax to get the translation results? is it right?

Or how can i get the right translation results?

for item in outputs:
        print('item shape is {} and item.argmax is {}, and words is {}'.format(item.shape,item.argmax(),TRG.vocab.itos[item.argmax()+1]))

Upvotes: 0

Views: 158

Answers (1)

Oscar
Oscar

Reputation: 344

i got the answer from the translate_sentence.really thanks @Aladdin Persson

def translate_sentence(model, sentence, SRC, TRG, device, max_length=50):
    # print(sentence)

    # sys.exit()
    # Create tokens using spacy and everything in lower case (which is what our vocab is)
    if type(sentence) == str:
        tokens = [token.text.lower() for token in spacy_en(sentence)]
    else:
        tokens = [token.lower() for token in sentence]

    # print(tokens)

    # sys.exit()
    # Add <SOS> and <EOS> in beginning and end respectively
    tokens.insert(0, SRC.init_token)
    tokens.append(SRC.eos_token)

    # Go through each english token and convert to an index
    text_to_indices = [SRC.vocab.stoi[token] for token in tokens]

    # Convert to Tensor
    sentence_tensor = torch.LongTensor(text_to_indices).unsqueeze(1).to(device)

    # Build encoder hidden, cell state
    with torch.no_grad():
        hidden, cell = model.encoder(sentence_tensor)

    outputs = [TRG.vocab.stoi["<sos>"]]

    for _ in range(max_length):
        previous_word = torch.LongTensor([outputs[-1]]).to(device)

        with torch.no_grad():
            output, hidden, cell = model.decoder(previous_word, hidden, cell)
            best_guess = output.argmax(1).item()

        outputs.append(best_guess)

        # Model predicts it's the end of the sentence
        if output.argmax(1).item() == TRG.vocab.stoi["<eos>"]:
            break

    translated_sentence = [TRG.vocab.itos[idx] for idx in outputs]

    # remove start token
    return translated_sentence[1:]

And the translation is not generated once. Acutualy it generate once a time and use several times.

Upvotes: 0

Related Questions