Running NLTK sentence_bleu in Pandas

Question

I am trying to apply sentence_bleu to a column in Pandas to rate the quality of some machine translation, but the scores it outputs are incorrect. Can anyone see my error?

import pandas as pd
from nltk.translate.bleu_score import sentence_bleu

translations = {
    'reference': [['this', 'is', 'a', 'test'],['this', 'is', 'a', 'test'],['this', 'is', 'a', 'test']],
    'candidate': [['this', 'is', 'a', 'test'],['this', 'is', 'not','a', 'quiz'],['I', 'like', 'kitties', '.']]
}
df = pd.DataFrame(translations)

df['BLEU'] = df.apply(lambda row: sentence_bleu(row['reference'],row['candidate']), axis=1)
df

It outputs this:

Index  reference            candidate                  BLEU
0      [this, is, a, test]  [this, is, a, test]        1.288230e-231
1      [this, is, a, test]  [this, is, not, a, quiz]   1.218332e-231
2      [this, is, a, test]  [I, like, kitties, .]      0.000000e+00

Row 0 should be equal to 1.0, and row 1 should be less than 1.0, probably around 0.9. What am I doing wrong?

Answers (1)
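
A likely cause is the shape of the first argument. sentence_bleu expects a list of reference sentences, each of which is itself a list of tokens. When a single token list is passed, each token string is treated as a separate reference and split into characters, which would explain the near-zero scores. Below is a minimal sketch of a fix that wraps the reference in an extra list. The smoothing function is an addition that is not in the question: with sentences this short, row 1 has no matching 3-grams or 4-grams, so unsmoothed BLEU would still come out close to zero rather than around 0.9.

```python
import pandas as pd
from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu

translations = {
    'reference': [['this', 'is', 'a', 'test'], ['this', 'is', 'a', 'test'], ['this', 'is', 'a', 'test']],
    'candidate': [['this', 'is', 'a', 'test'], ['this', 'is', 'not', 'a', 'quiz'], ['I', 'like', 'kitties', '.']]
}
df = pd.DataFrame(translations)

# Assumption: method1 smoothing is chosen only for illustration, so that
# rows with zero higher-order n-gram matches do not collapse to ~0.
smooth = SmoothingFunction().method1

# Wrap the single reference in a list: sentence_bleu takes a list of
# tokenized references as its first argument.
df['BLEU'] = df.apply(
    lambda row: sentence_bleu(
        [row['reference']], row['candidate'], smoothing_function=smooth
    ),
    axis=1,
)
print(df)
```

With this change row 0 scores 1.0, row 2 scores 0 because it shares no tokens with the reference, and row 1 falls somewhere in between. If several reference translations exist per candidate, the column can hold a list of token lists and be passed without the extra wrapping.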