Extract word/sentence probabilities from lm_1b trained model

Question

I have successfully downloaded the 1B word language model trained using a CNN-LSTM (https://github.com/tensorflow/models/tree/master/research/lm_1b), and I would like to be able to input sentences or partial sentences to get the probability of each subsequent word in the sentence.

For example, if I have a sentence like, "An animal that says ", I'd like to know the probability of the next word being "woof" vs. "meow".

I understand that running the following produces the LSTM embeddings:

bazel-bin/lm_1b/lm_1b_eval --mode dump_lstm_emb \
                           --pbtxt data/graph-2016-09-10.pbtxt \
                           --vocab_file data/vocab-2016-09-10.txt \
                           --ckpt 'data/ckpt-*' \
                           --sentence "An animal that says woof" \                             
                           --save_dir output

That will produce files lstm_emb_step_*.npy where each file is the LSTM embedding for each word in the sentence. How can I transform these into probabilities over the trained model to be able to compare P(woof|An animal that says) vs. P(meow|An animal that says)?

Thanks in advance.

Extract word/sentence probabilities from lm_1b trained model

Answers (1)

Related Questions