Reputation: 53
I have successfully downloaded the 1B word language model trained using a CNN-LSTM (https://github.com/tensorflow/models/tree/master/research/lm_1b), and I would like to be able to input sentences or partial sentences to get the probability of each subsequent word in the sentence.
For example, given a partial sentence like "An animal that says ", I'd like to know the probability of the next word being "woof" vs. "meow".
I understand that running the following produces the LSTM embeddings:
bazel-bin/lm_1b/lm_1b_eval --mode dump_lstm_emb \
--pbtxt data/graph-2016-09-10.pbtxt \
--vocab_file data/vocab-2016-09-10.txt \
--ckpt 'data/ckpt-*' \
--sentence "An animal that says woof" \
--save_dir output
That produces files named lstm_emb_step_*.npy, where each file holds the LSTM embedding for one word in the sentence. How can I turn these into probabilities under the trained model, so that I can compare P(woof | An animal that says) vs. P(meow | An animal that says)?
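For reference, the dumped steps load directly with plain NumPy, e.g. (a minimal check; the file names follow the pattern above and the directory matches --save_dir):
import glob
import numpy as np

# One lstm_emb_step_*.npy file per word in the sentence.
for path in sorted(glob.glob("output/lstm_emb_step_*.npy")):
    emb = np.load(path)
    print(path, emb.shape)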
Thanks in advance.
Upvotes: 4
Views: 499
Reputation: 4907
I wanted to do the same thing and this is what I came up with, adapted from some of their demo code. I'm not entirely sure this is correct but it seems to produce reasonable values.
import numpy as np

# BATCH_SIZE and NUM_TIMESTEPS follow the repo's lm_1b_eval.py (both are 1):
# the model is fed one word per step and keeps its LSTM state in variables.
BATCH_SIZE = 1
NUM_TIMESTEPS = 1

def get_probability_of_next_word(sess, t, vocab, prefix_words, query):
    """
    Return the probability of the given word based on the sequence of prefix
    words.

    :param sess: TensorFlow session object
    :param t: dict of named graph tensors/ops, as returned by LoadModel
    :param vocab: Vocabulary model, maps id <-> string, stores the max word char-id length
    :param list prefix_words: List of words that appear before this one.
    :param str query: The query word
    """
    targets = np.zeros([BATCH_SIZE, NUM_TIMESTEPS], np.int32)
    weights = np.ones([BATCH_SIZE, NUM_TIMESTEPS], np.float32)

    if not prefix_words or prefix_words[0] != "<S>":
        prefix_words.insert(0, "<S>")

    prefix = [vocab.word_to_id(w) for w in prefix_words]
    prefix_char_ids = [vocab.word_to_char_ids(w) for w in prefix_words]
    inputs = np.zeros([BATCH_SIZE, NUM_TIMESTEPS], np.int32)
    char_ids_inputs = np.zeros(
        [BATCH_SIZE, NUM_TIMESTEPS, vocab.max_word_length], np.int32)

    # Reset the recurrent state (as the repo's demo code does), then feed the
    # prefix one word per step; the LSTM state persists between sess.run calls,
    # so the softmax after the last prefix word is the next-word distribution.
    sess.run(t['states_init'])
    for i in range(len(prefix)):
        inputs[0, 0] = prefix[i]
        char_ids_inputs[0, 0, :] = prefix_char_ids[i]
        softmax = sess.run(t['softmax_out'],
                           feed_dict={t['char_inputs_in']: char_ids_inputs,
                                      t['inputs_in']: inputs,
                                      t['targets_in']: targets,
                                      t['target_weights_in']: weights})

    return softmax[0, vocab.word_to_id(query)]
Example usage:
vocab = CharsVocabulary(vocab_path, MAX_WORD_LEN)
sess, t = LoadModel(model_path, ckptdir + "/ckpt-*")
result = get_probability_of_next_word(sess, t, vocab, ["Hello", "my", "friend"], "for")
gives a result of 8.811023e-05. Note that CharsVocabulary and LoadModel are very slightly adapted from the ones in the repo.
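To answer the original question, call it once per candidate word with the question's prefix (a sketch; vocab, sess, and t are set up as above, and I believe words missing from the vocabulary fall back to the unknown token):
prefix = ["An", "animal", "that", "says"]
# Pass a copy each time, since the function inserts "<S>" into the list it is given.
p_woof = get_probability_of_next_word(sess, t, vocab, list(prefix), "woof")
p_meow = get_probability_of_next_word(sess, t, vocab, list(prefix), "meow")
print("P(woof | prefix) =", p_woof)
print("P(meow | prefix) =", p_meow)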
Also note that this function is very slow. Maybe someone knows how to improve it.
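One easy speed-up, since t['softmax_out'] already holds the full distribution over the vocabulary: feed the prefix once and score every candidate word from that single softmax instead of re-running the model per candidate. A sketch of such a variant (hypothetical name get_next_word_distribution; same feed pattern as above):
def get_next_word_distribution(sess, t, vocab, prefix_words):
    """Feed the prefix once and return the full next-word softmax row, so any
    number of candidate words can be scored without another forward pass."""
    targets = np.zeros([BATCH_SIZE, NUM_TIMESTEPS], np.int32)
    weights = np.ones([BATCH_SIZE, NUM_TIMESTEPS], np.float32)
    if not prefix_words or prefix_words[0] != "<S>":
        prefix_words = ["<S>"] + prefix_words
    inputs = np.zeros([BATCH_SIZE, NUM_TIMESTEPS], np.int32)
    char_ids_inputs = np.zeros(
        [BATCH_SIZE, NUM_TIMESTEPS, vocab.max_word_length], np.int32)
    sess.run(t['states_init'])
    for w in prefix_words:
        inputs[0, 0] = vocab.word_to_id(w)
        char_ids_inputs[0, 0, :] = vocab.word_to_char_ids(w)
        softmax = sess.run(t['softmax_out'],
                           feed_dict={t['char_inputs_in']: char_ids_inputs,
                                      t['inputs_in']: inputs,
                                      t['targets_in']: targets,
                                      t['target_weights_in']: weights})
    return softmax[0]

dist = get_next_word_distribution(sess, t, vocab, ["An", "animal", "that", "says"])
for word in ["woof", "meow", "bark"]:
    print(word, dist[vocab.word_to_id(word)])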
Upvotes: 0