Reputation: 3106
I was able to train a language model using the TensorFlow tutorials; the models are saved as checkpoint files as per the code given here.
save_path = saver.save(sess, "/tmp/model.epoch.%03d.ckpt" % (i + 1))
Now I need to restore the checkpoint and use it in the following code:
def run_epoch(session, m, data, eval_op, verbose=False):
    """Runs the model on the given data."""
    epoch_size = ((len(data) // m.batch_size) - 1) // m.num_steps
    start_time = time.time()
    costs = 0.0
    iters = 0
    state = m.initial_state.eval()
    for step, (x, y) in enumerate(reader.ptb_iterator(data, m.batch_size,
                                                      m.num_steps)):
        cost, state, _ = session.run([m.cost, m.final_state, eval_op],
                                     {m.input_data: x,
                                      m.targets: y,
                                      m.initial_state: state})
        costs += cost
        iters += m.num_steps

        if verbose and step % (epoch_size // 10) == 10:
            print("%.3f perplexity: %.3f speed: %.0f wps" %
                  (step * 1.0 / epoch_size, np.exp(costs / iters),
                   iters * m.batch_size / (time.time() - start_time)))

    return np.exp(costs / iters)
I cannot find any way to encode a test sentence and get its probability from the trained checkpoint model.
The tutorial mentions the following code:
probabilities = tf.nn.softmax(logits)
but that is for training, and I cannot figure out how to get the actual probabilities. Ideally I should get something like:
>>getprob('this is a temp sentence')
>>0.322
Upvotes: 4
Views: 2882
Reputation: 79
There should be a start symbol (SOS, or something similar) and an end symbol (EOS, or something similar) in the vocabulary. You can get the index of the end symbol and then look up the corresponding probability value in proba.
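For example, assuming the PTB reader's word_to_id vocabulary mapping and a proba array as in the answers below (both names are assumptions, not fixed by the question), the lookup might look like this. In the PTB data the end symbol is "&lt;eos&gt;":

eos_id = word_to_id["<eos>"]  # index of the end symbol in the vocab
eos_prob = proba[0][eos_id]   # probability the model assigns to <eos> at this step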
Upvotes: 0
Reputation: 7130
You should first know how to calculate the score. Thanks to the Markov assumption we do not need to calculate too much (based on the chain rule): we only need the probabilities of the next few words (let's say one, for convenience). The key then becomes how to calculate the probability of the next word.
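For reference, the chain rule the score relies on is

P(w_1, \dots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1})

and a k-th order Markov assumption simply truncates each conditioning context to the last k words. The model's softmax output gives exactly these next-word probabilities: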
probab = session.run(myours.proba, feed_dict)  # only the input is needed
You should create a model named myours as described in @Romain's answer (mine is just a complement to it). Also create your own ptb_iterator that yields only x (first use raw_input or something similar to read in your words, for example in a loop):
for i in range(epoch_size):
    x = data[:, i*num_steps:(i+1)*num_steps]
    yield x  # the old version with y is better; use y to locate the probability of the coming word
Now that you have the probabilities you can do everything a language model can do, for example predict the next word:
list(probab[0]).index(max(probab[0])) # an id_to_word dict should be created
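A minimal sketch of that lookup, assuming an id_to_word dict inverted from the reader's word_to_id (both names taken from the tutorial code):

import numpy as np

id_to_word = {v: k for k, v in word_to_id.items()}  # invert the vocab mapping
next_id = int(np.argmax(probab[0]))  # same result as the line above
next_word = id_to_word[next_id]      # the predicted next word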
For an n-word sentence you will get n-1 scores (more precisely, n-1 probability distributions over the vocabulary; from each one you pick the entry at the index of the coming word).
I use this to calculate the score (not certain yet whether it is right or wrong; I ran into the same problem as this one):
pro += math.log(list(probab[0])[y[0]], 2)  # accumulate the base-2 log probability of the observed word
PS:
To save time, you can save the variables in the session the first time you train the network and restore them every time you want to run a test yourself.
save_path = saver.save(session, "./tmp/model.epoch.%03d.ckpt" % (i + 1))
saver.restore(session, "./tmp/model.epoch.013.ckpt")  # only the last one
A sentence-to-ids function is also needed:
def sentence_to_ids(sentence):
    return [word_to_id[word] if word in word_to_id else word_to_id["<unk>"]
            for word in nltk.tokenize.word_tokenize(sentence)]
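Putting the pieces together, a getprob like the one the question asks for might look like the sketch below. This is only an illustration under assumptions: myours is a PTBModel built with batch_size = num_steps = 1 and with the proba property from @Romain's answer, and sentence_to_ids is the function above.

import math

def getprob(session, myours, sentence):
    """Base-2 log probability of the sentence under the model."""
    ids = sentence_to_ids(sentence)
    state = session.run(myours.initial_state)
    logprob = 0.0
    for cur, nxt in zip(ids[:-1], ids[1:]):
        # Only the input is needed to fetch proba; targets are not fed.
        probab, state = session.run(
            [myours.proba, myours.final_state],
            {myours.input_data: [[cur]],
             myours.initial_state: state})
        logprob += math.log(probab[0][nxt], 2)  # score the actual next word
    return logprob

2 ** logprob then gives the raw sentence probability; it will be a very small number for any realistic sentence, so the log itself is usually the more readable score.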
Hope that this helps and is explanatory enough to answer your question.
Upvotes: 1
Reputation: 771
I had the same question, and I think I found a way around it, but I am not an expert, so comments are welcome!
In the PTBModel class, you need to add this line:
self._proba = tf.nn.softmax(logits)
before (or within) this block:
if not is_training:
    return
and also add this property:
@property
def proba(self):
    return self._proba
Now in the run_epoch function you can get the probabilities using something like:
cost, state, proba, _ = session.run([m.cost, m.final_state, m.proba, eval_op],...
From here you should have access to all the probabilities in proba. There may be a better way... hope this helps!
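For example, with proba fetched you can pull out the probabilities the model assigned to the actual target words (a sketch, assuming proba has shape [batch_size * num_steps, vocab_size] as produced by the tutorial's flattened logits, with rows ordered the same way as y.flatten()):

import numpy as np

# One row of proba per (batch, step) pair; y holds the target word ids.
target_probs = proba[np.arange(proba.shape[0]), y.flatten()]
log2_score = np.sum(np.log2(target_probs))  # base-2 log probability of the batch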
Upvotes: 4