Shiro

Reputation: 805

Tensorflow Embedding for training and inference

I am trying to code a simple neural machine translation model using TensorFlow, but I am a little stuck on understanding embeddings in TensorFlow. On the one hand there is tf.contrib.layers.embed_sequence, and on the other hand there is:

    dec_embeddings = tf.Variable(tf.random_uniform([target_vocab_size, decoding_embedding_size]))
    dec_embed_input = tf.nn.embedding_lookup(dec_embeddings, dec_input)

In which case should I use one rather than the other?

Upvotes: 1

Views: 369

Answers (2)

Ghazi Felhi

Reputation: 61

I suppose that you're coming from this seq2seq tutorial. Even though this question is starting to get old, I'll try to answer for people passing by like me:

  • For the first question, I looked at the source code behind tf.contrib.layers.embed_sequence, and it is actually using tf.nn.embedding_lookup. So it just wraps it and creates the embedding matrix (tf.Variable(tf.random_uniform([target_vocab_size, decoding_embedding_size]))) for you. Although this is convenient and less verbose, there doesn't seem to be a direct way to access the embeddings when you use embed_sequence. So if you want to, you have to query the internal variable used as the embedding matrix by reusing the same variable scope (see the first sketch after this list). I have to admit that the code in the tutorial above is confusing. I even suspect it uses different embeddings in the encoder and the decoder.
  • For the second question:
      • I guess using a sequence length or an embedding is equivalent.
      • The TrainingHelper doesn't need the embedding_lookup, as it only forwards the inputs to the decoder; GreedyEmbeddingHelper, on the other hand, takes the embedding matrix as its first argument, as mentioned in the documentation (see the second sketch after this list).
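
To make the first point concrete, here is a minimal TF 1.x sketch of both forms. The scope name 'embed' is made up; the variable name 'embeddings' is what the contrib implementation appears to use internally:

    import tensorflow as tf

    vocab_size, embed_dim = 1000, 64              # hypothetical sizes
    ids = tf.placeholder(tf.int32, [None, None])  # [batch, time]

    # Form 1: embed_sequence creates the embedding matrix internally.
    embedded = tf.contrib.layers.embed_sequence(
        ids, vocab_size=vocab_size, embed_dim=embed_dim, scope='embed')

    # To reach that matrix you have to reuse its variable scope; the
    # contrib implementation names the variable 'embeddings'.
    with tf.variable_scope('embed', reuse=True):
        embeddings = tf.get_variable('embeddings')  # [vocab_size, embed_dim]

    # Form 2: the explicit version from the question, equivalent in effect.
    dec_embeddings = tf.Variable(tf.random_uniform([vocab_size, embed_dim]))
    dec_embed_input = tf.nn.embedding_lookup(dec_embeddings, ids)

And for the helper point, a sketch of what each helper expects, continuing from the code above (dec_seq_len, batch_size, go_id, and eos_id are hypothetical stand-ins for whatever your model defines):

    dec_seq_len = tf.placeholder(tf.int32, [None])  # per-example lengths
    batch_size, go_id, eos_id = 32, 1, 2            # hypothetical token ids

    # Training: inputs are already embedded, so no lookup is needed.
    train_helper = tf.contrib.seq2seq.TrainingHelper(
        inputs=dec_embed_input,          # [batch, time, embed_dim]
        sequence_length=dec_seq_len)

    # Inference: the helper embeds each generated token itself, so it
    # takes the embedding matrix (or a callable) as its first argument.
    infer_helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
        embedding=dec_embeddings,
        start_tokens=tf.fill([batch_size], go_id),
        end_token=eos_id)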

Upvotes: 2

gnetmil

Reputation: 86

If I understand you correctly, the first question is about the differences between tf.contrib.layers.embed_sequence and tf.nn.embedding_lookup.

According to the official docs (https://www.tensorflow.org/api_docs/python/tf/contrib/layers/embed_sequence),

Typical use case would be reusing embeddings between an encoder and decoder.

I think tf.contrib.layers.embed_sequence is designed for seq2seq models.
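
That use case looks roughly like this; a minimal sketch assuming a shared source/target vocabulary (the scope name 'shared_embed' is made up):

    import tensorflow as tf

    vocab_size, embed_dim = 1000, 64  # hypothetical sizes
    enc_ids = tf.placeholder(tf.int32, [None, None])
    dec_ids = tf.placeholder(tf.int32, [None, None])

    # The encoder call creates the embedding matrix under the shared scope...
    enc_embed = tf.contrib.layers.embed_sequence(
        enc_ids, vocab_size=vocab_size, embed_dim=embed_dim,
        scope='shared_embed')

    # ...and the decoder call reuses that same matrix instead of
    # creating a second one.
    dec_embed = tf.contrib.layers.embed_sequence(
        dec_ids, vocab_size=vocab_size, embed_dim=embed_dim,
        scope='shared_embed', reuse=True)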

I found a post where @ispirmustafa mentioned:

embedding_lookup doesn't support invalid ids.

Also, in another post, "tf.contrib.layers.embed_sequence() is for what?", @user1930402 said:

  1. When building a neural network model that has multiple gates taking features as input, using tensorflow.contrib.layers.embed_sequence lets you reduce the number of parameters in your network while preserving depth. For example, it eliminates the need for each gate of the LSTM to perform its own linear projection of the features.
  2. It allows for arbitrary input shapes, which keeps the implementation simple and flexible (see the quick shape check after this list).
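
For the second point, a quick runnable check of the shape behavior (the vocabulary size and embedding dimension are made up):

    import tensorflow as tf

    # Integer ids of shape [batch, time] go in; float vectors of shape
    # [batch, time, embed_dim] come out.
    ids = tf.constant([[1, 2, 3], [4, 5, 6]])  # [2, 3]
    embedded = tf.contrib.layers.embed_sequence(ids, vocab_size=10, embed_dim=4)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(embedded).shape)  # (2, 3, 4)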

As for the second question, sorry, I haven't used TrainingHelper, so I can't answer that part.

Upvotes: 1
