Reputation: 805
I am trying to code a simple neural machine translation model using TensorFlow, but I am a little stuck on understanding embeddings in TensorFlow:
tf.contrib.layers.embed_sequence(inputs, vocab_size=target_vocab_size, embed_dim=decoding_embedding_size)
and
dec_embeddings = tf.Variable(tf.random_uniform([target_vocab_size, decoding_embedding_size]))
dec_embed_input = tf.nn.embedding_lookup(dec_embeddings, dec_input)
In which case should I use one over the other?
Upvotes: 1
Views: 369
Reputation: 61
I suppose that you're coming from this seq2seq tutorial. Even though this question is starting to get old, I'll try to answer for people passing by like me:
tf.contrib.layers.embed_sequence is actually a wrapper around tf.nn.embedding_lookup: it just wraps it, and creates the embedding matrix (tf.Variable(tf.random_uniform([target_vocab_size, decoding_embedding_size]))) for you. Although this is convenient and less verbose, by using embed_sequence there doesn't seem to be a direct way to access the embeddings. So if you want them, you have to query the internal variable used as the embedding matrix, using the same name scope. I have to admit that the code in the tutorial above is confusing; I even suspect it uses different embeddings in the encoder and the decoder.
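To make that concrete, here is a minimal sketch of both forms, plus one way to reach the matrix that embed_sequence created. It assumes TF 1.x and the default 'EmbedSequence' variable scope that embed_sequence uses internally; if you passed your own scope, query that name instead:

import tensorflow as tf

# Form 1: embed_sequence builds the embedding matrix internally
# and returns the looked-up embeddings directly.
enc_embed_input = tf.contrib.layers.embed_sequence(
    inputs, vocab_size=target_vocab_size, embed_dim=decoding_embedding_size)

# Form 2: build the matrix yourself, then look it up explicitly.
dec_embeddings = tf.Variable(
    tf.random_uniform([target_vocab_size, decoding_embedding_size]))
dec_embed_input = tf.nn.embedding_lookup(dec_embeddings, dec_input)

# Querying the variable that embed_sequence created (assumes the default
# scope name 'EmbedSequence' and internal variable name 'embeddings'):
with tf.variable_scope('EmbedSequence', reuse=True):
    internal_embeddings = tf.get_variable('embeddings')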
TrainingHelper doesn't need the embedding lookup, as it only forwards the already-embedded inputs to the decoder; GreedyEmbeddingHelper, on the other hand, does take the embedding matrix (or a callable that performs the lookup) as its first input, as mentioned in the documentation.
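As a rough sketch of the two helpers (TF 1.x contrib API; batch_size, target_lengths, start_token_id, and end_token_id stand in for whatever your graph defines):

# Training: the helper only steps through inputs that are already embedded.
train_helper = tf.contrib.seq2seq.TrainingHelper(
    inputs=dec_embed_input,        # [batch, time, embed_dim]
    sequence_length=target_lengths)

# Inference: the helper must embed its own predictions at every step,
# so it takes the embedding matrix (or a callable) as its first argument.
infer_helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
    embedding=dec_embeddings,      # [vocab_size, embed_dim]
    start_tokens=tf.fill([batch_size], start_token_id),
    end_token=end_token_id)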
Upvotes: 2
Reputation: 86
If I understand you correctly, the first question is about the difference between tf.contrib.layers.embed_sequence and tf.nn.embedding_lookup.
According to the official docs (https://www.tensorflow.org/api_docs/python/tf/contrib/layers/embed_sequence):
Typical use case would be reusing embeddings between an encoder and decoder.
I think tf.contrib.layers.embed_sequence is designed for seq2seq models.
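For instance, a minimal sketch of the "reusing embeddings between an encoder and decoder" use case from the docs (TF 1.x; source_ids, target_ids, vocab_size, embed_dim, and the scope name are hypothetical):

# Encoder and decoder share one embedding matrix by reusing the same scope.
enc_embed = tf.contrib.layers.embed_sequence(
    source_ids, vocab_size=vocab_size, embed_dim=embed_dim,
    scope='shared_embed')
dec_embed = tf.contrib.layers.embed_sequence(
    target_ids, vocab_size=vocab_size, embed_dim=embed_dim,
    scope='shared_embed', reuse=True)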
I found the following post, where @ispirmustafa mentioned:
embedding_lookup doesn't support invalid ids.
Also, in another post, tf.contrib.layers.embed_sequence() is for what?, @user1930402 said:
- When building a neural network model that has multiple gates taking features as input, using tensorflow.contrib.layers.embed_sequence lets you reduce the number of parameters in your network while preserving depth. For example, it eliminates the need for each gate of the LSTM to perform its own linear projection of the features.
- It allows for arbitrary input shapes, which keeps the implementation simple and flexible.
For the second question: sorry, I haven't used TrainingHelper, so I can't answer that part.
Upvotes: 1