Reputation: 574
I need to run an encoder-decoder model in TensorFlow. I see that using the available APIs, e.g. basic_rnn_seq2seq(encoder_input_data, decoder_input_data, lstm_cell),
an encoder-decoder system can be created.
encoder_input_data
is a list of 2D Tensors of shape batch_size x
input_size. How can each word be represented by its respective word embedding in this setup? Even embedding_rnn_seq2seq
extracts the embeddings internally. How can pre-calculated word embeddings be given as input? Upvotes: 0
Views: 830
Reputation: 1889
First question: Probably not the best way, but what I did was, after building the model and before training starts:
for v in tf.trainable_variables():
    if v.name == 'embedding_rnn_seq2seq/RNN/EmbeddingWrapper/embedding:0':
        # Overwrite the randomly initialized embedding variable
        # with the precomputed embedding matrix.
        assign_op = v.assign(my_word2vec_matrix)
        session.run(assign_op)  # or `assign_op.op.run()`
my_word2vec_matrix is a matrix of shape vocabulary_size x embedding_size, filled with my precomputed embedding vectors. Use this (or something similar) only if you believe your embeddings are really good; otherwise the seq2seq model will, over time, train its own embedding.
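As a minimal sketch of how such a matrix could be assembled from pretrained vectors (the helper name `build_embedding_matrix` and the toy vocabulary are my own illustration, not part of the answer): words found in the pretrained lookup get their vectors, the rest get small random initializations that the model can then train.

```python
import numpy as np

def build_embedding_matrix(vocab, pretrained, embedding_size, seed=0):
    """Assemble a vocabulary_size x embedding_size matrix.
    Words missing from `pretrained` get small random vectors.
    (Hypothetical helper -- names are illustrative.)"""
    rng = np.random.RandomState(seed)
    matrix = rng.uniform(-0.1, 0.1, size=(len(vocab), embedding_size))
    for i, word in enumerate(vocab):
        if word in pretrained:
            matrix[i] = pretrained[word]
    return matrix.astype(np.float32)

# Toy usage: a 4-word vocabulary with pretrained vectors for two words.
vocab = ["_GO", "_EOS", "hello", "world"]
pretrained = {"hello": np.ones(3), "world": -np.ones(3)}
my_word2vec_matrix = build_embedding_matrix(vocab, pretrained, embedding_size=3)
print(my_word2vec_matrix.shape)  # (4, 3)
```

The resulting array is what would be passed to `v.assign(...)` above; row order must match the integer ids the model uses for each word.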
Second question: In seq2seq.py there is a call to model_with_buckets(), which you can find in python/ops/seq2seq.py; the loss is returned from there.
Third question: In the test case, each decoder input is the decoder output from the previous timestep (i.e. the first decoder input is a special GO symbol, the second decoder input is the decoder output of the first timestep, the third decoder input is the decoder output of the second timestep, and so on).
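This feed-previous loop can be sketched in plain Python (a toy illustration, not TensorFlow API; `step_fn` stands in for one decoder timestep and is my own assumption):

```python
def greedy_decode(step_fn, go_symbol, max_steps):
    """Feed each decoder output back in as the next decoder input,
    starting from the GO symbol."""
    inputs, outputs = [go_symbol], []
    token = go_symbol
    for _ in range(max_steps):
        token = step_fn(token)   # decoder output at this timestep...
        outputs.append(token)
        inputs.append(token)     # ...becomes the next decoder input
    return inputs[:-1], outputs

# Toy "decoder" that just increments the token id.
inputs, outputs = greedy_decode(lambda t: t + 1, go_symbol=0, max_steps=3)
print(inputs)   # [0, 1, 2]
print(outputs)  # [1, 2, 3]
```

Note how the input sequence is the output sequence shifted right by one, with the GO symbol prepended, which is exactly the relationship described above.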
Upvotes: 3