aerin

Reputation: 22634

tf.nn.embedding_lookup - row or column?

This is a very simple question. I'm learning TensorFlow and converting my NumPy code to TensorFlow.

I have a word embedding matrix U of shape [embedding_size, vocab_size], so each column is the embedding vector of a word.

I converted U to TensorFlow like below:

U = tf.Variable(tf.truncated_normal([embedding_size, vocab_size], -0.1, 0.1))

So far, so good.

Now I need to look up each word's embedding for training. I assume it would be

tf.nn.embedding_lookup(U, word_index)

My question: since each word's embedding is a column vector, in NumPy I would look it up as U[:, x[t]].

How does TF figure out whether it needs to return the row or the column for word_index?

What's the default, row or column? If it returns a row vector, do I need to transpose my embedding matrix?

https://www.tensorflow.org/api_docs/python/tf/nn/embedding_lookup doesn't mention this. If anyone could point me to the right resource, I'd appreciate it.

Upvotes: 1

Views: 949

Answers (2)

mrry

Reputation: 126154

If params is a single tensor, the tf.nn.embedding_lookup(params, ids) operation treats ids as the indices of rows in params. If params is a list of tensors or a partitioned variable, then ids still correspond to rows in those tensors, but the partition_strategy (either "div" or "mod") determines how the ids map to a particular row.
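
For example, here is a minimal sketch (TF 1.x style, as in the question) of the row-wise behavior with a single tensor; the constants and names are made up for illustration:

import tensorflow as tf

params = tf.constant([[1.0, 1.0],   # row 0: embedding of word 0
                      [2.0, 2.0],   # row 1: embedding of word 1
                      [3.0, 3.0]])  # row 2: embedding of word 2
ids = tf.constant([2, 0])

# Equivalent to tf.gather(params, ids): selects whole rows of params.
looked_up = tf.nn.embedding_lookup(params, ids)

with tf.Session() as sess:
    print(sess.run(looked_up))  # [[3. 3.]
                                #  [1. 1.]] -- rows 2 and 0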

As Aaron suggests, it will probably be easiest to define your embedding U as having shape [vocab_size, embedding_size], so that you can use tf.nn.embedding_lookup() and related functions.

Alternatively, you can use the axis argument to tf.gather() to select columns from U:

embedding = tf.gather(U, word_index, axis=1)
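
With U of shape [embedding_size, vocab_size], the result of this gather keeps the column layout: a scalar word_index yields a vector of length embedding_size, and a vector of n indices yields a matrix of shape [embedding_size, n].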

Upvotes: 1

Aaron

Reputation: 2364

U should be vocab_size x embedding_size, the transpose of what you have now.
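
In terms of the question's code, that would look something like this sketch (same variable names and initializer arguments as in the question):

U = tf.Variable(tf.truncated_normal([vocab_size, embedding_size], -0.1, 0.1))

# Each word's embedding is now the row U[word_index, :], which is what
# tf.nn.embedding_lookup(U, word_index) returns.
word_embedding = tf.nn.embedding_lookup(U, word_index)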

Upvotes: 1
