Reputation: 48
I would like to implement an embedding table with float inputs instead of int32 or int64. The reason is that instead of words, as in a simple RNN, I would like to use percentages. For example, in the case of a recipe, I may have 1000 or 3000 ingredients in total, but any single recipe contains at most 80 of them. The ingredients would be represented as percentages, for example: ingredient1=0.2, ingredient2=0.8, etc.
My problem is that TensorFlow forces me to use integers for my embedding table:
TypeError: Value passed to parameter 'indices' has DataType float32 not in list of allowed values: int32, int64
Any suggestions? I appreciate your feedback.
Example of the embedding lookup:
inputs = tf.placeholder(tf.float32, shape=[None, ninp], name="x")
n_vocab = len(int_to_vocab)
n_embedding = 200  # Number of embedding features
with train_graph.as_default():
    embedding = tf.Variable(tf.random_uniform((n_vocab, n_embedding), -1, 1))
    embed = tf.nn.embedding_lookup(embedding, inputs)
The error is caused by the tf.float32 dtype in this line:
inputs = tf.placeholder(**tf.float32,** shape=[None, ninp], name="x")
I have thought of an algorithm that could work using loops, but I was wondering if there is a more direct solution.
Thanks!
Upvotes: 2
Views: 1584
Reputation: 53758
tf.nn.embedding_lookup can't accept float input, because the point of this function is to select the embeddings at the specified rows.
Example:
Suppose there are 5 words, each with a 3-dimensional embedding vector, and the lookup returns the 3rd row (with 0-indexing). That is equivalent to this line in TensorFlow:
embed = tf.nn.embedding_lookup(embed_matrix, [3])
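For concreteness, here is a minimal runnable sketch (the matrix values are made up purely for illustration) showing that the lookup simply picks out the requested row:

import tensorflow as tf

# A made-up 5 x 3 embedding matrix: 5 words, 3 features each.
embed_matrix = tf.constant([[0.1, 0.2, 0.3],
                            [0.4, 0.5, 0.6],
                            [0.7, 0.8, 0.9],
                            [1.0, 1.1, 1.2],
                            [1.3, 1.4, 1.5]])

# Select row 3 (0-indexed), i.e. the embedding of the 4th word.
embed = tf.nn.embedding_lookup(embed_matrix, [3])

with tf.Session() as sess:
    print(sess.run(embed))   # [[1.0 1.1 1.2]]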
You can't possibly look up a floating point index, such as 0.2 or 0.8, because there are no 0.2 and 0.8 row indices in the matrix. I highly recommend this post by Chris McCormick about word2vec.
What you describe sounds more like a softmax loss function, which outputs a probability distribution over the target classes.
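If that is the direction you take, a rough sketch (the names and shapes here are assumptions for illustration, not your actual code) could feed the per-recipe percentages in as soft targets for a softmax cross-entropy loss:

import tensorflow as tf

n_vocab = 3000   # assumed total number of ingredients

# Hypothetical logits produced by your network: one score per ingredient.
logits = tf.placeholder(tf.float32, shape=[None, n_vocab], name="logits")

# The recipe percentages act as a soft target distribution
# (they should sum to 1 for each recipe).
targets = tf.placeholder(tf.float32, shape=[None, n_vocab], name="targets")

# softmax_cross_entropy_with_logits accepts such "soft" labels.
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=targets, logits=logits))

tf.nn.softmax(logits) would then give the predicted distribution over the ingredients.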
Upvotes: 2