Reputation: 1741
We spent a lot of time reading the API documentation for tf.nn.embedding_lookup_sparse. The meaning of embedding_lookup_sparse is confusing, and it seems quite different from embedding_lookup.
Here's what I think; please correct me if I'm wrong. The wide and deep model example uses the contrib.layers APIs and calls embedding_lookup_sparse for each sparse feature column. If it gets a SparseTensor (for example, country, which is sparse), it creates the embedding, which is actually used for one-hot encoding. It then calls to_weights_sum to return the result of embedding_lookup_sparse as the prediction and the embedding as the variable.
The result of embedding_lookup_sparse then has a bias added and becomes the logits for the loss function and the training op. That means embedding_lookup_sparse does something like w * x (the w * x part of y = w * x + b) for a dense tensor.
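For concreteness, here is a minimal sketch of what I mean, written against tf.nn.embedding_lookup_sparse in TF 2.x eager style (the names num_countries, country_ids and all the numbers are my own, not from the wide and deep code):

    import tensorflow as tf  # TF 2.x, eager execution

    # Hypothetical: 10 distinct "country" values, one weight per value (w) plus a bias (b).
    num_countries = 10
    w = tf.Variable(tf.random.normal([num_countries, 1]))  # plays the role of w
    b = tf.Variable(tf.zeros([1]))                          # plays the role of b

    # A batch of 3 examples; each example has exactly one country id.
    country_ids = tf.sparse.SparseTensor(
        indices=[[0, 0], [1, 0], [2, 0]],                   # (example, position) pairs
        values=tf.constant([4, 7, 4], dtype=tf.int64),      # the country ids to look up
        dense_shape=[3, 1])

    # embedding_lookup_sparse picks the rows of w for each id and, with combiner="sum",
    # sums them per example; with one id per example that is just "the row of w".
    wx = tf.nn.embedding_lookup_sparse(w, country_ids, sp_weights=None, combiner="sum")
    logits = wx + b                                         # y = w * x + b for the one-hot feature
    print(logits.shape)                                     # (3, 1)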
Maybe for one-hot encoding or a SparseTensor, the weight returned from embedding_lookup_sparse is actually the value of w * x, because the looked-up value is always 1 and there is no need to add in the other 0s.
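If that is right, then looking up a single id should give exactly the same number as multiplying the one-hot vector by w. Here is a small check of that claim (again with my own hypothetical names, not the contrib.layers code):

    import tensorflow as tf  # TF 2.x, eager execution

    num_countries = 5
    w = tf.random.normal([num_countries, 1])

    # Dense version: one example whose country id is 3, as a one-hot row vector.
    x = tf.one_hot([3], depth=num_countries)          # shape (1, 5)
    dense_result = tf.matmul(x, w)                    # "w * x" done the dense way

    # Sparse version: just look up row 3 of w.
    sp_ids = tf.sparse.SparseTensor(indices=[[0, 0]],
                                    values=tf.constant([3], dtype=tf.int64),
                                    dense_shape=[1, 1])
    sparse_result = tf.nn.embedding_lookup_sparse(w, sp_ids, sp_weights=None,
                                                  combiner="sum")

    # Both are the same row of w, because the looked-up x value is always 1
    # and the zero entries of the one-hot vector contribute nothing.
    tf.debugging.assert_near(dense_result, sparse_result)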
What I said may itself be confusing. Can anyone help explain this in detail?
Upvotes: 3
Views: 2192
Reputation: 5162
The main difference between embedding_lookup and embedding_lookup_sparse is that the sparse version expects the ids and weights to be of type SparseTensor.
How embedding_lookup_sparse works:
You pass in an embedding tensor of some size, and embedding_lookup_sparse will multiply the slices of that tensor (the slices referenced by the sp_ids parameter) by some weight (also passed in, as sp_weights; defaults to 1 when omitted), and then the new slices are returned to you.
There is no bias term. You can add slices of the tensor together by referencing more than one of them to be combined into a single element of your output.
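For example, here is a minimal sketch (TF 2.x eager style; params, sp_ids and the numbers are made up) of rows 1 and 3 of params being weighted and summed into one output row:

    import tensorflow as tf  # TF 2.x, eager execution

    # An "embedding matrix" (params) with 4 rows of dimension 3.
    params = tf.constant([[1., 1., 1.],
                          [2., 2., 2.],
                          [3., 3., 3.],
                          [4., 4., 4.]])

    # One output row that references rows 1 and 3 of params...
    sp_ids = tf.sparse.SparseTensor(indices=[[0, 0], [0, 1]],
                                    values=tf.constant([1, 3], dtype=tf.int64),
                                    dense_shape=[1, 2])
    # ...each multiplied by its own weight (pass sp_weights=None to use 1 for all of them).
    sp_weights = tf.sparse.SparseTensor(indices=[[0, 0], [0, 1]],
                                        values=[0.5, 2.0],
                                        dense_shape=[1, 2])

    out = tf.nn.embedding_lookup_sparse(params, sp_ids, sp_weights, combiner="sum")
    print(out)   # 0.5 * [2, 2, 2] + 2.0 * [4, 4, 4] = [[9., 9., 9.]]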
Upvotes: 1