Reputation: 5
For the samplers implemented in tensorflow, e.g. tf.nn.fixed_unigram_candidate_sampler. The behavior is not well-defined in the document. For instance, I would expect the labels specified in true_classes will be excluded from the sampling pool, and the sampling will be conducted for each batch. But according to my experiments, neither of above is true.
Consider the following code:
import tensorflow as tf
labels_matrix = tf.reshape(tf.constant([1, 2, 3, 4], dtype=tf.int64), [-1, 1])
sampled_ids, _, _ = tf.nn.fixed_unigram_candidate_sampler(
true_classes = labels_matrix,
num_true = 1,
num_sampled = 1,
unique = True,
range_max = 5,
distortion = 0.0,
unigrams = range(5)
)
init = tf.initialize_all_variables()
with tf.Session() as sess:
sess.run(init)
print sess.run([sampled_ids])
The output can be 3, which actually belongs to the set of true classes. - Also, the output has the dimension [1], which basically means that the sampling is only conducted once, not for each batch.
Can someone help to clarify this?
Upvotes: 0
Views: 729
Reputation: 5206
The documentation for fixed_unigram_candidate_sampler does mention that true labels can be sampled. One of the things you've marked as _
in your code is in fact the expected ratio of sampled true labels.
Upvotes: 1