Reputation: 648
I'm trying to impelement this article: http://ronan.collobert.com/pub/matos/2008_deep_icml.pdf Specfically the equation (3) from section 2.
Shortly I want to do a pairwise distance computation for the features of each mini-batch and insert this loss to the general network loss. I have only the Tesnor of the batch (16 samples), the labels tensor of the batch and the batch feature Tensor.
After looking for quite a while I still couldn't figure out the following:
1) How do I divide the batch for Positive (i.e. same label) and negative pairs. Since Tensor are not iterateble I can't figure out how to get which sample have which label and then divide my vector, or get which indices of the tensor belong to each class.
2) How can I do pairwise distance calculation for some of the indices in the batch tensor?
3) I also need to define a new distance function for negative examples
Overall, I need to get which indices belong to which class, do a positive pair-wise distace calculation for all positive pairs. And do another calculation for all negative pairs. Then sum it all up and add it to the network loss.
Any help (to one of more of the 3 issues) would be highly appreciated.
Upvotes: 14
Views: 18021
Reputation: 28208
I think the simplest way to do that is to sample the pairs offline (i.e. outside of the TensorFlow graph).
You create tf.placeholder
for a batch of pairs along with their labels (positive or negative, i.e. same class or different class), and then you can compute in TensorFlow the corresponding loss.
batch_size
pairs of inputs, and output the batch_size
left elements of the pairs of shape [batch_size, input_size]
. You also output the labels of the pairs (either positive of negative) of shape [batch_size,]
pairs_left = np.zeros((batch_size, input_size))
pairs_right = np.zeros((batch_size, input_size))
labels = np.zeros((batch_size, 1)) # ex: [[0.], [1.], [1.], [0.]] for batch_size=4
feed_dict
argument of sess.run()
pairs_left_node = tf.placeholder(tf.float32, [batch_size, input_size])
pairs_right_node = tf.placeholder(tf.float32, [batch_size, input_size])
labels_node = tf.placeholder(tf.float32, [batch_size, 1])
W = ... # shape [input_size, feature_size]
output_left = tf.matmul(pairs_left_node, W) # shape [batch_size, feature_size]
output_right = tf.matmul(pairs_right_node, W) # shape [batch_size, feature_size]
l2_loss_pairs = tf.reduce_sum(tf.square(output_left - output_right), 1)
positive_loss = l2_loss_pairs
negative_loss = tf.nn.relu(margin - l2_loss_pairs)
final_loss = tf.mul(labels_node, positive_loss) + tf.mul(1. - labels_node, negative_loss)
And that's it ! You can now optimize on this loss, with a good offline sampling.
Upvotes: 7
Reputation: 1303
1) You should do the pair sampling before feeding the data into a session. Label every pair a boolean label, say y = 1 for matched-pair, 0 otherwise.
2) 3) Just calculate both pos/neg terms for every pair, and let the 0-1 label y to choose which to add to the loss.
First create placeholders, y_ is for boolean labels.
dim = 64
x1_ = tf.placeholder('float32', shape=(None, dim))
x2_ = tf.placeholder('float32', shape=(None, dim))
y_ = tf.placeholder('uint8', shape=[None]) # uint8 for boolean
Then the loss tensor can be created by the function.
def loss(x1, x2, y):
# Euclidean distance between x1,x2
l2diff = tf.sqrt( tf.reduce_sum(tf.square(tf.sub(x1, x2)),
reduction_indices=1))
# you can try margin parameters
margin = tf.constant(1.)
labels = tf.to_float(y)
match_loss = tf.square(l2diff, 'match_term')
mismatch_loss = tf.maximum(0., tf.sub(margin, tf.square(l2diff)), 'mismatch_term')
# if label is 1, only match_loss will count, otherwise mismatch_loss
loss = tf.add(tf.mul(labels, match_loss), \
tf.mul((1 - labels), mismatch_loss), 'loss_add')
loss_mean = tf.reduce_mean(loss)
return loss_mean
loss_ = loss(x1_, x2_, y_)
Then feed your data (random generated for example):
batchsize = 4
x1 = np.random.rand(batchsize, dim)
x2 = np.random.rand(batchsize, dim)
y = np.array([0,1,1,0])
l = sess.run(loss_, feed_dict={x1_:x1, x2_:x2, y_:y})
Upvotes: 14