Reputation: 3
I am trying to construct a loss function in Keras that penalizes the minimum distance between each prediction and a set of given values. To do this, I need to calculate the distance from every predicted value to each of the given values.
Example Code
import keras.backend as K

def custom_loss(y_true, y_pred):  # Keras calls the loss as (y_true, y_pred)
    # Given values
    centers = K.constant([[-2.5, -1], [-1.25, -2], [.5, -1], [1.5, .25]])
    num_centers = K.int_shape(centers)[0]
    # Begin constructing distance matrix
    height = K.int_shape(y_pred)[0]
    i = 0
    current_center = K.reshape(K.repeat(K.reshape(centers[i, :], [1, -1]), height), [height, 2])
    current_dist = K.sqrt(K.sum(K.square(y_pred - current_center), axis=1))
    # Values of distance matrix for first center
    Distance = K.reshape(current_dist, [height, 1])
    for i in range(1, num_centers):
        current_center = K.reshape(K.repeat(K.reshape(centers[i, :], [1, -1]), height), [height, 2])
        current_dist = K.sqrt(K.sum(K.square(y_pred - current_center), axis=-1))
        current_dist = K.reshape(current_dist, [height, 1])
        # Iteratively concatenate distances of y_pred from remaining centers
        Distance = K.concatenate([Distance, current_dist], axis=-1)
    # Determine minimum distance from each predicted value to nearest center
    A = K.min(Distance, axis=1)
    # Return average minimum distance as loss
    return K.sum(A) / float(height)
However, I can't remove the function's dependence on the first dimension of y_pred, which is variable. I compute the difference between y_pred and each of the given values by broadcasting, but I broadcast explicitly using the batch size because I don't know how to do it otherwise in Keras. This gives an error, since the batch size is not known when the computational graph is constructed.
How can I avoid explicitly broadcasting? Is there a more effective way of calculating this distance matrix, as the current method is very clumsy?
Upvotes: 0
Views: 862
Reputation: 11895
Your loss function could be implemented using implicit broadcasting as follows:
import keras.backend as K

def custom_loss(y_true, y_pred):
    centers = K.constant([[-2.5, -1], [-1.25, -2], [.5, -1], [1.5, .25]])
    # Expand dimensions to enable implicit broadcasting
    y_pred_r = y_pred[:, None, :]    # Shape: (batch_size, 1, 2)
    centers_r = centers[None, :, :]  # Shape: (1, nb_centers, 2)
    # Compute minimum distance to centers for each element
    distances = K.sqrt(K.sum(K.square(y_pred_r - centers_r), axis=-1))  # Shape: (batch_size, nb_centers)
    min_distances = K.min(distances, axis=-1)  # Shape: (batch_size,)
    # Output average of minimum distances
    return K.mean(min_distances)
NOTE: Not tested.
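Since the snippet above is untested, one way to sanity-check it is to compare its output against a plain-Python reference that computes the same quantity without any broadcasting. The sketch below is only that: `reference_loss` and the sample `batch` are hypothetical names and values introduced here for illustration, and the centers are copied from the question.

```python
import math

# Centers copied from the question.
CENTERS = [(-2.5, -1.0), (-1.25, -2.0), (0.5, -1.0), (1.5, 0.25)]


def reference_loss(y_pred, centers=CENTERS):
    """Mean, over the batch, of the Euclidean distance from each
    2-D prediction to its nearest center."""
    min_distances = [
        min(math.hypot(px - cx, py - cy) for cx, cy in centers)
        for px, py in y_pred
    ]
    return sum(min_distances) / len(min_distances)


# Hypothetical batch of three predictions; the batch size is never hard-coded.
batch = [(0.5, -1.0), (1.5, 1.25), (-2.5, 2.0)]
print(reference_loss(batch))
```

For this batch the nearest-center distances are 0, 1 and 3, so the reference value is 4/3. Evaluating the Keras loss on the same three predictions should give approximately the same number if the broadcasting version behaves as intended.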
Upvotes: 1