Kevin Johnston

Reputation: 3

Keras loss function dependent on batch size

I am trying to construct a loss function in Keras that penalizes the minimum distance between each prediction and a set of given values. To do this, I need to calculate the distance between every predicted value and every given value.

Example Code

def custom_loss(y_test,y_pred):


    #Given values
    centers=K.constant([[-2.5,-1],[-1.25,-2],[.5,-1],[1.5,.25]])
    num_centers=K.int_shape(centers)[0]


    #Begin constructing distance matrix
    height=K.int_shape(y_pred)[0]
    i=0
    current_center=K.reshape(K.repeat(K.reshape(centers[i,:],[1,-1]),height),[height,2])
    current_dist=K.sqrt(K.sum(K.square(y_pred-current_center),axis=1))


    #Values of distance matrix for first center
    Distance=K.reshape(current_dist,[height,1])


    for i in range(1,num_centers):
        current_center=K.reshape(K.repeat(K.reshape(centers[i,:],[1,-1]),height),[height,2])
        current_dist=K.sqrt(K.sum(K.square(y_pred-current_center),axis=-1))
        current_dist=K.reshape(current_dist,[height,1])


        #Iteratively concatenate distances of y_pred from remaining centers
        Distance=K.concatenate([Distance,current_dist],axis=-1)

    #Determine minimum distance from each predicted value to nearest center
    A=K.min(Distance,axis=1)


    #Return average minimum distance as loss
    return K.sum(A)/float(height)

However, I can't remove the function's dependence on the first dimension of y_pred, which is variable. I compute the difference between y_pred and each of the given values by broadcasting explicitly with the batch size, because I don't know how to do this in Keras without it. This raises an error, since the batch size is not known when the computational graph is constructed.

How can I avoid explicitly broadcasting? Is there a more effective way of calculating this distance matrix, as the current method is very clumsy?

Upvotes: 0

Views: 862

Answers (1)

rvinas

Reputation: 11895

Your loss function could be implemented using implicit broadcasting as follows:

import keras.backend as K


def custom_loss(y_true, y_pred):
    centers = K.constant([[-2.5, -1], [-1.25, -2], [.5, -1], [1.5, .25]])

    # Expand dimensions to enable implicit broadcasting
    y_pred_r = y_pred[:, None, :]  # Shape: (batch_size, 1, 2)
    centers_r = centers[None, :, :]  # Shape: (1, nb_centers, 2)

    # Compute minimum distance to centers for each element
    distances = K.sqrt(K.sum(K.square(y_pred_r - centers_r), axis=-1))  # Shape: (batch_size, nb_centers)
    min_distances = K.min(distances, axis=-1)  # Shape: (batch_size,)

    # Output average of minimum distances
    return K.mean(min_distances)

NOTE: Not tested.
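
One way to sanity-check the broadcasting logic without building a model is to mirror it in NumPy, whose indexing and broadcasting rules match the ones used above. The sketch below is only an illustration: the function name `min_center_distance_loss` and the two sample batches are hypothetical, and NumPy stands in for the Keras backend. It shows that the same function handles batches of different sizes without ever reading the batch size.

```python
import numpy as np


def min_center_distance_loss(y_pred, centers):
    """NumPy mirror of the broadcasting loss (hypothetical helper, for checking only)."""
    y_pred_r = y_pred[:, None, :]    # shape: (batch_size, 1, 2)
    centers_r = centers[None, :, :]  # shape: (1, nb_centers, 2)

    # Broadcasting the subtraction yields shape (batch_size, nb_centers, 2);
    # summing over the last axis leaves one distance per (sample, center) pair.
    distances = np.sqrt(np.sum(np.square(y_pred_r - centers_r), axis=-1))

    # Nearest-center distance per sample, averaged over the batch
    return distances.min(axis=-1).mean()


centers = np.array([[-2.5, -1.0], [-1.25, -2.0], [0.5, -1.0], [1.5, 0.25]])

# Two made-up batches with different sizes (2 and 3 samples)
batch_a = np.array([[0.5, -1.0], [1.5, 1.25]])
batch_b = np.array([[-2.5, -1.0], [-1.25, -2.0], [0.5, 0.0]])

loss_a = min_center_distance_loss(batch_a, centers)
loss_b = min_center_distance_loss(batch_b, centers)
print(loss_a, loss_b)
```

In `batch_a` the first point sits on a center and the second is exactly 1 away from its nearest center, so the average is 0.5. In `batch_b` two points sit on centers and the third is 1 away, so the average is 1/3.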

Upvotes: 1
