Lucien Ledune

Reputation: 120

Keras custom loss with one of the features used and a condition

I'm trying to write a custom loss function for the deviance in Keras. Deviance is calculated as: 2 * (log(yTrue) - log(yPred))

The problem here is that my yTrue values are rare event counts and therefore often equal to 0, which results in a -inf error.
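
For instance, the log of zero evaluates to -inf in the backend (a minimal check, assuming the Keras backend is imported as KB):

    from keras import backend as KB

    print(KB.eval(KB.log(KB.constant(0.0))))  # -inf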

The derivation of deviance for my specific case (unscaled Poisson deviance, with d the extra feature weighting each observation) gives a solution to this:

    D = 2 * d * (yTrue * ln(yTrue) - yTrue * ln(yPred) - yTrue + yPred)    if yTrue > 0
    D = 2 * d * yPred                                                      if yTrue = 0

There are two problems I encounter here:

I made a first iteration of a loss function before deriving the deviance, adding small values to yTrue when it is equal to 0 to prevent the -Inf problem, but it gives wrong results for the deviance, so I have to change it.

def DevianceBis(y_true, y_pred):
    y_pred = KB.maximum(y_pred, KB.epsilon())  # make sure y_pred is positive, or log(-x) = NaN
    return KB.sqrt(KB.square(2 * KB.log(y_true + KB.epsilon()) - KB.log(y_pred)))
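
A likely reason the epsilon trick skews the result: KB.epsilon() defaults to 1e-7, so every zero count picks up a log term of about -16.1 instead of the 0 the derived formula calls for. A quick check in plain NumPy:

    import numpy as np

    print(np.log(1e-7))  # about -16.12, a large spurious term for each zero count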

I'd like to know how to pass the D values into the loss function and how to use an if statement to choose the correct expression.

Thanks in advance

EDIT:

I tried this one, but it returns NaN:

    def custom_loss(data, y_pred):
        y_true = data[:, 0]
        d = data[:, 1:]
        # condition: 1.0 where y_true == 0, 0.0 otherwise
        mask = KB.cast(KB.equal(y_true, 0), KB.floatx())
        # calculate loss using d...
        loss_value = mask * (2 * d * y_pred) + (1 - mask) * 2 * d * (y_true * KB.log(y_true) - y_true * KB.log(y_pred) - y_true + y_pred)
        return loss_value
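
The NaN most likely comes from the masked-out branch: KB.log(y_true) is -inf where y_true is 0, and multiplying by a zero mask does not cancel it, because 0 * inf is NaN. A minimal sketch of the propagation (illustrative values):

    from keras import backend as KB

    zero = KB.constant(0.0)
    print(KB.eval(zero * KB.log(zero)))  # nan: masking after the log is too late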


    def baseline_model():
        # build the model
        model = keras.Sequential()
        model.add(Dense(5, input_dim = 26, activation = "relu"))
        #model.add(Dense(10, activation = "relu"))
        model.add(Dense(1, activation = "exponential"))
        model.compile(loss=custom_loss, optimizer='RMSProp')
        return model

model = baseline_model()
model.fit(data2, np.append(y2, d, axis = 1), epochs=1, shuffle=True, verbose=1)
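
For reference, the np.append call stacks the targets and the D values side by side so the loss can unpack them; a small sketch with made-up values (y2 and d assumed to be N x 1 arrays):

    import numpy as np

    y2 = np.array([[0.], [3.], [1.]])   # event counts, N x 1
    d  = np.array([[1.], [2.], [.5]])   # D values, N x 1
    labels = np.append(y2, d, axis=1)   # N x 2: column 0 is y_true, column 1 is d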

EDIT 2:

def custom_loss(data, y_pred):

    y_true = data[:, 0]
    d = data[:, 1:]
    # condition: 1.0 where y_true != 0, 0.0 otherwise
    mask2 = KB.cast(KB.not_equal(y_true, 0), KB.floatx())
    #calculate loss using d...
    loss_value = 2 * d * y_pred + mask2 * (2 * d * y_true * KB.log(y_true) + 2 * d * y_true * KB.log(y_pred) - 2 * d * y_true)
    return loss_value

EDIT 3: this seems to work without the logs (although it isn't the result I am looking for):

def custom_loss(data, y_pred):

    y_true = data[:, 0]
    d = data[:, 1]
    # condition: 1.0 where y_true != 0, 0.0 otherwise
    mask2 = KB.cast(KB.not_equal(y_true, 0), KB.floatx())
    #calculate loss using d...
    loss_value = 2 * d * y_pred #+ mask2 * (2 * d * y_true * KB.log(y_true) + 2 * d * y_true * KB.log(y_pred) - 2 * d * y_true)
    return loss_value


def baseline_model():
    # build the model
    model = keras.Sequential()
    model.add(Dense(5, input_dim = 26, activation = "relu"))
    #model.add(Dense(10, activation = "relu"))
    model.add(Dense(1, activation = "exponential"))
    model.compile(loss=custom_loss, optimizer='RMSProp')
    return model

model = baseline_model()
model.fit(data2, np.append(y2, d, axis = 1), epochs=1, shuffle=True, verbose=1)

EDIT again:

def custom_loss3(data, y_pred):

    y_true = data[:, 0]
    d = data[:, 1]
    # pick the expression depending on whether y_true > 0
    loss_value = KB.switch(KB.greater(y_true, 0),
                           2 * d * y_pred,
                           2 * d * (y_true * KB.log(y_true + KB.epsilon())
                                    - y_true * KB.log(y_pred + KB.epsilon())
                                    - y_true + y_pred))
    return loss_value
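
KB.switch selects elementwise between the two expressions when the condition is a tensor; a minimal sketch of that behavior (illustrative values):

    from keras import backend as KB

    y = KB.constant([0.0, 2.0])
    out = KB.switch(KB.greater(y, 0), y * 10.0, y - 1.0)
    print(KB.eval(out))  # prints [-1. 20.]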

Upvotes: 0

Views: 1254

Answers (2)

Lucien Ledune

Reputation: 120

So here's the final answer... after days, I finally found how to do it.

def custom_loss3(data, y_pred):
    y_true = data[:, 0]
    d = data[:, 1]

    # compute the logs up front, substituting 0 where the argument is 0
    lnYTrue = KB.switch(KB.equal(y_true, 0), KB.zeros_like(y_true), KB.log(y_true))
    lnYPred = KB.switch(KB.equal(y_pred, 0), KB.zeros_like(y_pred), KB.log(y_pred))
    # y_pred is N x 1 while y_true is length N, so take column 0 to avoid N x N broadcasting
    loss_value = 2 * d * (y_true * lnYTrue - y_true * lnYPred[:, 0] - y_true + y_pred[:, 0])
    return loss_value

Calculate the logs before the actual loss expression, and substitute KB.zeros_like for them when y_true is 0. You also need to take only the first column of y_pred: y_pred is an N x 1 matrix while y_true is a vector of length N, so combining them directly would broadcast to an N x N matrix.
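
The broadcasting pitfall is easy to reproduce; a small NumPy sketch (shapes are illustrative):

    import numpy as np

    y_true = np.array([1., 2., 3.])        # shape (3,)
    y_pred = np.array([[1.], [2.], [3.]])  # shape (3, 1), as the model outputs it
    print((y_true - y_pred).shape)         # (3, 3) -- silent blowup
    print((y_true - y_pred[:, 0]).shape)   # (3,)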

I also had to delete the rows with d = 0 from the data (they weren't of much use anyway).

Upvotes: 2

Anakin

Reputation: 2010

If D is a feature of the input vector, you can pad your labels with the extra D columns from the input and write a custom loss. You can pass the extra information alongside your labels as a NumPy array, like this:

    def custom_loss(data, y_pred):
        y_true = data[:, 0]
        d = data[:, 1:]
        # condition: 1.0 where y_true != 0, 0.0 otherwise
        mask = K.cast(K.not_equal(y_true, 0), K.floatx())
        # calculate loss using d...
        loss_value = (1 - mask) * (2 * d * y_pred) + mask * (2 * d * (y_true * K.log(y_true) - y_true * K.log(y_pred) - y_true + y_pred))
        return loss_value


    def baseline_model():
        # create model
        i = Input(shape=(5,))
        x = Dense(5, kernel_initializer='glorot_uniform', activation='linear')(i)
        o = Dense(1, kernel_initializer='normal', activation='linear')(x)
        model = Model(i, o)
        model.compile(loss=custom_loss, optimizer=Adam(lr=0.0005))
        return model


    model = baseline_model()
    model.fit(X, np.append(Y_true, d, axis=1), batch_size=batch_size, epochs=90, shuffle=True, verbose=1)

EDIT:

I added the mask for the conditional statement. I am not exactly sure it will work exactly that way; note that the mask has to be cast to a float tensor before it can be used in arithmetic, because K.not_equal returns booleans.
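
For what it's worth, the cast does seem to be needed; a minimal check (illustrative values):

    from keras import backend as K

    mask = K.cast(K.not_equal(K.constant([0., 2.]), 0), K.floatx())
    print(K.eval(mask))  # [0. 1.]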

Upvotes: 2
