sergey_208

Reputation: 654

How to apply a different loss function for each batch

After getting a response to this question, I realized that I have a different question.

I would like the objective to have a different component depending on the batch being passed during a training step. Suppose my batch size is one, and that each training sample is associated with two supporter vectors that are not themselves part of the training data. So I need to figure out which input sample is currently being processed.

import numpy as np
import keras.backend as K
from keras.layers import Dense, Input
from keras.models import Model

features = np.random.rand(100, 5)
labels = np.random.rand(100, 2)

holder = np.random.rand(200, 5)  # each feature gets two supporters
keys = np.arange(start=1, stop=holder.shape[0] + 1, step=1)  # keys 1..200, one per supporter row
supporters = {}

for i, j in zip(keys, holder):  # keys (2k+1, 2k+2) hold the two supporters of features[k]
    supporters[i] = j

For instance, the first two entries of supporters belong to the first point in features.

features[0]  [0.71444629 0.77256729 0.95375736 0.18759234 0.8207317 ]

has the following two supporters.

1: array([0.76281692, 0.18698215, 0.11687052, 0.78084761, 0.10293403]), 
2: array([0.98229912, 0.08784577, 0.08109571, 0.23665783, 0.52587238])
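
In other words, for a 0-based feature index k the two supporters sit at keys 2k + 1 and 2k + 2 (a small lookup sketch of mine, assuming the 1-based keys built above):

k = 2  # third training point
s1, s2 = supporters[2 * k + 1], supporters[2 * k + 2]  # the fifth and sixth supporter vectors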

Now, I create a simple model.

# Simple neural net with two outputs
input_layer = Input((5,))
hidden_layer = Dense(16)(input_layer)
output_layer = Dense(2)(hidden_layer)


# Model
model = Model(inputs=input_layer, outputs=output_layer)

My goal is to create a loss function like this:

def custom_loss(y_true, y_pred):
    # Normal MSE loss
    mse = K.mean(K.square(y_true - y_pred), axis=-1)
    # Assume that I can properly pass the model object into this method and
    # use its predict method to evaluate the supporters with the current weights
    new_constraint = K.sum(y_pred - model.predict(supporters))
    return (mse + new_constraint)

Then, I go ahead and compile my model.

model.compile(loss=custom_loss, optimizer='sgd')
model.fit(features, labels, epochs=1, batch_size=1)

The problem is that, since the batch size is one, I want to make sure that the loss function only considers the supporters of the current training input. For example, if I am training on the third point in features, then I want to use the fifth and sixth supporter vectors while creating new_constraint. How can I accomplish this?

Upvotes: 2

Views: 1061

Answers (1)

Abhishek Prajapat

Reputation: 1878

You can implement it like this (I have used the TensorFlow-based Keras API, but it shouldn't matter):

import numpy as np
import tensorflow as tf
from tensorflow.keras import Input, layers, Model
from tensorflow.keras import backend as K

features = np.random.rand(100, 5)
labels = np.random.rand(100, 2)

supporters = np.random.rand(200, 5)  # each feature gets two supporters


# Split the supporters into two aligned arrays so we can iterate over both
supporters_1 = supporters[::2, :]
supporters_2 = supporters[1::2, :]

print(supporters_1.shape, supporters_2.shape)
# Result -> ((100, 5), (100, 5))
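
A quick alignment check (my addition, not part of the original answer): the interleaved slices should pair each feature with its two supporter rows.

assert np.array_equal(supporters_1[0], supporters[0])  # first supporter of features[0]
assert np.array_equal(supporters_2[0], supporters[1])  # second supporter of features[0]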

# Create a tf dataset to use in training
dataset = tf.data.Dataset.from_tensor_slices(((features, supporters_1, supporters_2), labels)).batch(1)

# A look at what it returns
for i in dataset:
    print(i)
    break

'''
Result:
((<tf.Tensor: shape=(1, 5), dtype=float64, numpy=array([[0.42834492, 0.01041871, 0.53058175, 0.69453215, 0.83901092]])>, 
<tf.Tensor: shape=(1, 5), dtype=float64, numpy=array([[0.1724601 , 0.14386688, 0.49018201, 0.13565471, 0.35159235]])>, 
<tf.Tensor: shape=(1, 5), dtype=float64, numpy=array([[0.87243349, 0.98779049, 0.98405784, 0.74069913, 0.25763667]])>), 
<tf.Tensor: shape=(1, 2), dtype=float64, numpy=array([[0.20993531, 0.70153453]])>)
'''

#=========================================================
# Creating the model (input size 5 and label size 2, matching your sample)
input_layer = Input((5,))
hidden_layer = layers.Dense(16)(input_layer)
output_layer = layers.Dense(2)(hidden_layer)

# Model
model = Model(inputs=input_layer, outputs=output_layer)
#=========================================================


# Implementing the custom loss
# Without the `K.abs` the constraint term could be negative, hence the `K.abs`
def custom_loss(y_true, y_pred, support_pred_1, support_pred_2):
    mse = tf.keras.losses.mse(y_true, y_pred)
    new_constraint = K.abs(K.sum(y_pred -  [support_pred_1, support_pred_2]))
    return (mse+new_constraint)
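
As a quick sanity check (my addition, not from the original answer), the loss can be called on dummy tensors to confirm that the shapes broadcast as expected:

# Dummy tensors with the same shapes the batched dataset produces
y_true_d = tf.zeros((1, 2))
y_pred_d = tf.ones((1, 2))
sp1_d = tf.fill((1, 2), 0.5)
sp2_d = tf.fill((1, 2), 0.25)
print(custom_loss(y_true_d, y_pred_d, sp1_d, sp2_d))  # tensor of shape (1,)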

# Instantiate an optimizer.
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-3)


'''
Now we create a custom training loop. In it we get the logits for all of
the inputs, compute the loss with the custom loss function, and then
optimize on that loss.
'''
epochs = 10
for epoch in range(epochs):
    print("Start of epoch %d" % (epoch,))
    for step, ((batch_features, support_1, support_2), batch_labels) in enumerate(dataset):

        with tf.GradientTape() as tape:
            # Forward passes for the training sample and its two supporters
            logits = model(batch_features, training=True)
            logits_1 = model(support_1, training=True)
            logits_2 = model(support_2, training=True)

            loss_value = custom_loss(batch_labels, logits, logits_1, logits_2)

        grads = tape.gradient(loss_value, model.trainable_weights)
        optimizer.apply_gradients(zip(grads, model.trainable_weights))

    print('loss_value: ', loss_value)
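
As an optional refinement (my addition, not part of the original answer), the inner step can be wrapped in tf.function so TensorFlow traces it into a graph, which is usually much faster than running the loop eagerly:

@tf.function
def train_step(batch_features, support_1, support_2, batch_labels):
    with tf.GradientTape() as tape:
        logits = model(batch_features, training=True)
        logits_1 = model(support_1, training=True)
        logits_2 = model(support_2, training=True)
        loss_value = custom_loss(batch_labels, logits, logits_1, logits_2)
    grads = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    return loss_value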

EDIT: There is another way to do this, shown below:

# Everything same till the supporters_1, supporters_2

def combine(inputs, targets):
    features = inputs[0]
    supports1 = inputs[1]
    supports2 = inputs[2]
    # Stack the three (1, 5) rows into a single (3, 5) "batch"
    final = tf.stack((features, supports1, supports2))
    final = tf.reshape(final, (3, 5))
    return final, targets


# Creating the dataset
dataset = tf.data.Dataset.from_tensor_slices(((features, supporters_1, supporters_2), labels)).batch(1)
dataset = dataset.map(combine, num_parallel_calls=-1)  # -1 means tf.data.AUTOTUNE

# Check the output
for i in dataset:
    print(i)
    break
'''
(<tf.Tensor: shape=(3, 5), dtype=float64, numpy=
array([[0.35641985, 0.93025517, 0.72874829, 0.81810538, 0.46682277],
       [0.95497516, 0.71722253, 0.10608685, 0.37267656, 0.94748968],
       [0.04822454, 0.00480376, 0.08479184, 0.51133809, 0.38242403]])>, <tf.Tensor: shape=(1, 2), dtype=float64, numpy=array([[0.21399956, 0.97149716]])>)
'''

#================MODEL=================
input_layer = Input((5,))
hidden_layer = layers.Dense(16)(input_layer)
output_layer = layers.Dense(2)(hidden_layer)

# Model
model = Model(inputs=input_layer, outputs=output_layer)
#=======================================

# Change the loss function accordingly
'''
The first row of y_pred is the prediction for the actual features and
the remaining rows are the predictions for the supporters, so the loss
function can be changed as below.
'''
def custom_loss(y_true, y_pred):
    mse = tf.keras.losses.mse(y_true, y_pred[0, :])
    new_constraint = K.abs(K.sum(y_pred[0, :] -  y_pred[1:, :]))
    return (mse+new_constraint)
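
Again a small shape check of mine (not in the original answer): with a stacked (3, 5) input the model emits a (3, 2) y_pred, and the slices behave as described.

dummy_true = tf.zeros((1, 2))
dummy_pred = tf.ones((3, 2))
print(custom_loss(dummy_true, dummy_pred))  # tensor of shape (1,); the constraint term is 0 here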


# Compile
model.compile(loss=custom_loss, optimizer='adam')
# train
model.fit(dataset, epochs=5)
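
One usage note (mine, not from the original answer): the model itself still takes a plain (batch, 5) input, so at inference time you can predict without stacking any supporters:

preds = model.predict(features[:1])  # shape (1, 2)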

Upvotes: 1
