Jonas Pirner

Reputation: 152

custom class-wise loss function in tensorflow

For my problem, I want to predict customer review scores ranging from 1 to 5. I thought it would be better to frame this as a regression problem, because a prediction of 1 when the true value is 5 should count as a "worse" prediction than a 4. I also want the model to perform roughly equally well across all review score classes. Because my dataset is highly unbalanced, I want to create a metric/loss that captures this (similar to what F1 does for classification). I therefore created the following metric (for now only MSE is relevant):

def custom_metric(y_true, y_pred):
    df = pd.DataFrame(np.column_stack([y_pred, y_true]), columns=["Predicted", "Truth"])
    class_mse = 0
    #class_mae = 0
    print("MSE for Classes:")
    for i in df.Truth.unique():
        # MSE computed only on the samples belonging to class i
        temp = df[df["Truth"] == i]
        mse = mean_squared_error(temp.Truth, temp.Predicted)
        #mae = mean_absolute_error(temp.Truth, temp.Predicted)
        print("Class {}: {}".format(i, mse))
        class_mse += mse
        #class_mae += mae
    print()
    print("AVG MSE over Classes {}".format(class_mse / len(df.Truth.unique())))
    #print("AVG MAE over Classes {}".format(class_mae / len(df.Truth.unique())))

Now an example prediction:

import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error, mean_absolute_error

# sample predictions: "model" messed up at class 2 and 3 
y_true = np.array((1,1,1,2,2,2,3,3,3,4,4,4,5,5,5))
y_pred = np.array((1,1,1,2,2,3,5,4,3,4,4,4,5,5,5))

custom_metric(y_true, y_pred)
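For these sample arrays the metric prints a per-class MSE of 0.0 for classes 1, 4 and 5, 1/3 ≈ 0.33 for class 2 and 5/3 ≈ 1.67 for class 3, so the class-averaged MSE works out to 0.4.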

Now my question: is it possible to create a custom TensorFlow loss function that behaves in a similar way? I also worked on this implementation, which is not yet ready for TensorFlow but maybe closer:

def custom_metric(y_true, y_pred):
    mse_class = 0
    num_classes = len(np.unique(y_true))
    stacked = np.vstack((y_true, y_pred))
    for i in np.unique(stacked[0]):
        # select the true and predicted values belonging to class i
        y_true_temp = stacked[0][np.where(stacked[0] == i)]
        y_pred_temp = stacked[1][np.where(stacked[0] == i)]
        mse = np.mean(np.square(y_pred_temp - y_true_temp))
        mse_class += mse
    return mse_class / num_classes

But still, I am not sure how to get rid of the for loop for a TensorFlow-style definition.

Thanks in advance for any help!

Upvotes: 0

Views: 290

Answers (1)

Timbus Calin

Reputation: 14993

The for loop should be replaced entirely by vectorized numpy/tensorflow operations on tensors.

A custom metric example would be:

from keras import backend as K

def custom_mean_squared_error(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)

where y_true is the ground-truth label and y_pred are your predictions. You can see there are no explicit for loops.
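A minimal usage sketch (the model architecture and input shape below are only illustrative, not from the question) could pass this function as the loss and/or as a metric when compiling a Keras model:

    from keras import backend as K
    from keras.models import Sequential
    from keras.layers import Dense

    def custom_mean_squared_error(y_true, y_pred):
        return K.mean(K.square(y_pred - y_true), axis=-1)

    # Illustrative regression model with a single linear output unit
    model = Sequential([Dense(32, activation="relu", input_shape=(10,)),
                        Dense(1)])

    # The custom function can serve as the loss and/or as an extra metric
    model.compile(optimizer="adam",
                  loss=custom_mean_squared_error,
                  metrics=[custom_mean_squared_error])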

The motivation for avoiding for loops is that vectorized operations (available both in numpy and tensorflow) take advantage of modern CPU architectures, turning many iterative operations into matrix operations. For example, a dot-product implementation in numpy typically runs around 30 times faster than an equivalent plain Python for loop.
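To get the class-wise averaged MSE from the question without a Python for loop, one possible sketch (my own, not part of this answer; assuming TensorFlow 2.x and a batch that contains at least one sample per class you care about; the name classwise_mse is made up here) would be:

    import tensorflow as tf

    def classwise_mse(y_true, y_pred):
        # Flatten to 1-D float tensors
        y_true = tf.reshape(tf.cast(y_true, tf.float32), [-1])
        y_pred = tf.reshape(tf.cast(y_pred, tf.float32), [-1])
        # Map every true label to an index of the classes present in this batch
        classes, class_idx = tf.unique(y_true)
        # Squared error per sample, then mean per class, then mean over classes
        sq_err = tf.square(y_pred - y_true)
        per_class_mse = tf.math.unsorted_segment_mean(
            sq_err, class_idx, num_segments=tf.size(classes))
        return tf.reduce_mean(per_class_mse)

With the sample arrays from the question this returns 0.4, matching the NumPy version; note that, like the original metric, it only averages over the classes that actually appear in the batch.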

Upvotes: 1
