Reputation: 257
I'm new to neural networks. I wanted to make a custom loss function in TensorFlow that needs a vector of weights, so I did it this way:
def my_loss(weights):
    def custom_loss(y, y_pred):
        return weights*(y - y_pred)
    return custom_loss
model.compile(optimizer='adam', loss=my_loss(weights), metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=None, validation_data=(x_test, y_test), epochs=100)
When I launch it, I receive this error:
InvalidArgumentError: Incompatible shapes: [50000,10] vs. [32,10]
The shapes are:
print(weights.shape)
print(y_train.shape)
(50000, 10)
(50000, 10)
So I thought it was a problem with the batches. I don't have a strong background in TensorFlow, so I tried to solve it in a naive way using a global variable
batch_index = 0
and then updating it within a custom callback in the "on_batch_begin" hook. But it didn't work, and it was a horrible solution. So, how can I get the part of the weights that corresponds to the current y? Is there a way to get the current batch index inside the custom loss? Thank you in advance for your help.
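Roughly, what I tried looked like this (just a sketch of the idea, with a made-up callback name):

class BatchIndexCallback(tf.keras.callbacks.Callback):
    def on_batch_begin(self, batch, logs=None):
        global batch_index
        batch_index = batch  # the compiled loss never sees this update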
Upvotes: 8
Views: 11815
Reputation: 775
Like @Michael Moretti, I too am new at all this (deep learning, Python, TensorFlow, Keras, ...). This question was asked about 19 months ago, and things move fast in “TF years.”
Apparently, at some point you could just write a Python function with arguments (y_true, y_pred), pass it to your call to model.compile(), and all was well. Now that seems to work in some simple cases, but not in general. While trying to understand why it was not working for me, I found this SO question and other related ones. It was @M.Innat's answer to this question that got me on the right track. But in fact his relevant final example, CustomMSE, is cribbed from the Keras Guide section on Custom Losses. This example shows both how to write a custom loss fully compatible with TensorFlow version 2.7.0, and how to pass additional parameters to it via the constructor of a class based on keras.losses.Loss in the call to model.compile():
class CustomMSE(keras.losses.Loss):
    def __init__(self, regularization_factor=0.1, name="custom_mse"):
        super().__init__(name=name)
        self.regularization_factor = regularization_factor

    def call(self, y_true, y_pred):
        mse = tf.math.reduce_mean(tf.square(y_true - y_pred))
        reg = tf.math.reduce_mean(tf.square(0.5 - y_pred))
        return mse + reg * self.regularization_factor

model.compile(optimizer=keras.optimizers.Adam(), loss=CustomMSE())
For best results, make sure that all computation inside your custom loss function (that is, the call() method of your custom Loss class) is done with TensorFlow operators, and that all input and output data is represented as TF tensors.
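Applying that constructor pattern to the original question's weights, a class-based loss could take them as a parameter. Here is a minimal sketch of my own (not from the guide): it assumes a hypothetical class_weights vector of shape (10,) that broadcasts across the batch dimension, which sidesteps the per-sample alignment problem rather than solving it.

class WeightedMSE(keras.losses.Loss):
    def __init__(self, class_weights, name="weighted_mse"):
        super().__init__(name=name)
        # hypothetical per-output weights, shape (10,), broadcast over the batch
        self.class_weights = tf.constant(class_weights, dtype=tf.float32)

    def call(self, y_true, y_pred):
        # weighted squared error, reduced to a scalar with TF ops only
        return tf.math.reduce_mean(self.class_weights * tf.square(y_true - y_pred))

model.compile(optimizer=keras.optimizers.Adam(), loss=WeightedMSE([1.0] * 10))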
Upvotes: 3
Reputation: 22021
This is a workaround to pass additional arguments to a custom loss function, in your case an array of weights. The trick consists in using fake inputs, which help build and use the loss in the correct way. Don't forget that Keras handles a fixed batch dimension.
I provide a dummy example for a regression problem:
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from tensorflow.keras import backend as K

def mse(y_true, y_pred, weights):
    error = y_true - y_pred
    return K.mean(K.square(error) + K.sqrt(weights))

X = np.random.uniform(0,1, (1000,10))
y = np.random.uniform(0,1, 1000)
w = np.random.uniform(0,1, 1000)

inp = Input((10,))
true = Input((1,))
weights = Input((1,))

x = Dense(32, activation='relu')(inp)
out = Dense(1)(x)

# the loss is built from the extra inputs and attached with add_loss
m = Model([inp, true, weights], out)
m.add_loss(mse(true, out, weights))
m.compile(loss=None, optimizer='adam')
m.fit(x=[X, y, w], y=None, epochs=3)

## final fitted model to compute predictions (remove the weights input if not needed)
final_m = Model(inp, out)
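After fitting, predictions come from the single-input model, e.g.:

preds = final_m.predict(X)  # shape (1000, 1)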
Upvotes: 5
Reputation: 1079
Keras allows you to take any tensors from the global scope. Actually, y_true and y_pred might even go unused, as here. Your model can have multiple inputs (you can make this input dummy at inference time, or load the weights into a model with a single input). Notice that you still need it for validation.
import keras
from keras.layers import *
from keras import backend as K
import numpy as np

inputs_x = Input(shape=(10,))
inputs_w = Input(shape=(10,))

y = Dense(10, kernel_initializer='glorot_uniform')(inputs_x)

model = keras.Model(inputs=[inputs_x, inputs_w], outputs=[y])

def my_loss(y_true, y_pred):
    return K.abs((y_true - y_pred) * inputs_w)

def my_metrics(y_true, y_pred):
    # just to output something
    return K.mean(inputs_w)

model.compile(optimizer='adam', loss=[my_loss], metrics=[my_metrics])

data = np.random.normal(size=(50000, 10))
labels = np.random.normal(size=(50000, 10))
weights = np.random.normal(size=(50000, 10))

model.fit([data, weights], labels, batch_size=256,
          validation_data=([data[:100], weights[:100]], labels[:100]), epochs=100)
To run validation without weights, you need to compile another version of the model with a different loss that does not use weights.
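For example, such an unweighted validation model could be built from the same layers (a minimal sketch, assuming the graph above is still in scope; eval_model is a name I made up and the loss choice is arbitrary):

eval_model = keras.Model(inputs=inputs_x, outputs=y)
eval_model.compile(optimizer='adam', loss='mae')
eval_model.evaluate(data[:100], labels[:100])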
UPD: Also notice that Keras will sum up all the elements of your loss if it returns an array instead of a scalar.
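If you'd rather not rely on that implicit reduction, you can reduce to a scalar yourself (a sketch; my_loss_scalar is just an illustrative variant of my_loss above):

def my_loss_scalar(y_true, y_pred):
    # explicitly average the weighted error instead of returning an array
    return K.mean(K.abs((y_true - y_pred) * inputs_w))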
UPD: For tensorflow 2.1.0 things become more complicated, it seems. The way to go is in the direction @marco-cerliani pointed out (labels, weights and data are fed to the model, and the custom loss tensor is added via .add_loss()); however, his solution didn't work for me out of the box. The first problem is that the model does not want to work with a None loss, refusing to take both inputs and outputs. So I introduced an additional dummy loss function. The second problem appeared when the dataset size was not divisible by the batch size. In Keras and TF 1.x the last-batch problem was usually solved by the steps_per_epoch and validation_steps parameters, but here it starts to fail on the first batch of epoch 2. So I needed to make a simple custom data generator.
import tensorflow.keras as keras
from tensorflow.keras.layers import *
from tensorflow.keras import backend as K
import numpy as np

inputs_x = Input(shape=(10,))
inputs_w = Input(shape=(10,))
inputs_l = Input(shape=(10,))

y = Dense(10, kernel_initializer='glorot_uniform')(inputs_x)

model = keras.Model(inputs=[inputs_x, inputs_w, inputs_l], outputs=[y])

def my_loss(y_true, y_pred):
    return K.abs((y_true - y_pred) * inputs_w)

def my_metrics():
    # just to output something
    return K.mean(inputs_w)

def dummy_loss(y_true, y_pred):
    return 0.

loss = my_loss(y, inputs_l)
metric = my_metrics()

model.add_loss(loss)
model.add_metric(metric, name='my_metric', aggregation='mean')
model.compile(optimizer='adam', loss=dummy_loss)

data = np.random.normal(size=(50000, 10))
labels = np.random.normal(size=(50000, 10))
weights = np.random.normal(size=(50000, 10))
dummy = np.zeros(shape=(50000, 10))  # or it can be labels, no matter now

# looks like it does not like it when len(data) % batch_size != 0
# If I set steps_per_epoch, it fails on the second epoch.
# So, I proceeded with a data generator

class DataGenerator(keras.utils.Sequence):
    'Generates data for Keras'
    def __init__(self, x, w, y, y2, batch_size, shuffle=True):
        'Initialization'
        self.x = x
        self.w = w
        self.y = y
        self.y2 = y2
        self.indices = list(range(len(self.x)))
        self.shuffle = shuffle
        self.batch_size = batch_size
        self.on_epoch_end()

    def __len__(self):
        'Denotes the number of batches per epoch'
        return len(self.indices) // self.batch_size

    def __getitem__(self, index):
        'Generate one batch of data'
        # Generate indices of the batch
        ids = self.indices[index * self.batch_size:(index + 1) * self.batch_size]
        # the last None is there to remove a weird warning
        # https://stackoverflow.com/questions/59317919
        return [self.x[ids], self.w[ids], self.y[ids]], self.y2[ids], [None]

    def on_epoch_end(self):
        'Updates indices after each epoch'
        if self.shuffle:
            np.random.shuffle(self.indices)

batch_size = 256
train_generator = DataGenerator(data, weights, labels, dummy,
                                batch_size=batch_size, shuffle=True)
val_generator = DataGenerator(data[:2 * batch_size], weights[:2 * batch_size],
                              labels[:2 * batch_size], dummy[:2 * batch_size],
                              batch_size=batch_size, shuffle=True)
model.fit(x=train_generator, validation_data=val_generator, epochs=100)
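Note that __len__ uses integer division, so the samples after the last full batch are simply dropped for that epoch (and reshuffled by on_epoch_end), which is what avoids the divisibility problem.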
Upvotes: 3