Reputation: 1
I am new to deep learning and Keras. I am trying to train a U-Net with a perceptual loss in Keras, and I have a problem with the output image color. My input images are color images (RGB).
If I don't preprocess the input images, i.e. the input is RGB in the 0~255 range, the output looks like this: output image (RGB with 0~255). It is darker than the label image.
I found that the pretrained VGG16 model uses "caffe" weights, and that keras.applications.vgg16.preprocess_input converts RGB to BGR and subtracts the channel mean values. So I tried preprocessing with keras.applications.vgg16.preprocess_input and then deprocessing the output images by adding the mean values back and converting BGR back to RGB. However, the output images are too white: output image (vgg16.preprocess_input).
Then I added an MSE loss at a 10:1 ratio (perceptual loss : MSE). The output is no different from the previous one (vgg16.preprocess_input).
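For reference, the combination I mean looks roughly like this sketch (combined_loss is just an illustrative name; it uses the perceptual_loss defined below):

import keras.backend as K

def combined_loss(y_true, y_pred):
    # 10:1 weighting of perceptual loss to MSE
    return 10. * perceptual_loss(y_true, y_pred) + K.mean(K.square(y_true - y_pred))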
Is this a common problem with perceptual loss, or is there something wrong in my code?
Here is my code.
Preprocess image:
from PIL import Image
from keras.preprocessing.image import load_img, img_to_array
from keras.applications.vgg16 import preprocess_input

img = load_img(datapath, grayscale=False)
img = img.resize((resize_size, resize_size), Image.BILINEAR)
img = img_to_array(img)
img = preprocess_input(img)  # RGB -> BGR, subtract ImageNet channel means
Deprocess image:
# Add the ImageNet channel means back (image is still in BGR order)
mean = [103.939, 116.779, 123.68]
img[..., 0] += mean[0]  # B
img[..., 1] += mean[1]  # G
img[..., 2] += mean[2]  # R
img = img[..., ::-1]    # BGR -> RGB
Perceptual loss:
from keras.applications.vgg16 import VGG16
from keras.models import Model
import keras.backend as K

def perceptual_loss(y_true, y_pred):
    vgg = VGG16(include_top=False, weights='imagenet',
                input_shape=(resize_size, resize_size, 3))
    loss_model = Model(inputs=vgg.input,
                       outputs=vgg.get_layer('block3_conv3').output)
    loss_model.trainable = False
    return K.mean(K.square(loss_model(y_true) - loss_model(y_pred)))
If you have any ideas, please let me know. Thanks a lot!
Upvotes: 0
Views: 1691
Reputation: 939
Something I've seen in practice is that placing a BatchNormalization layer after a ReLU activation can shift the output brightness considerably, leading to darker (or in some cases brighter) images.
This makes sense: ReLU discards all negative values in the feature maps, so batch norm re-centers the all-positive activations downward, gradually reducing the intensity of each pixel.
Without more details about your model it's hard to tell whether this is your case, but I'm posting this because, when I was having this issue, I couldn't find any information about why it happens.
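To make the ordering concrete, here is a minimal sketch of the two arrangements (layer sizes are arbitrary, just for illustration):

from keras.models import Sequential
from keras.layers import Conv2D, Activation, BatchNormalization

# Ordering that can shift brightness: BN re-centers the all-positive
# ReLU activations
block_a = Sequential([
    Conv2D(64, 3, padding='same', input_shape=(256, 256, 3)),
    Activation('relu'),
    BatchNormalization(),
])

# More common ordering: Conv -> BN -> ReLU
block_b = Sequential([
    Conv2D(64, 3, padding='same', input_shape=(256, 256, 3)),
    BatchNormalization(),
    Activation('relu'),
])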
Upvotes: 0
Reputation: 86630
The output of "your" model is not at all related to VGG, caffe, etc.
It's "you" who defines it when you create your model.
So, if your model's outputs must be between 0 and 255, one possibility is to make its last layers:
Activation('sigmoid')        # squashes outputs to [0, 1]
Lambda(lambda x: x * 255.)   # rescales to [0, 255]
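For illustration, a minimal sketch of attaching these to a model output (the single Conv2D stands in for your whole U-Net):

from keras.layers import Input, Conv2D, Activation, Lambda
from keras.models import Model

inp = Input((256, 256, 3))
x = Conv2D(3, 3, padding='same')(inp)   # stand-in for the U-Net decoder
out = Activation('sigmoid')(x)          # outputs in [0, 1]
out = Lambda(lambda t: t * 255.)(out)   # outputs in [0, 255]
model = Model(inp, out)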
You'd then need the preprocess_input function inside the perceptual loss:
def perceptual_loss(y_true, y_pred):
    y_true = preprocess_input(y_true)
    y_pred = preprocess_input(y_pred)
    vgg = VGG16(include_top=False, weights='imagenet',
                input_shape=(resize_size, resize_size, 3))
    loss_model = Model(inputs=vgg.input,
                       outputs=vgg.get_layer('block3_conv3').output)
    loss_model.trainable = False
    return K.mean(K.square(loss_model(y_true) - loss_model(y_pred)))
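A hedged usage sketch (the model name unet is hypothetical):

unet.compile(optimizer='adam', loss=perceptual_loss)  # 'unet' is your model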
Another possibility is to postprocess your model's output (but again, the range of the output is totally defined by you).
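For example, a minimal postprocessing sketch, assuming the model outputs values in the preprocess_input space (the clipping step is an assumption, to keep values in a displayable range):

import numpy as np

def deprocess(img):
    # Add the ImageNet channel means back (BGR order), flip to RGB,
    # and clip to the displayable [0, 255] range
    mean = [103.939, 116.779, 123.68]
    img = img.copy()
    img[..., 0] += mean[0]
    img[..., 1] += mean[1]
    img[..., 2] += mean[2]
    img = img[..., ::-1]
    return np.clip(img, 0, 255).astype('uint8')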
Upvotes: 1