Reputation: 1135
I am trying to do pose-conditioned face generation. The idea is to use an auto-encoding network that takes an image as an input and a conditional vector containing the pose information, such that the generated image is conditioned on the conditional vector.
To compute the MSE loss, I generate the face landmarks of the predicted image using a face-landmark detection library. Unfortunately, during the early epochs the network produces garbage, and the face-landmark detection library returns None instead of the expected tensor of the form 256 x 256 x 3, where a pixel value indicates the presence of a landmark.
What I would like to do is to ignore computing the loss when no face has been detected.
Example: assume that my batch has the form 10 x 256 x 256 x 3, where 10 is the batch_size and 256x256 is the dimension of the image with 3 channels. For the predictions, let's assume that no landmarks could be generated for 3 images in the batch. I could set the prediction tensors for which no face landmarks could be generated to NaN values, so the predicted landmarks would still have the form 10 x 256 x 256 x 3. In my MSE loss function, I would like to ignore gradient computation originating from the tensors containing the NaN values. To keep things simple, I want to ignore those 3 individual tensors that have all NaN values.
Any help would be appreciated. I have a backup with a for loop but that is sub-optimal.
This is some sample code -
import random

import numpy as np
import torch

## SAMPLE CODE ##
# some dummy code to simulate generating landmark images
def generate_mesh_from_image(image_array):
    rand = random.randint(0, 1)
    if rand == 0:
        return None
    else:
        return np.random.randn(3, 256, 256)

image_tensors = torch.randn(10, 3, 256, 256)
tensor_meshes = list()
for image_tensor in image_tensors:
    image_array = image_tensor.detach().cpu().numpy()
    image_landmarks = generate_mesh_from_image(image_array)  # returns either None or landmark image of dimension -> 3 x 256 x 256
    if image_landmarks is None:
        # np.empty needs a shape tuple and leaves values uninitialized; np.full gives actual NaN values
        image_landmarks = np.full((3, 256, 256), np.nan)
    # convert to tensor directly; a ToPILImage/ToTensor round trip would not preserve the NaN values
    landmark_tensor = torch.from_numpy(image_landmarks).float()
    tensor_meshes.append(landmark_tensor)
# convert tensor_meshes to a torch tensor of dimension -> 10 x 3 x 256 x 256
tensor_meshes = torch.stack(tensor_meshes)
The loss now needs to be computed between the prediction tensor (10 x 3 x 256 x 256) and the ground truth tensor (10 x 3 x 256 x 256). However, some samples in the prediction tensor consist entirely of NaN values, and I would like to ignore those during the loss computation.
Upvotes: 0
Views: 1154
Reputation: 40768
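You can create a mask containing ones for images with landmarks and zeros for those without. Then compute the loss per sample on the whole batch (without reducing over the batch dimension), apply the mask, and average over the valid samples before performing back propagation. Use a finite placeholder such as zeros for the missing landmarks rather than NaN, or replace the NaN values before the loss, because NaN multiplied by zero is still NaN.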
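A minimal sketch of the masking idea, assuming the missing predictions are marked as all-NaN samples as in the question. The helper name masked_mse_loss is hypothetical, and the NaN values are swapped for zeros with torch.where before the loss so that the mask can zero them out cleanly. A sample that is only partly NaN would still count as valid here, with its NaN entries treated as zeros.

```python
import torch
import torch.nn.functional as F


def masked_mse_loss(pred, target):
    """MSE over the samples of `pred` that are not entirely NaN (hypothetical helper)."""
    nan_mask = torch.isnan(pred)
    # valid[i] is True when sample i contains at least one non-NaN value
    valid = ~nan_mask.flatten(start_dim=1).all(dim=1)

    # Replace NaNs with zeros first: NaN * 0 is still NaN, so masking alone is not enough
    pred_clean = torch.where(nan_mask, torch.zeros_like(pred), pred)

    # Per-sample MSE, shape (batch,)
    per_sample = (
        F.mse_loss(pred_clean, target, reduction="none")
        .flatten(start_dim=1)
        .mean(dim=1)
    )

    # Zero out invalid samples and average over the valid ones only
    weights = valid.to(per_sample.dtype)
    return (per_sample * weights).sum() / weights.sum().clamp(min=1.0)


# Usage: samples 1, 4 and 7 stand in for images where no face was detected
pred = torch.randn(10, 3, 256, 256)
pred[[1, 4, 7]] = float("nan")
pred.requires_grad_()
target = torch.randn(10, 3, 256, 256)

loss = masked_mse_loss(pred, target)
loss.backward()  # gradients for the masked samples are zero, not NaN
```

The clamp on the denominator keeps the loss at zero instead of dividing by zero when no face is detected in the whole batch, which can happen in the early epochs.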
Upvotes: 0