Reputation: 607
I am trying to prepare masks for image segmentation with PyTorch. I have three questions about data preparation.
What is the appropriate file format for saving a binary mask in general? PNG? JPEG?
Does the mask need to be square, such as (224x224), or can it be a rectangle such as (224x448)?
Are the mask values preserved when the mask is resized from a rectangle to a square?
For example, the original mask image is (600x900) and binary, with values in [0,1]. However, when I applied
import torchvision.transforms as transforms

transforms.Compose([
    transforms.Resize((300, 300)),
    transforms.ToTensor(),
])
to the mask, the output contained values other than 0 and 1, such as 0.01, 0.0156, 0.22..., because the mask had been resized.
I applied the code below to make the mask binary again: values less than or equal to 0.3 become 0, and all other values become 1.
def __getitem__(self, idx):
    img, mask = self.load_data(idx)
    if self.img_transforms is not None:
        img = self.img_transforms(img)
    if self.mask_transforms is not None:
        mask = self.mask_transforms(mask)
        mask = torch.where(mask <= 0.3, 0, 1)
    return img, mask
but I wonder whether this is a common and efficient approach.
Upvotes: 0
Views: 2064
Reputation: 1690
from torchvision.transforms import InterpolationMode
transforms.Resize((300, 300), interpolation=InterpolationMode.NEAREST)
Upvotes: 1