Ant

Reputation: 1153

How to change a PyTorch CNN to take color images instead of black and white?

This code I found has a neural net that is set up to take black and white images. (It's a Siamese network, but that part's not relevant.) When I change it to take my images WITHOUT converting them to black and white, I get the error shown below.
I tried changing the first Conv2d (the sixth line down) from 1 input channel to 3:

import torch.nn as nn

class SiameseNetwork(nn.Module):
    def __init__(self):
        super(SiameseNetwork, self).__init__()
        # Three conv blocks; each ReflectionPad2d(1) + kernel_size=3 pair
        # keeps the spatial size unchanged, so 300x300 stays 300x300.
        self.cnn1 = nn.Sequential(
            nn.ReflectionPad2d(1),
            # was nn.Conv2d(1, 4, kernel_size=3) -- 1 input channel for B&W
            nn.Conv2d(3, 4, kernel_size=3),  # 3 input channels for RGB
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(4),

            nn.ReflectionPad2d(1),
            nn.Conv2d(4, 8, kernel_size=3),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(8),

            nn.ReflectionPad2d(1),
            nn.Conv2d(8, 8, kernel_size=3),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(8))

        self.fc1 = nn.Sequential(
            # 8 output channels * 300 * 300 spatial positions, flattened
            nn.Linear(8*300*300, 500),
            nn.ReLU(inplace=True),

            nn.Linear(500, 500),
            nn.ReLU(inplace=True),

            nn.Linear(500, 5))

    def forward_once(self, x):
        output = self.cnn1(x)
        output = output.view(output.size()[0], -1)  # flatten to [batch, 8*300*300]
        output = self.fc1(output)
        return output

    def forward(self, input1, input2):
        # both inputs go through the same branch (shared weights)
        output1 = self.forward_once(input1)
        output2 = self.forward_once(input2)
        return output1, output2

The error I get when the images are NOT converted to black and white and remain in color:

RuntimeError: invalid argument 0: Sizes of tensors must match  
except in dimension 0. Got 3 and 1 in dimension 1 at  
/opt/conda/conda-bld/pytorch-nightly_1542963753679/work/aten/src/TH/generic/THTensorMoreMath.cpp:1319

I checked the shapes of the image tensors (right before they go into the model), black and white vs. in color:

B&W

torch.Size([1, 1, 300, 300])

In color

torch.Size([1, 3, 300, 300])
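From what I can tell, the in_channels of the first Conv2d has to match dimension 1 of the input tensor. A minimal standalone sketch of that requirement (separate from the notebook code):

import torch
import torch.nn as nn

color_batch = torch.randn(1, 3, 300, 300)  # [batch, channels, height, width]

conv_rgb = nn.Conv2d(3, 4, kernel_size=3)  # in_channels=3 matches dim 1
print(conv_rgb(color_batch).shape)         # torch.Size([1, 4, 298, 298])

conv_bw = nn.Conv2d(1, 4, kernel_size=3)   # in_channels=1 would not match
# conv_bw(color_batch)                     # raises a RuntimeError (channel mismatch)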

Here is a link to a Jupyter Notebook of the entire original code I am working with: https://github.com/harveyslash/Facial-Similarity-with-Siamese-Networks-in-Pytorch/blob/master/Siamese-networks-medium.ipynb

EDIT/UPDATE: I seem to have solved it by converting the images to RGB in the SiameseNetworkDataset part of the code:

img0 = img0.convert("L")

changed to

img0 = img0.convert("RGB")

Previously I had just commented that line out, thinking this left the images in RGB, but it must have produced something else the model didn't understand. The change from the original post was also needed:

nn.Conv2d(1, 4, kernel_size=3),

changed to

nn.Conv2d(3, 4, kernel_size=3),
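Putting both changes together, the loading path now goes roughly like this (a simplified sketch, not the notebook verbatim; the file name and the 300x300 size are placeholders matching my tensors):

from PIL import Image
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.Resize((300, 300)),  # placeholder size matching my shapes above
    transforms.ToTensor(),          # PIL image -> [C, H, W] float tensor
])

img0 = Image.open("example.png").convert("RGB")  # was .convert("L")
tensor0 = transform(img0).unsqueeze(0)           # shape: [1, 3, 300, 300]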

If you'd like to answer with an explanation of what the model is doing that makes this clear, I'll give you the green check. I don't really understand nn.Conv2d.

Upvotes: 3

Views: 3321

Answers (1)

Umang Gupta

Reputation: 16480

The error seems to be in the fully connected part below:

self.fc1 = nn.Sequential(
    nn.Linear(8*100*100, 500),
    nn.ReLU(inplace=True),

    nn.Linear(500, 500),
    nn.ReLU(inplace=True),

    nn.Linear(500, 5))

It seems the output of the CNN is of shape [8, 300, 300] and not [8, 100, 100].

To solve this, either resize the input image to [n_channels, 100, 100] or change the input size of the fc layer to 8*300*300.
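You can sanity-check the flattened size with a dummy forward pass through the conv part (a quick sketch assuming the SiameseNetwork class from the question, with 300x300 color input):

import torch

net = SiameseNetwork()                  # the class as defined in the question
dummy = torch.randn(1, 3, 300, 300)     # fake batch: one color image, 300x300
out = net.cnn1(dummy)
print(out.shape)                        # torch.Size([1, 8, 300, 300])
print(out.view(out.size(0), -1).shape)  # torch.Size([1, 720000]); 8*300*300 = 720000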

Upvotes: 1
