Reputation: 1182
I know this is because the shapes don't match for the multiplication, but why does it happen when my code is similar to most of the example code I found:
import torch.nn as nn
...
# input is a 256x256 image with 3 channels
num_input_channels = 3
self.encoder = nn.Sequential(
    nn.Conv2d(num_input_channels*2**0, num_input_channels*2**1, kernel_size=3, padding=1, stride=2),  # -> [1, 6, 128, 128]
    nn.Tanh(),
    nn.Conv2d(num_input_channels*2**1, num_input_channels*2**2, kernel_size=3, padding=1, stride=2),  # -> [1, 12, 64, 64]
    nn.Tanh(),
    nn.Conv2d(num_input_channels*2**2, num_input_channels*2**3, kernel_size=3, padding=1, stride=2),  # -> [1, 24, 32, 32]
    nn.Tanh(),
    nn.Conv2d(num_input_channels*2**3, num_input_channels*2**4, kernel_size=3, padding=1, stride=2),  # -> [1, 48, 16, 16]
    nn.Tanh(),
    nn.Conv2d(num_input_channels*2**4, num_input_channels*2**5, kernel_size=3, padding=1, stride=2),  # -> [1, 96, 8, 8]
    nn.Tanh(),
    nn.Conv2d(num_input_channels*2**5, num_input_channels*2**6, kernel_size=3, padding=1, stride=2),  # -> [1, 192, 4, 4]
    nn.LeakyReLU(),
    nn.Conv2d(num_input_channels*2**6, num_input_channels*2**7, kernel_size=3, padding=1, stride=2),  # -> [1, 384, 2, 2]
    nn.LeakyReLU(),
    nn.Conv2d(num_input_channels*2**7, num_input_channels*2**8, kernel_size=2, padding=0, stride=1),  # -> [1, 768, 1, 1]
    nn.LeakyReLU(),
    nn.Flatten(),
    nn.Linear(768, 1024*32),
    nn.ReLU(),
    nn.Linear(1024*32, 256),
    nn.ReLU(),
).cuda()
I get the error: "RuntimeError: mat1 and mat2 shapes cannot be multiplied (768x1 and 768x32768)".
To my understanding, I should end up with a tensor of shape [1, 768, 1, 1] after the convolutions and [1, 768] after flattening, so I can use a fully connected linear layer that expands to 1024*32 features (my attempt to give the network some more capacity to store data/knowledge).
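For reference, here is a compact sanity check I can run for just the convolutional part of the encoder above (the variable names are mine, and the two Linear layers are left off); with a batched input, printing the shape after every layer does end at [1, 768]:

import torch
import torch.nn as nn

c = 3  # num_input_channels
layers = []
for i in range(7):  # seven stride-2 convs: 256 -> 128 -> ... -> 2
    layers += [nn.Conv2d(c*2**i, c*2**(i+1), kernel_size=3, padding=1, stride=2),
               nn.Tanh() if i < 5 else nn.LeakyReLU()]
layers += [nn.Conv2d(c*2**7, c*2**8, kernel_size=2, padding=0, stride=1),
           nn.LeakyReLU(),
           nn.Flatten()]
encoder = nn.Sequential(*layers)

x = torch.randn(1, 3, 256, 256)  # note the leading batch dimension
for layer in encoder:
    x = layer(x)
    print(type(layer).__name__, tuple(x.shape))
# last line printed: Flatten (1, 768)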
Using nn.Linear(1, 1024*32) instead makes it run, but with a warning later: "UserWarning: Using a target size (torch.Size([3, 256, 256])) that is different to the input size (torch.Size([768, 3, 256, 256]))". I think that warning comes from my decoder, though.
What am I not understanding correctly here?
Upvotes: 0
Views: 239
Reputation: 11628
All torch.nn modules require batched inputs, and it seems that in your case you have no batch dimension. Without knowing your code, I'm assuming you are using
my_input.shape == (3, 256, 256)
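In fact, that unbatched shape explains the exact numbers in the error: the convolutions happen to run on a 3D unbatched input (recent PyTorch versions allow this), producing [768, 1, 1], but nn.Flatten flattens from start_dim=1 by default and leaves dim 0 alone, so you get [768, 1] instead of [1, 768], which cannot be matrix-multiplied by nn.Linear(768, 1024*32). A minimal sketch of just that step:

import torch
import torch.nn as nn

x = torch.randn(768, 1, 1)   # unbatched output of the last conv layer
flat = nn.Flatten()          # default start_dim=1 leaves dim 0 untouched
print(tuple(flat(x).shape))  # (768, 1) -- not (1, 768)

fc = nn.Linear(768, 1024 * 32)
try:
    fc(flat(x))
except RuntimeError as e:
    print(e)  # mat1 and mat2 shapes cannot be multiplied (768x1 and 768x32768)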
But you will need to add a batch dimension; that is, you need to have
my_input.shape == (1, 3, 256, 256)
You can easily do that by introducing a dummy dimension using:
my_input = my_input[None, ...]
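For example, with a hypothetical unbatched image tensor:

import torch

my_input = torch.randn(3, 256, 256)  # unbatched image
my_input = my_input[None, ...]       # adds a leading batch dimension
print(my_input.shape)                # torch.Size([1, 3, 256, 256])
# my_input.unsqueeze(0) is equivalent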
Upvotes: 1