Reputation: 1722
I am building a CNN to do image classification on the EMNIST dataset.
To do so, I have the following datasets:
import scipy .io
emnist = scipy.io.loadmat(DIRECTORY + '/emnist-letters.mat')
data = emnist ['dataset']
X_train = data ['train'][0, 0]['images'][0, 0]
X_train = X_train.reshape((-1,28,28), order='F')
y_train = data ['train'][0, 0]['labels'][0, 0]
X_test = data ['test'][0, 0]['images'][0, 0]
X_test = X_test.reshape((-1,28,28), order = 'F')
y_test = data ['test'][0, 0]['labels'][0, 0]
With shape:
Note that the pictures are grayscale, so the colors are represented with only one number.
Which I prepare further as follows:
train_dataset = torch.utils.data.TensorDataset(torch.from_numpy(X_train), torch.from_numpy(y_train))
test_dataset = torch.utils.data.TensorDataset(torch.from_numpy(X_test), torch.from_numpy(y_test))
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
batch_size=batch_size,
shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
batch_size=batch_size,
shuffle=False)
My model looks as follows:
class CNNModel(nn.Module):
def __init__(self):
super(CNNModel, self).__init__()
self.cnn_layers = Sequential(
# Defining a 2D convolution layer
Conv2d(1, 4, kernel_size=3, stride=1, padding=1),
BatchNorm2d(4),
ReLU(inplace=True),
MaxPool2d(kernel_size=2, stride=2),
# Defining another 2D convolution layer
Conv2d(4, 4, kernel_size=3, stride=1, padding=1),
BatchNorm2d(4),
ReLU(inplace=True),
MaxPool2d(kernel_size=2, stride=2),
)
self.linear_layers = Sequential(
Linear(4 * 7 * 7, 10)
)
# Defining the forward pass
def forward(self, x):
x = self.cnn_layers(x)
x = x.view(x.size(0), -1)
x = self.linear_layers(x)
return x
model = CNNModel()
The code below is part of the code that I use to train my model:
for epoch in range(num_epochs):
for i, (images, labels) in enumerate(train_loader):
images = Variable(images)
labels = Variable(labels)
# Forward pass to get output/logits
outputs = model(images)
However, by excuting my code, I get the following error:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [4, 1, 3, 3], but got 3-dimensional input of size [100, 28, 28] instead
So a 4D input is expected while my input is 3D. What can I do so a 3D model is expected rather than a 4D model?
Here a similar question is asked, however I dont see how I can translate this to my code
Upvotes: 1
Views: 1255
Reputation: 32992
The convolution expects the input to have size [batch_size, channels, height, width], but your images have size [batch_size, height, width], the channel dimension is missing. Greyscale is represented with a single channel and you have correctly set the in_channels
of the first convolutions to 1, but your images don't have the matching dimension.
You can easily add the singular dimension with torch.unsqueeze
.
Also, please don't use Variable
, it was deprecated with PyTorch 0.4.0, which was released over 2 years ago, and all of its functionality has been merged into the tensors.
for i, (images, labels) in enumerate(train_loader):
# Add a single channel dimension
# From: [batch_size, height, width]
# To: [batch_size, 1, height, width]
images = images.unsqueeze(1)
# Forward pass to get output/logits
outputs = model(images)
Upvotes: 3