Reputation: 3427
I have a dataset of color images as an ndarray of shape (100, 20, 20, 3), with 100 corresponding labels. When passing them as input to a fully connected neural network (not a CNN), what should I do with the 3 RGB values? Averaging them would perhaps lose some information, but if I don't manipulate them, my main issue is the batch size, as demonstrated below in PyTorch.
for epoch in range(n_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # because of rgb values, now images is 3 times the length of labels
        images = Variable(images.view(-1, 400))
        labels = Variable(labels)
        optimizer.zero_grad()
        outputs = net(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
This raises 'ValueError: Expected input batch_size (300) to match target batch_size (100).' Should I have reshaped the images into (1, 1200)-dimensional tensors? Thanks in advance for any answers.
Upvotes: 2
Views: 1259
Reputation: 4475
Since the size of labels is (100,), your batch data should have shape (100, H, W, C). I'm assuming your data loader returns a tensor of shape (100, 20, 20, 3). The error happens because you reshape the tensor to (300, 400): the batch holds 100 × 20 × 20 × 3 = 120,000 values, and splitting them into rows of 400 yields 300 rows, which no longer matches the 100 labels.
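For comparison, a minimal sketch of a reshape that keeps the batch dimension at 100 (assuming you want all 20 × 20 × 3 = 1200 values per image in one input vector):

import torch

# Stand-in for one batch from the loader: 100 RGB images, 20x20, channel-last.
images = torch.randn(100, 20, 20, 3)

# Flatten each image into one 1200-dimensional row; -1 lets PyTorch
# infer the batch dimension, which stays at 100.
flat = images.view(-1, 20 * 20 * 3)
print(flat.shape)  # torch.Size([100, 1200])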
Check your network architecture to see whether it expects input tensors of shape (20, 20, 3).
If your network can only accept single-channel images, you can first convert your RGB images to grayscale.
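As a sketch of that conversion (assuming channel-last data and the common BT.601 luminance weights, neither of which is given in the question):

import torch

# Stand-in batch in NHWC layout; real data would come from the loader.
images = torch.randn(100, 20, 20, 3)

# Assumed ITU-R BT.601 luminance weights for the R, G, B channels.
weights = torch.tensor([0.299, 0.587, 0.114])

gray = (images * weights).sum(dim=-1)  # (100, 20, 20)
flat = gray.view(-1, 20 * 20)          # (100, 400): now view(-1, 400) lines up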
Or, modify your network architecture to accept 3-channel images. One convenient way is to add an extra layer that reduces the 3 channels to 1, so you do not need to change the other parts of the network.
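In a fully connected setting, one way to read that suggestion is a single linear layer mapping the 1200 RGB inputs down to the 400 features the rest of the network already expects. A minimal sketch, with a hypothetical net standing in for your existing architecture:

import torch
import torch.nn as nn

# Hypothetical stand-in for the existing network, which expects 400 inputs.
net = nn.Sequential(nn.Linear(400, 128), nn.ReLU(), nn.Linear(128, 10))

# Prepend one layer mapping the 1200 RGB features down to 400, so the
# rest of the network is untouched; feed it images.view(-1, 1200).
net_rgb = nn.Sequential(nn.Linear(1200, 400), nn.ReLU(), net)

x = torch.randn(100, 1200)
print(net_rgb(x).shape)  # torch.Size([100, 10])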
Upvotes: 4