Reputation: 3427
I have a dataset of color images as an ndarray of shape (100, 20, 20, 3), with 100 corresponding labels. When passing them as input to a fully connected neural network (not a CNN), what should I do with the 3 RGB values? Averaging them would perhaps lose some information, but if I don't manipulate them, my main issue is the batch size, as demonstrated below in PyTorch.
for epoch in range(n_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # because of rgb values, now images is 3 times the length of labels
        images = Variable(images.view(-1, 400))
        labels = Variable(labels)
        optimizer.zero_grad()
        outputs = net(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
This raises 'ValueError: Expected input batch_size (300) to match target batch_size (100).' Should I have reshaped the images into (1, 1200)-dimensional tensors? Thanks in advance for any answers.
Upvotes: 2
Views: 1259
Reputation: 4475
Since the size of labels is (100,), your batch data should have shape (100, H, W, C). I'm assuming your data loader returns a tensor of shape (100, 20, 20, 3). The error happens because you reshape the tensor to (300, 400): the batch holds 100 × 20 × 20 × 3 = 120,000 values, and splitting them into rows of 400 yields 300 rows, which no longer matches the 100 labels.
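For comparison, a minimal sketch of a reshape that keeps the batch dimension at 100 (assuming you want all 20 × 20 × 3 = 1200 values per image in one input vector):

import torch

# Stand-in for one batch from the loader: 100 RGB images, 20x20, channel-last.
images = torch.randn(100, 20, 20, 3)

# Flatten each image into one 1200-dimensional row; -1 lets PyTorch
# infer the batch dimension, which stays at 100.
flat = images.view(-1, 20 * 20 * 3)
print(flat.shape)  # torch.Size([100, 1200])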
Check your network architecture to see whether it expects input tensors of shape (20, 20, 3).
If your network can only accept single-channel images, you can first convert your RGB images to grayscale.
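As a sketch of that conversion (assuming channel-last data and the common BT.601 luminance weights, neither of which is given in the question):

import torch

# Stand-in batch in NHWC layout; real data would come from the loader.
images = torch.randn(100, 20, 20, 3)

# Assumed ITU-R BT.601 luminance weights for the R, G, B channels.
weights = torch.tensor([0.299, 0.587, 0.114])

gray = (images * weights).sum(dim=-1)  # (100, 20, 20)
flat = gray.view(-1, 20 * 20)          # (100, 400): now view(-1, 400) lines up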
Or, modify your network architecture to accept 3-channel images. One convenient way is to add an extra layer that reduces the 3 channels to 1, so you do not need to change the other parts of the network.
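In a fully connected setting, one way to read that suggestion is a single linear layer mapping the 1200 RGB inputs down to the 400 features the rest of the network already expects. A minimal sketch, with a hypothetical net standing in for your existing architecture:

import torch
import torch.nn as nn

# Hypothetical stand-in for the existing network, which expects 400 inputs.
net = nn.Sequential(nn.Linear(400, 128), nn.ReLU(), nn.Linear(128, 10))

# Prepend one layer mapping the 1200 RGB features down to 400, so the
# rest of the network is untouched; feed it images.view(-1, 1200).
net_rgb = nn.Sequential(nn.Linear(1200, 400), nn.ReLU(), net)

x = torch.randn(100, 1200)
print(net_rgb(x).shape)  # torch.Size([100, 10])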
Upvotes: 4