flybrain

Reputation: 119

How to use AlexNet with one channel

I am new to PyTorch and have a problem with channels in AlexNet. I am using it for a 'GTA San Andreas self-driving car' project. I collected a dataset of black-and-white images that have a single channel, and I am trying to train AlexNet with this script:

from AlexNetPytorch import*
import torchvision
import torchvision.transforms as transforms
import torch.optim as optim
import torch.utils.data
import numpy as np
import torch
from IPython.core.debugger import set_trace

AlexNet = AlexNet()

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(AlexNet.parameters(), lr=0.001, momentum=0.9)

all_data = np.load('training_data.npy')
inputs= all_data[:,0]
labels= all_data[:,1]
inputs_tensors = torch.stack([torch.Tensor(i) for i in inputs])
labels_tensors = torch.stack([torch.Tensor(i) for i in labels])

data_set = torch.utils.data.TensorDataset(inputs_tensors,labels_tensors)
data_loader = torch.utils.data.DataLoader(data_set, batch_size=3,shuffle=True, num_workers=2)




if __name__ == '__main__':
    for epoch in range(8):
        running_loss = 0.0
        for i, data in enumerate(data_loader, 0):
            inputs = data[0]
            inputs = torch.FloatTensor(inputs)
            labels = data[1]
            labels = torch.FloatTensor(labels)
            optimizer.zero_grad()
            # set_trace()
            inputs = torch.unsqueeze(inputs, 1)  # insert the channel dimension at index 1
            outputs = AlexNet(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            running_loss += loss.item()
            if i % 2000 == 1999:    # print every 2000 mini-batches
                print('[%d, %5d] loss: %.3f' %
                      (epoch + 1, i + 1, running_loss / 2000))
                running_loss = 0.0
    print('finished')

I am using AlexNet from the link: https://github.com/pytorch/vision/blob/master/torchvision/models/alexnet.py

But I changed line 18 from:

nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2)

to:

nn.Conv2d(1, 64, kernel_size=11, stride=4, padding=2)

because my training images have only one channel. Now I get this error:

 File "training_script.py", line 44, in <module>
    outputs = AlexNet(inputs)
  File "C:\Users\Mukhtar\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Mukhtar\Documents\AI_projects\gta\AlexNetPytorch.py", line 34, in forward
    x = self.features(x)
  File "C:\Users\Mukhtar\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Mukhtar\Anaconda3\lib\site-packages\torch\nn\modules\container.py", line 91, in forward
    input = module(input)
  File "C:\Users\Mukhtar\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Mukhtar\Anaconda3\lib\site-packages\torch\nn\modules\pooling.py", line 142, in forward
    self.return_indices)
  File "C:\Users\Mukhtar\Anaconda3\lib\site-packages\torch\nn\functional.py", line 396, in max_pool2d
    ret = torch._C._nn.max_pool2d_with_indices(input, kernel_size, stride, padding, dilation, ceil_mode)
RuntimeError: Given input size: (256x1x1). Calculated output size: (256x0x0). Output size is too small at c:\programdata\miniconda3\conda-bld\pytorch-cpu_1532499824793\work\aten\src\thnn\generic/SpatialDilatedMaxPooling.c:67

I don't know what is wrong. Is it wrong to change the number of input channels like this? If it is, can you please point me to a neural network that works with one channel? As I said, I am a newbie in PyTorch and I don't want to write the network myself.

Upvotes: 1

Views: 2462

Answers (2)

flybrain

Reputation: 119

The problem was the size of my input: I gave it 32x32 images when I should have given it 224x224 (I am new to AlexNet, so I didn't know it expects that size). I resized my images to 224x224 and now the CNN is training.
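
A minimal sketch of one way to bring a batch of small single-channel images up to 224x224, assuming they are already in a float tensor of shape (N, 1, H, W). The `batch` tensor here is a hypothetical stand-in for the real data, and bilinear interpolation is just one possible choice; this is not necessarily how the resize was done in this project.

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-in for one mini-batch of 32x32 grayscale frames: (N, C, H, W)
batch = torch.rand(3, 1, 32, 32)

# Upsample the spatial dimensions to 224x224; the batch and channel dimensions are untouched.
resized = F.interpolate(batch, size=(224, 224), mode="bilinear", align_corners=False)

print(resized.shape)
```

Note that upsampling adds no new detail to the images; it only makes the tensor large enough to survive the network's strided convolutions and pooling layers.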

Upvotes: 0

Shai

Reputation: 114786

Your error is not related to using grayscale images instead of RGB. It is about the spatial dimensions of the input: while "forwarding" an input image through the net, its size (in feature space) shrank to zero, and that is the error you see. You can use this nice guide to see what happens to the output size of each layer (conv/pooling) as a function of kernel size, stride and padding.
AlexNet expects its input images to be 224 by 224 pixels, so make sure your inputs are of that size.
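
To make the size arithmetic concrete, here is a rough sketch that traces the spatial size through the conv/pooling stack. The layer list is an assumption: it mirrors the kernel size, stride and padding values in torchvision's AlexNet `features` block, and the helper names `out_size` and `trace_sizes` are made up for this illustration. It uses the standard formula floor((size + 2*padding - kernel) / stride) + 1.

```python
# Assumed layer hyperparameters (name, kernel_size, stride, padding),
# mirroring the `features` stack of torchvision's AlexNet.
ALEXNET_FEATURES = [
    ("conv1", 11, 4, 2),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool3", 3, 2, 0),
]


def out_size(size, kernel, stride, padding):
    """Spatial output size of a conv/pool layer (no dilation, floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_sizes(input_size):
    """Return (layer name, spatial size after that layer) for a square input."""
    sizes = []
    size = input_size
    for name, kernel, stride, padding in ALEXNET_FEATURES:
        size = out_size(size, kernel, stride, padding)
        sizes.append((name, size))
    return sizes


for side in (32, 224):
    print(side, trace_sizes(side))
```

With a 32x32 input the size is already 1x1 before the last pooling layer, which then yields 0x0; that matches the "Given input size: (256x1x1). Calculated output size: (256x0x0)" message. With 224x224 the stack ends at 6x6.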

Other things you overlooked:

  • You are using the AlexNet architecture, but you are initializing it with random weights instead of using pretrained weights (trained on ImageNet). To get a trained copy of AlexNet you'll need to instantiate the net like this

    AlexNet = alexnet(pretrained=True)
    
  • Once you decide to use a pretrained net, you cannot change its first layer from three input channels to one (the trained weights simply won't fit). The easiest fix is to make your input images "colorful" by simply repeating the single channel three times. See repeat() for more info.
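
A minimal sketch of the channel-repeating trick, assuming the grayscale batch is a tensor of shape (N, 1, H, W). The `gray` tensor is a hypothetical stand-in for real data.

```python
import torch

# Hypothetical stand-in for a batch of single-channel 224x224 images: (N, 1, H, W)
gray = torch.rand(3, 1, 224, 224)

# Repeat the channel dimension three times; the other dimensions are repeated once (unchanged).
rgb_like = gray.repeat(1, 3, 1, 1)

print(rgb_like.shape)
```

All three channels then hold identical copies of the grayscale image, so the tensor matches the 3-channel input that the unmodified first conv layer expects.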

Upvotes: 3
