Reputation: 53916
To render an image of shape 27x35 I use:
import random
import numpy as np
import matplotlib.pyplot

random_image = []
for x in range(1, 946):
    random_image.append(random.randint(0, 255))
random_image_arr = np.array(random_image)
matplotlib.pyplot.imshow(random_image_arr.reshape(27, 35))
This generates a random noise image.
I then try to apply a convolution to the image using torch.nn.Conv2d:
import torch

conv2 = torch.nn.Conv2d(3, 18, kernel_size=3, stride=1, padding=1)
image_d = np.asarray(random_image_arr.reshape(27, 35))
conv2(torch.from_numpy(image_d))
But this raises an error:
~/.local/lib/python3.6/site-packages/torch/nn/modules/conv.py in forward(self, input)
299 def forward(self, input):
300 return F.conv2d(input, self.weight, self.bias, self.stride,
--> 301 self.padding, self.dilation, self.groups)
302
303
RuntimeError: input has less dimensions than expected
The shape of the input image_d is (27, 35).
Should I change the parameters of Conv2d in order to apply the convolution to the image?
Update. From @McLawrence's answer I have:
random_image = []
for x in range(1, 946):
    random_image.append(random.randint(0, 255))
random_image_arr = np.array(random_image)
matplotlib.pyplot.imshow(random_image_arr.reshape(27, 35))
This renders the same random noise image. Applying the convolution operation:
conv2 = torch.nn.Conv2d(1, 18, kernel_size=3, stride=1, padding=1)
image_d = torch.FloatTensor(np.asarray(random_image_arr.reshape(1, 1, 27, 35))).numpy()
fc = conv2(torch.from_numpy(image_d))
matplotlib.pyplot.imshow(fc[0][0].data.numpy())
which renders the convolved image.
Upvotes: 4
Views: 6857
Reputation: 1539
I implemented a simplified version using scipy for learning purposes:
import numpy as np
from scipy import signal

def conv2d_simplified(input, weight, bias=None, padding=0):
    # This is an implementation of torch's conv2d using scipy's correlate2d. Only
    # limited options are supported for simplicity.
    # Inspired by https://github.com/99991/NumPyConv2D/
    c_out, c_in_by_groups, kh, kw = weight.shape
    if not isinstance(padding, int):
        raise NotImplementedError()
    if padding:
        input = np.pad(input, ((0, 0), (0, 0), (padding, padding), (padding, padding)), "constant")
    outArr = np.empty((input.shape[0], c_out, input.shape[2]+1-kh, input.shape[3]+1-kw))
    al = np.empty((outArr.shape[2], outArr.shape[3]))
    for k in range(input.shape[0]):
        for i in range(weight.shape[0]):
            al[:, :] = 0.0
            for j in range(weight.shape[1]):
                al += signal.correlate2d(input[k, j, :, :], weight[i, j, :, :], 'valid')
            outArr[k, i, :, :] = al
    if bias is not None:
        outArr = outArr + bias.reshape(1, c_out, 1, 1)
    return outArr
To apply a single kernel to a single greyscale image, we want i=0, j=0, k=0. This means the input size is (1, 1, ?, ?) and the kernel size is (1, 1, 3, 3). The function takes 4D inputs because they are used in deep learning networks.
Note that to get a true mathematical convolution you would also need to flip the kernel, since conv2d (like this function) uses the cross-correlation operator.
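As a quick sanity check, here is a hypothetical usage sketch (assuming random float64 inputs and that torch is installed) that compares the output against torch.nn.functional.conv2d:

import numpy as np
import torch
import torch.nn.functional as F

# Random 4D input (batch=1, channels=1) and 18 random 3x3 kernels with bias.
x = np.random.rand(1, 1, 27, 35)
w = np.random.rand(18, 1, 3, 3)
b = np.random.rand(18)

out_np = conv2d_simplified(x, w, bias=b, padding=1)
out_torch = F.conv2d(torch.from_numpy(x), torch.from_numpy(w),
                     torch.from_numpy(b), padding=1).numpy()
print(np.allclose(out_np, out_torch))  # expected: True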
Upvotes: 0
Reputation: 5255
There are two problems with your code:
First, 2D convolutions in pytorch are defined only for 4D tensors. This is convenient for use in neural networks: the first dimension is the batch size, while the second dimension is the channels (an RGB image, for example, has three channels). So you have to reshape your tensor like
image_d = torch.FloatTensor(np.asarray(random_image_arr.reshape(1, 1, 27 , 35)))
The FloatTensor is important here, since convolutions are not defined on the LongTensor, which will be created automatically if your numpy array only includes ints.
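A minimal sketch of this dtype behaviour (the array values here are just for illustration):

import numpy as np
import torch

arr = np.array([[1, 2], [3, 4]], dtype=np.int64)
t = torch.from_numpy(arr)
print(t.dtype)          # torch.int64, i.e. a LongTensor
print(t.float().dtype)  # torch.float32, which Conv2d accepts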
Secondly, you have created a convolution with three input channels, while your image has just one channel (it is greyscale). So you have to adjust the convolution to:
conv2 = torch.nn.Conv2d(1, 18, kernel_size=3, stride=1, padding=1)
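Putting both fixes together, a minimal end-to-end sketch (the random image is regenerated here so the snippet is self-contained):

import numpy as np
import torch

# 945 random pixel values reshaped to (batch=1, channels=1, height=27, width=35).
random_image_arr = np.random.randint(0, 256, size=945)
image_d = torch.FloatTensor(random_image_arr.reshape(1, 1, 27, 35))

conv2 = torch.nn.Conv2d(1, 18, kernel_size=3, stride=1, padding=1)
fc = conv2(image_d)
print(fc.shape)  # torch.Size([1, 18, 27, 35])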
Upvotes: 5