Reputation: 688
1D convolution is pretty simple when it is done by hand. However, I want to implement what is done here using nn.Conv1d, and I am finding that difficult. In this example h=[1,2,-1], x=[4,1,2,5], and the output should be y=[4,9,0,8,8,-5]. To do it in PyTorch we need to define h=nn.Conv1d(in, out, k) and x=torch.tensor(*), and y=h(x) should be the result.
Note: please do not use nn.Conv2d to implement it.
Upvotes: 4
Views: 6740
Reputation: 22174
First, you should be aware that the term "convolution" as used in basically all literature on convolutional neural networks (CNNs) actually refers to the (cross-)correlation operation, not the convolution operation.
The only difference (for real-valued inputs) between correlation and convolution is that in convolution the kernel is flipped/mirrored before sliding it across the signal, whereas in correlation no such flipping occurs.
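To make the flipping concrete, here is a minimal pure-Python sketch of both operations applied to the numbers from the question. The helper names correlate_full and convolve_full are made up for this illustration and are not part of PyTorch or any other library.

```python
def correlate_full(x, k):
    """Slide k across zero-padded x WITHOUT flipping it (what CNN layers compute)."""
    pad = len(k) - 1
    xp = [0] * pad + list(x) + [0] * pad  # zero-pad both ends
    return [
        sum(xp[i + j] * k[j] for j in range(len(k)))
        for i in range(len(x) + pad)
    ]


def convolve_full(x, k):
    """True convolution: flip the kernel first, then correlate."""
    return correlate_full(x, k[::-1])


x = [4, 1, 2, 5]
h = [1, 2, -1]

conv_result = convolve_full(x, h)
corr_result = correlate_full(x, h)
print(conv_result)  # [4, 9, 0, 8, 8, -5]
print(corr_result)  # [-4, 7, 4, 0, 12, 5]
```

Only the flipped-kernel version reproduces the hand-computed y from the question, which is why the kernel has to be flipped before it is handed to a convolution layer.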
There are also some extra operations that convolution layers in CNNs perform that are not part of the definition of convolution. They apply an offset (a.k.a. bias), they operate on mini-batches, and they map multi-channel inputs to multi-channel outputs.
Therefore, in order to recreate a convolution operation using a convolution layer we should (i) disable bias, (ii) flip the kernel, and (iii) set batch-size, input channels, and output channels to one.
For example, a PyTorch implementation of the convolution operation using nn.Conv1d looks like this:
import torch
from torch import nn
x = torch.tensor([4, 1, 2, 5], dtype=torch.float)
k = torch.tensor([1, 2, -1], dtype=torch.float)
# Define these constants to differentiate the various usages of "1".
BATCH_SIZE, IN_CH, OUT_CH = 1, 1, 1
# Pad with len(k)-1 zeros to ensure all non-zero outputs are computed.
h = nn.Conv1d(IN_CH, OUT_CH, kernel_size=len(k), padding=len(k) - 1, bias=False)
# Copy flipped k into h.weight.
# h.weight is shape (OUT_CH, IN_CH, kernel_size), reshape k accordingly.
# Perform copy inside no_grad context to avoid autograd issues.
with torch.no_grad():
    h.weight.copy_(torch.flip(k, dims=[0]).reshape(OUT_CH, IN_CH, -1))
# Input shape to h is assumed to be (BATCH_SIZE, IN_CH, SIGNAL_LENGTH), reshape x accordingly.
# Output shape of h is (BATCH_SIZE, OUT_CH, OUTPUT_LENGTH), reshape output to 1D signal.
y = h(x.reshape(BATCH_SIZE, IN_CH, -1)).reshape(-1)
which results in
>>> print(y)
tensor([ 4., 9., 0., 8., 8., -5.], grad_fn=<ViewBackward>)
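If a layer object is not strictly required, the same computation can probably be written more compactly with the functional form torch.nn.functional.conv1d, which takes the weight tensor directly. This is a sketch of an alternative, not what the question asked for (it does not create an nn.Conv1d module); it assumes the same x and k as above.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([4, 1, 2, 5], dtype=torch.float)
k = torch.tensor([1, 2, -1], dtype=torch.float)

# F.conv1d expects input of shape (batch, in_channels, length)
# and weight of shape (out_channels, in_channels, kernel_size).
inp = x.reshape(1, 1, -1)
weight = torch.flip(k, dims=[0]).reshape(1, 1, -1)  # flip: correlation -> convolution

# No bias is passed, so none is applied; padding=len(k)-1 yields the "full" output.
y = F.conv1d(inp, weight, padding=len(k) - 1).reshape(-1)
print(y)
```

Since neither tensor requires gradients here, the printed result carries no grad_fn. With stride 1, the output length is len(x) + 2 * padding - len(k) + 1 = 4 + 4 - 3 + 1 = 6, which matches the six values of the hand-computed y.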
Upvotes: 6