Reputation: 2205
I want my pytorch CNN to take as input a sequence of length SEQ_LEN
of 32x32 RGB images concatenated along channels dimension. Therefore, a single input of the network has shape (32, 32, 3, SEQ_LEN)
. How should I define my CNN input layer?
The common way
SEQ_LEN = 10
input_conv = nn.Conv2d(in_channels=SEQ_LEN, out_channels=32, kernel_size=3)
BATCH_SIZE = 64
frames = np.random.randint(0, 255, size=(BATCH_SIZE, SEQ_LEN, 3, 32, 32))
frames_tensor = torch.tensor(frames)
input_conv(frames_tensor)
gives the error
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 10, 3, 3], but got 5-dimensional input of size [64, 10, 3, 32, 32] instead
Upvotes: 2
Views: 379