stranger

Reputation: 3

RuntimeError: Given input size: (64x1x1). Calculated output size: (64x0x0). Output size is too small

My model is:

def forward(self, x):
    x = self.first_bn(x)
    x = self.selu(x)

    x0 = self.block0(x)
    y0 = self.avgpool(x0).view(x0.size(0), -1)
    y0 = self.fc_attention0(y0)
    y0 = self.sig(y0).view(y0.size(0), y0.size(1), -1)
    y0 = y0.unsqueeze(-1)
    x = x0 * y0 + y0

    x = nn.MaxPool2d(2)(x)

    x2 = self.block2(x)
    y2 = self.avgpool(x2).view(x2.size(0), -1)
    y2 = self.fc_attention2(y2)
    y2 = self.sig(y2).view(y2.size(0), y2.size(1), -1)
    y2 = y2.unsqueeze(-1)
    x = x2 * y2 + y2

    x = nn.MaxPool2d(2)(x)

    x4 = self.block4(x)
    y4 = self.avgpool(x4).view(x4.size(0), -1)
    y4 = self.fc_attention4(y4)
    y4 = self.sig(y4).view(y4.size(0), y4.size(1), -1)
    y4 = y4.unsqueeze(-1)
    x = x4 * y4 + y4

    x = nn.MaxPool2d(2)(x)

    x = self.bn_before_gru(x)
    x = self.selu(x)
    x = x.squeeze(-2)
    x = x.permute(0, 2, 1)
    self.gru.flatten_parameters()
    x, _ = self.gru(x)
    x = x[:, -1, :]
    x = self.fc1_gru(x)
    x = self.fc2_gru(x)

    return x

def _make_attention_fc(self, in_features, l_out_features):
    l_fc = []
    l_fc.append(nn.Linear(in_features=in_features, out_features=l_out_features))
    return nn.Sequential(*l_fc)

How can I solve this error: RuntimeError: Given input size: (64x1x1). Calculated output size: (64x0x0). Output size is too small?

Upvotes: 0

Views: 159

Answers (1)

Chih-Hao Liu

Reputation: 466

The primary issue lies in your input size.

If you examine the SpecRNet architecture, you'll notice that it includes some MaxPool2d modules.

Let's consider an example where we input a tensor with the size (8, 1, 64, 64).

Here are the output shapes at each layer of SpecRNet:

INPUT:  torch.Size([8, 1, 64, 64])
first_bn(x):  torch.Size([8, 1, 64, 64])
selu(x):  torch.Size([8, 1, 64, 64])
block0(x):  torch.Size([8, 20, 32, 32]) ######
avgpool(x0).view(x0.size(0), -1):  torch.Size([8, 20])
fc_attention0(y0):  torch.Size([8, 20])
sig(y0).view(y0.size(0), y0.size(1), -1):  torch.Size([8, 20, 1])
unsqueeze(-1):  torch.Size([8, 20, 1, 1])
x0 * y0 + y0:  torch.Size([8, 20, 32, 32])
MaxPool2d(2)(x):  torch.Size([8, 20, 16, 16]) ######
block2(x):  torch.Size([8, 64, 8, 8]) ######
avgpool(x2).view(x2.size(0), -1):  torch.Size([8, 64])
fc_attention2(y2):  torch.Size([8, 64])
sig(y2).view(y2.size(0), y2.size(1), -1):  torch.Size([8, 64, 1])
unsqueeze(-1):  torch.Size([8, 64, 1, 1])
x2 * y2 + y2:  torch.Size([8, 64, 8, 8])
MaxPool2d(2)(x):  torch.Size([8, 64, 4, 4]) ######
block4(x):  torch.Size([8, 64, 2, 2]) ######
avgpool(x4).view(x4.size(0), -1):  torch.Size([8, 64])
fc_attention4(y4):  torch.Size([8, 64])
sig(y4).view(y4.size(0), y4.size(1), -1):  torch.Size([8, 64, 1])
unsqueeze(-1):  torch.Size([8, 64, 1, 1])
x4 * y4 + y4:  torch.Size([8, 64, 2, 2])
MaxPool2d(2)(x):  torch.Size([8, 64, 1, 1]) ######
bn_before_gru(x):  torch.Size([8, 64, 1, 1])
selu(x):  torch.Size([8, 64, 1, 1])
squeeze(-2):  torch.Size([8, 64, 1])
permute(0, 2, 1):  torch.Size([8, 1, 64])
gru(x):  torch.Size([8, 1, 128])
fc1_gru(x):  torch.Size([8, 128])
fc2_gru(x):  torch.Size([8, 1])
OUTPUT:  torch.Size([8, 1])
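
In case you want to reproduce this trace yourself, here is a minimal sketch using forward hooks. It assumes model is an already-constructed SpecRNet instance; the helper name print_shape_hook is just illustrative, not something from the repository.

import torch

def print_shape_hook(name):
    # print the output shape of a module every time it runs
    def hook(module, inputs, output):
        out = output[0] if isinstance(output, tuple) else output  # GRU returns a tuple
        print(f"{name}: {tuple(out.shape)}")
    return hook

# `model` is assumed to be an already-constructed SpecRNet instance (assumption)
handles = [
    m.register_forward_hook(print_shape_hook(n))
    for n, m in model.named_modules()
    if len(list(m.children())) == 0  # leaf modules only
]

with torch.no_grad():
    model(torch.randn(8, 1, 64, 64))  # same example input as in the trace above

for h in handles:
    h.remove()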

From this trace we can see that the spatial dimensions are halved after each of block0, block2, and block4, and halved again after each MaxPool2d.

Since SpecRNet halves the spatial size in each of block0, block2, and block4 and again in the three MaxPool2d layers (six halvings in total), the spatial size of your input should be at least 2^6 = 64.
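
To make the arithmetic explicit, here is a small standalone check (plain Python, not code from the repository) that follows the spatial size through the six halvings:

size = 64
for step in ["block0", "MaxPool2d", "block2", "MaxPool2d", "block4", "MaxPool2d"]:
    size //= 2                      # each step halves the spatial dimensions
    print(f"after {step}: {size}x{size}")
# 64 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1
# With a 32x32 input the feature map is already 1x1 before the last MaxPool2d,
# which is exactly the "(64x1x1) -> (64x0x0)" failure from the question.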

In addition, since you define your model architecture in config.py as

def get_specrnet_config(input_channels: int) -> Dict:
    return {
        "filts": [input_channels, [input_channels, 20], [20, 64], [64, 64]],
        "nb_fc_node": 64,
        "gru_node": 64,
        "nb_gru_layer": 2,
        "nb_classes": 1,
    }
specrnet_config = get_specrnet_config(input_channels=1)

this means that your input has a single channel.

In summary, your input size should be (batch_size, 1, 64, 64).
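
If your spectrograms are smaller than that, one possible workaround (my suggestion, not something the repository does) is to resize them to 64x64 before the forward pass, for example with F.interpolate; whether resizing is acceptable depends on your data. A minimal sketch, again assuming model is an already-constructed SpecRNet instance:

import torch
import torch.nn.functional as F

# Sketch only: `model` is assumed to be an already-constructed SpecRNet instance.
x = torch.randn(8, 1, 32, 32)  # too small: collapses to 1x1 before the last MaxPool2d
x = F.interpolate(x, size=(64, 64), mode="bilinear", align_corners=False)
out = model(x)                 # now matches the required (batch_size, 1, 64, 64) input
print(out.shape)               # torch.Size([8, 1])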

Upvotes: 0
