Reputation: 3368
How do I change the number of input channels in the torchvision ConvNeXt model? I am working with grayscale images and want 1 input channel instead of 3.
import torch
from torchvision.models.convnext import ConvNeXt, CNBlockConfig
# this is the given configuration for the 'tiny' model
block_setting = [
    CNBlockConfig(96, 192, 3),
    CNBlockConfig(192, 384, 3),
    CNBlockConfig(384, 768, 9),
    CNBlockConfig(768, None, 3),
]
model = ConvNeXt(block_setting)
# my sample image (N, C, H, W) = (16, 1, 50, 50)
im = torch.randn(16, 1, 50, 50)
# forward pass
model(im)
output:
RuntimeError: Given groups=1, weight of size [96, 3, 4, 4], expected input[16, 1, 50, 50] to have 3 channels, but got 1 channels instead
However, if I change my input shape to (16, 3, 50, 50), it works fine.
The torchvision source code seems to be based on their GitHub implementation, but where do I specify in_chans with the torchvision interface?
Upvotes: 1
Views: 1123
Reputation: 6135
You can replace the input layer. model._modules["features"][0][0] (equivalently, model.features[0][0]) is
nn.Conv2d(3, 96, kernel_size=(4, 4), stride=(4, 4))
so you only need to change in_channels from 3 to 1. Note that the new layer is randomly initialized:
>>> from torch import nn
>>> model._modules["features"][0][0] = nn.Conv2d(1, 96, kernel_size=(4, 4), stride=(4, 4))
>>> model(im)
tensor([[-0.4854, -0.1925, 0.1051, ..., -0.2310, -0.8830, -0.0251],
[ 0.3332, -0.4205, -0.3007, ..., 0.8530, 0.1429, -0.3819],
[ 0.1794, -0.7546, -0.7835, ..., -0.8072, -0.0972, 0.7413],
...,
[ 0.1356, 0.0868, 0.6135, ..., -0.1382, -0.2001, 0.2415],
[-0.1612, -0.4812, 0.1271, ..., -0.6594, 0.2706, 1.0833],
[ 0.0243, -0.5039, -0.4086, ..., 0.4233, 0.0389, 0.2787]],
grad_fn=<AddmmBackward0>)
Upvotes: 4