ConvNeXt torchvision - specify input channels

Question

How do I change the number of input channels in the torchvision ConvNeXt model? I am working with grayscale images and want 1 input channel instead of 3.

import torch
from torchvision.models.convnext import ConvNeXt, CNBlockConfig

# this is the given configuration for the 'tiny' model
block_setting = [
    CNBlockConfig(96, 192, 3),
    CNBlockConfig(192, 384, 3),
    CNBlockConfig(384, 768, 9),
    CNBlockConfig(768, None, 3),
]

model = ConvNeXt(block_setting)

# my sample image (N, C, H, W) = (16, 1, 50, 50)
im = torch.randn(16, 1, 50, 50)
# forward pass
model(im)

output:

RuntimeError: Given groups=1, weight of size [96, 3, 4, 4], expected input[16, 1, 50, 50] to have 3 channels, but got 1 channels instead

However, if I change my input shape to (16, 3, 50, 50) it seems to work fine.

The torchvision source code seems to be based on their GitHub implementation, but where do I specify in_chans with the torchvision interface?
