Reputation: 191
I want to obtain a matrix of feature vectors from image patches, for use as input to a neural network. I'm using the Fashion-MNIST dataset (28x28 images) and have used Tensor.unfold to obtain the patches (16 patches of size 7x7) by doing:
# example on one image
import torchvision
from torchvision import transforms

mnist_train = torchvision.datasets.FashionMNIST(
    root="../data", train=True,
    transform=transforms.Compose([transforms.ToTensor()]),
    download=True)
x = mnist_train[0][0][-1, :, :]  # drop the channel dim -> (28, 28)
x = x.unfold(0, 7, 7).unfold(1, 7, 7)  # non-overlapping 7x7 patches
x.shape
>>> torch.Size([4, 4, 7, 7])
Here I end up with a 4x4 grid of 7x7 patches; however, I want to vectorize each patch to obtain a matrix X of shape (16, d), where 16 is the number of patches and d is the dimension of each feature vector. I'm unsure whether flatten() can be used here, or how I would go about using it.
Upvotes: 0
Views: 1103
Reputation: 3958
To close this out, moving the content of the comments here:
# example on one image
mnist_train = torchvision.datasets.FashionMNIST(
    root="../data", train=True,
    transform=transforms.Compose([transforms.ToTensor()]),
    download=True)
x = mnist_train[0][0][-1, :, :]
x = x.unfold(0, 7, 7).unfold(1, 7, 7)
x.shape
Output:
torch.Size([4, 4, 7, 7])
And then:
x = x.reshape(-1, 7, 7)  # reshape is not in-place; assign the result back
x.shape
Output:
torch.Size([16, 7, 7])
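To get the (16, d) feature matrix the question asks for, you still need to vectorize each 7x7 patch, e.g. with flatten(start_dim=1) (or equivalently reshape(16, -1)), giving d = 49. A minimal sketch, using a random 28x28 tensor in place of an actual Fashion-MNIST image so it runs without downloading the dataset:

```python
import torch

# Stand-in for one 28x28 Fashion-MNIST image
x = torch.randn(28, 28)

# Split into a 4x4 grid of non-overlapping 7x7 patches
patches = x.unfold(0, 7, 7).unfold(1, 7, 7)   # (4, 4, 7, 7)

# Collapse the grid dims into one patch dim, then vectorize
# each 7x7 patch into a 49-dim row vector
X = patches.reshape(-1, 7, 7).flatten(start_dim=1)

print(X.shape)  # torch.Size([16, 49])
```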
Upvotes: 1