Simple way to evaluate a PyTorch torchvision model on a single image

Question

I am loading an image and a pre-trained model with PyTorch v1.3 and torchvision v0.4.2 as follows:

import PIL.Image, torch, torchvision
# Load and normalize the image
img_file = "./robot_image.jpg"
img = PIL.Image.open(img_file)
img = torchvision.transforms.ToTensor()(img)
img = 0.5 + 0.5 * (img - img.mean()) / img.std()

# Load a pre-trained network and compute its prediction
alexnet = torchvision.models.alexnet(pretrained=True)

I want to test this single image, but I get an error:

alexnet(img)
RuntimeError: Expected 4-dimensional input for 4-dimensional weight 64 3 11 11, but got 3-dimensional input of size [3, 741, 435] instead

What is the simplest and most idiomatic way of getting the model to evaluate a single data point?

Jacob Deasy · Accepted Answer

AlexNet is expecting a 4-dimensional tensor of size (batch_size x channels x height x width). You are providing a 3-dimensional tensor.

To change your tensor to size (1, 3, 741, 435) simply add the line:

img = img.unsqueeze(0)
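
As a minimal sketch of what that line does, the snippet below uses a random tensor as a stand-in for the image from the question (the tensor is an assumption for illustration, not the actual image data) and shows the shape before and after adding the batch dimension.

```python
import torch

# Stand-in for the image tensor from the question: (channels, height, width).
img = torch.rand(3, 741, 435)

# unsqueeze(0) inserts a new dimension of size 1 at position 0 (the batch axis).
batch = img.unsqueeze(0)

print(img.shape)    # torch.Size([3, 741, 435])
print(batch.shape)  # torch.Size([1, 3, 741, 435])
```

The data itself is unchanged; only a leading dimension of size 1 is added so the convolution layers see a batch containing one image.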

You should also downsample your image, since the pre-trained AlexNet weights were trained on inputs of height and width 224x224.
