Reputation: 2478
I am using a pre-trained ResNet-50 model with the last dense layer removed and the output of the average pooling layer flattened, so that the model can be used for feature extraction. The images are RGB and are read from a folder after being resized to (300, 300).
torch version: 1.8.1 & torchvision version: 0.9.1 with Python 3.8.
The code is as follows:
import torch
import torch.nn as nn
import torchvision
model_resnet50 = torchvision.models.resnet50(pretrained = True)
# To remove last dense layer from pre-trained model, Use code-
model_resnet50_modified = torch.nn.Sequential(*list(model_resnet50.children())[:-1])
# Using 'AdaptiveAvgPool2d' layer, the predictions have shape-
model_resnet50_modified(images).shape
# torch.Size([32, 2048, 1, 1])
# Add a flatten layer after 'AdaptiveAvgPool2d(output_size=(1, 1))' layer at the end-
model_resnet50_modified.flatten = nn.Flatten()
# Sanity check- make predictions using a batch of images-
predictions = model_resnet50_modified(images)
predictions.shape
# torch.Size([32, 2048])
I want to now feed batches of images to this model and concatenate the predictions made by the model (32, 2048) vertically.
# number of images in training and validation sets-
len(dataset_train), len(dataset_val)
# (22500, 2500)
There are a total of 22500 + 2500 = 25000 images. So the final table/matrix should have the shape: (25000, 2048) -> number of images = 25000 and number of extracted features = 2048.
I tried running a toy code using np.vstack() as follows:
x = np.random.random_sample(size = (1, 3))
x.shape
# (1, 3)
x
# array([[0.52381798, 0.12345404, 0.1556422 ]])
for i in range(5):
    y = np.random.random_sample(size = (1, 3))
    np.vstack((x, y))
x
# array([[0.52381798, 0.12345404, 0.1556422 ]])
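As the output shows, x is unchanged after the loop. How can I accumulate the per-batch predictions into a single (25000, 2048) matrix?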
Thanks!
Upvotes: 0
Views: 795
Reputation: 677
If you want to stack the results in a Tensor:
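# Start from an empty (0, 2048) tensor created directly on the target device
# (`device` is assumed to be defined already, e.g. torch.device("cuda")).
# Note that Tensor.to() is not in-place, so calling results.to(device) alone has no effect.
results = torch.empty((0, 2048), device=device)
# Inside the batch loop: append each batch of predictions along dim 0
results = torch.cat((results, predictions), 0)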
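For the full loop over batches, a minimal sketch might look like the following. It is only one way to do it: it collects the per-batch outputs in a list and concatenates once at the end, rather than calling torch.cat on every iteration. The helper name `extract_features` and the toy model and dataset are hypothetical stand-ins (chosen so the example runs without downloading weights or images); they are not from the question.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def extract_features(model, loader, device):
    """Run `model` over every batch in `loader` and stack the outputs row-wise."""
    model = model.to(device)
    model.eval()  # put BatchNorm/Dropout layers into inference behaviour
    chunks = []
    with torch.no_grad():  # gradients are not needed for feature extraction
        for images, _ in loader:
            feats = model(images.to(device))
            chunks.append(feats.cpu())  # keep the accumulated features on the CPU
    # A single concatenation at the end avoids re-copying a growing tensor per batch
    return torch.cat(chunks, dim=0)


# Hypothetical stand-ins: a tiny pooling "model" and random images with dummy labels
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
toy_model = nn.Sequential(nn.AdaptiveAvgPool2d((1, 1)), nn.Flatten())
toy_data = TensorDataset(torch.rand(70, 8, 16, 16), torch.zeros(70, dtype=torch.long))
toy_loader = DataLoader(toy_data, batch_size=32, shuffle=False)

features = extract_features(toy_model, toy_loader, device)
print(features.shape)  # torch.Size([70, 8])
```

With the setup from the question, you would presumably pass model_resnet50_modified and DataLoaders built over dataset_train and dataset_val, then concatenate the two results to get the (25000, 2048) matrix; calling .numpy() on the result gives a NumPy array. Use shuffle=False if the row order has to match the dataset order. As for the toy example: np.vstack() returns a new array and does not modify x, so the loop needs x = np.vstack((x, y)) to have any effect.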
Upvotes: 1