Nikhil R

Reputation: 115

Why do ResNet models in TensorFlow and PyTorch give different feature lengths?

I'm trying to extract image features with ResNet models pretrained on the ImageNet dataset; the network should give a feature vector of length 2048. When I experimented with TensorFlow I got that feature length, but when I try the PyTorch version of ResNet it gives me a length of 1000.

The TensorFlow code is below:

import numpy as np
from numpy.linalg import norm
import pickle
from tqdm import tqdm, tqdm_notebook
import os
import random
import time
import math
import tensorflow
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.applications.vgg19 import VGG19
from tensorflow.keras.applications.mobilenet import MobileNet
from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras.applications.xception import Xception
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Flatten, Dense, Dropout, GlobalAveragePooling2D

def model_picker(name):
    if (name == 'vgg16'):
        model = VGG16(weights='imagenet',
                      include_top=False,
                      input_shape=(224, 224, 3),
                      pooling='max')
    elif (name == 'vgg19'):
        model = VGG19(weights='imagenet',
                      include_top=False,
                      input_shape=(224, 224, 3),
                      pooling='max')
    elif (name == 'mobilenet'):
        model = MobileNet(weights='imagenet',
                          include_top=False,
                          input_shape=(224, 224, 3),
                          pooling='max',
                          depth_multiplier=1,
                          alpha=1)
    elif (name == 'inception'):
        model = InceptionV3(weights='imagenet',
                            include_top=False,
                            input_shape=(224, 224, 3),
                            pooling='max')
    elif (name == 'resnet'):
        model = ResNet50(weights='imagenet',
                         include_top=False,
                         input_shape=(224, 224, 3),
                         pooling='max')
    elif (name == 'xception'):
        model = Xception(weights='imagenet',
                         include_top=False,
                         input_shape=(224, 224, 3),
                         pooling='max')
    else:
        raise ValueError("Specified model not available")
    return model


model_architecture = 'resnet'
model = model_picker(model_architecture)

def extract_features(img_path, model):
    input_shape = (224, 224, 3)
    img = image.load_img(img_path,
                         target_size=(input_shape[0], input_shape[1]))
    img_array = image.img_to_array(img)
    expanded_img_array = np.expand_dims(img_array, axis=0)
    preprocessed_img = preprocess_input(expanded_img_array)
    features = model.predict(preprocessed_img)
    flattened_features = features.flatten()
    normalized_features = flattened_features / norm(flattened_features)
    return normalized_features
features = extract_features('dog.jpg', model)
print(len(features))

> 2048

As you can see, it gives 2048 features through the ResNet50 model.

Below is the code for PyTorch

import numpy as np
import torch
from torchvision import models, transforms
from torch.autograd import Variable
from PIL import Image

res_model = models.resnet50(pretrained=True)
def image_loader(image,model,use_gpu= False):
  transform = transforms.Compose([
                                  transforms.Resize(256),
                                  transforms.CenterCrop(224),
                                  transforms.ToTensor()
  ])
  img = Image.open(image)
  img = transform(img)
  print(img.shape)

  x = Variable(torch.unsqueeze(img, dim = 0).float(), requires_grad = False)
  print(x.shape)
  if use_gpu:
    x = x.cuda()
    model = model.cuda()
  y = model(x).cpu()
  print(y.size())
  y = torch.squeeze(y)
  y = y.data.numpy()
  print(y.shape)
  print(len(y))
  np.savetxt('features.txt',y,delimiter=',')
image_loader('dog.jpg',res_model)

> torch.Size([3, 224, 224])
> torch.Size([1, 3, 224, 224])
> torch.Size([1, 1000])
> (1000,)
> 1000

As you can see, the features extracted through the PyTorch ResNet model have length 1000. Why am I getting different lengths? Shouldn't I get the same length of 2048 according to the architecture, or am I doing something wrong?

Upvotes: 1

Views: 1530

Answers (1)

DerekG

Reputation: 3938

Printing the layers of the PyTorch ResNet yields:

(fc): Linear(in_features=2048, out_features=1000, bias=True)

as the last layer of the ResNet in PyTorch, because the model is by default set up for use as a classifier on ImageNet data (1000 classes). If you want 2048 features instead, you can simply delete this last layer:

del model.fc

and your resulting output will then be of the desired dimension.

Edit: it is perhaps better to simply overwrite model.fc with an identity function rather than deleting it, so that it doesn't cause errors when forward is called:

model.fc = torch.nn.Identity()
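
For example, here is a minimal sketch based on the extraction code from the question (using the same 'dog.jpg' sample image), which should now give a 2048-dimensional feature vector:

import numpy as np
import torch
from torchvision import models, transforms
from PIL import Image

model = models.resnet50(pretrained=True)
model.fc = torch.nn.Identity()   # replace the classifier head with a pass-through
model.eval()                     # inference mode for feature extraction

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

img = transform(Image.open('dog.jpg')).unsqueeze(0)   # shape (1, 3, 224, 224)
with torch.no_grad():
    features = model(img)                             # shape (1, 2048)

print(features.squeeze(0).numpy().shape)              # (2048,)

The 2048-dimensional vector comes straight from the global average pooling layer, which is the last layer before the classifier head in ResNet-50.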

Upvotes: 3
