Reputation: 47
I am trying to run the pretrained GoogLeNet model from the Caffe model zoo (no fine-tuning). The model and deploy.prototxt were both downloaded from https://github.com/BVLC/caffe/tree/master/models/bvlc_googlenet
Below is the code I'm using:
net = caffe.Net('deploy.prototxt', 'bvlc_googlenet.caffemodel', caffe.TEST)
net.blobs['data'].reshape(1,3,224,224)
image_path = '1.png'
img = caffe.io.load_image(image_path)
img = caffe.io.resize( img, (224, 224, 3) )
# mean subtraction
img[0,:,:] -= 104 / 255.0
img[1,:,:] -= 117 / 255.0
img[2,:,:] -= 123 / 255.0
# 224,224,3 -> 3,224,224
img = np.transpose(img, (2, 0, 1))
out = net.forward(data=np.array([img]))['prob']
print(np.argmax(out))
The model appears to load fine, but regardless of the input it always outputs the same class (885). What could be the reason?
UPD: Actually, the same problem occurs with other models, whether or not I do mean subtraction; only the class that is always predicted changes.
Upvotes: 1
Views: 888
Reputation: 36
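I can see a few problems with the code. First, you should call np.transpose before subtracting the mean: after caffe.io.load_image and the resize, the image still has the shape (224,224,3), so img[0,:,:] selects the first row of pixels rather than the first channel. Second, you need to rescale the image from [0,1] to [0,255], because caffe.io.load_image returns floats in [0,1] while the pretrained model expects pixel values on the 0-255 scale. Third, Caffe expects the channels in a certain order (a small explanation is given here), so you have to change the default RGB to BGR.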
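The three fixes above can be sketched with plain NumPy, without Caffe. This is only a minimal illustration: the function name preprocess_manual and the random stand-in image are hypothetical, and the mean values are taken from the question, assumed to be in BGR order on the 0-255 scale.

```python
import numpy as np

# Per-channel mean from the question, assumed to be BGR order on the 0-255 scale.
MEAN_BGR = np.array([104.0, 117.0, 123.0], dtype=np.float32)

def preprocess_manual(img_rgb01):
    """Turn an (H, W, 3) RGB float image in [0, 1] into a (3, H, W) BGR array
    on the 0-255 scale with the per-channel mean subtracted."""
    img = img_rgb01.astype(np.float32) * 255.0       # [0, 1] -> [0, 255]
    img = img[:, :, ::-1]                            # RGB -> BGR (channels are the last axis)
    img = np.transpose(img, (2, 0, 1))               # (H, W, 3) -> (3, H, W)
    img = img - MEAN_BGR[:, np.newaxis, np.newaxis]  # subtract the mean per channel
    return img

# Hypothetical stand-in for the output of caffe.io.load_image: RGB floats in [0, 1].
rng = np.random.default_rng(0)
fake_img = rng.random((224, 224, 3))
blob = preprocess_manual(fake_img)
print(blob.shape)
```

Under these assumptions, the resulting array could be passed to the network the same way as in the question, e.g. net.forward(data=np.array([blob])).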
I would recommend using caffe.io.Transformer, which packs all these transformations cleanly.
For your example, the code with a transformer would be:
import numpy as np
import caffe
# `net` is the caffe.Net loaded in the question
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
# per-channel mean, on the 0-255 scale
transformer.set_mean('data', np.array([104,117,123]))
# (H,W,C) -> (C,H,W)
transformer.set_transpose('data',(2,0,1))
# RGB -> BGR
transformer.set_channel_swap('data',(2,1,0))
# [0,1] -> [0,255]
transformer.set_raw_scale('data', 255.0)
image_path = 'cat.jpg'
img = caffe.io.load_image(image_path)
img = caffe.io.resize( img, (224, 224, 3) )
net.blobs['data'].reshape(1,3,224,224)
net.blobs['data'].data[...] = transformer.preprocess('data',img)
output = net.forward()
out = net.blobs['prob'].data[0].flatten()
# labels_file: path to a file with one class label per line (set this yourself)
labels = np.loadtxt(labels_file, str, delimiter='\t')
print(np.argmax(out))
print('output label : ' + labels[out.argmax()])
Upvotes: 2