Reputation: 11722
It seems that I have a conversion issue using coremltools and a trained .caffemodel. I was able to train and test a Caffe dogs model (120 categories, 20k images) and it passed my tests with direct Caffe classification. Unfortunately, after converting it to .mlmodel I don't get valid predictions on the same input.
Training model
The model has been trained using Caffe and GoogleNet on a set of 20k images over 120 categories packed into lmdb, for about 500k iterations. I've prepared the image database and all the rest, and put all the files together here.
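For completeness, this is roughly how such a training run is driven from Python - a minimal sketch, where 'solver.prototxt' stands in for the actual solver file (which points at the GoogleNet train/val prototxts and the lmdb databases and sets max_iter to ~500k):

import caffe

# sketch of the training driver; 'solver.prototxt' is a placeholder name
caffe.set_mode_gpu()                          # or caffe.set_mode_cpu()
solver = caffe.get_solver('solver.prototxt')
solver.solve()                                # runs until max_iter is reached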
Classification with caffe
This classification is based on the classification example by caffe. When I run a classification request against the trained caffemodel, it works just great: high probability (80-99%), right results:
Classification with Apple iOS 11 CoreML
Unfortunately, when I try to pack this DTDogs.caffemodel & deploy.prototxt into an .mlmodel consumable by Apple iOS 11 CoreML, I get different prediction results. There are no errors loading and using the model, but I'm unable to get valid classifications: all the predictions have 0-15% confidence and wrong labels. To test it properly, I'm using exactly the same images I used for the direct classification with caffe:
I've also tried the pre-trained and pre-packed models from here with my iOS app - they work just fine, so it seems to be an issue with the packing procedure.
What did I miss?
Here is the example of classification with caffe: no issues, right answers (python):
import numpy as np
import sys
import caffe
import os
import urllib2
import matplotlib.pyplot as plt
%matplotlib inline
test_folder = '/home/<username>/Desktop/CaffeTest/'
test_image_path = "http://cdn.akc.org/content/hero/irish-terrier-Hero.jpg"
# init caffe net
model_def = test_folder + 'deploy.prototxt'
model_weights = test_folder + 'DTDogs.caffemodel'
# caffe.set_mode_gpu()
net = caffe.Net(model_def, model_weights, caffe.TEST)
# prepare transformer
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2,0,1))
transformer.set_raw_scale('data', 255)
transformer.set_channel_swap('data', (2,1,0))
net.blobs['data'].reshape(1, 3, 256, 256)  # single 256x256 colour image
test_image = urllib2.urlopen(test_image_path)
with open(test_folder + 'testImage.jpg', 'wb') as output:
    output.write(test_image.read())
image = caffe.io.load_image(test_folder + 'testImage.jpg')
transformed_image = transformer.preprocess('data', image)
net.blobs['data'].data[...] = transformed_image
# classify
output = net.forward()
output_prob = output['prob'][0]
output_prob_val = output_prob.max() * 100
output_prob_ind = output_prob.argmax()
# load human-readable labels
labels_file = test_folder + 'labels.txt'
labels = np.loadtxt(labels_file, str, delimiter='\t')
plt.imshow(image)
print 'predicted class is:', output_prob_ind
print 'predicted probability is:', output_prob_val
print 'output label:', labels[output_prob_ind]
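For reference, the transformer above is roughly equivalent to the following plain-numpy preprocessing (resizing to the network input size omitted) - this is exactly the set of steps the converted model needs to reproduce:

import numpy as np

def caffe_preprocess(image):
    # image: H x W x 3, RGB, floats in [0, 1] as returned by caffe.io.load_image
    img = image.transpose(2, 0, 1)   # HWC -> CHW, as in set_transpose((2, 0, 1))
    img = img[[2, 1, 0], :, :]       # RGB -> BGR, as in set_channel_swap((2, 1, 0))
    img = img * 255.0                # [0, 1] -> [0, 255], as in set_raw_scale(255)
    return img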
Here is the example of packing the DTDogs.mlmodel model using coremltools. I see that the resulting .mlmodel file is half the size of the original .caffemodel, but it's probably some kind of archiving or compression optimisation by coremltools (python):
import coremltools

caffe_model = ('DTDogs.caffemodel', 'deploy.prototxt')
labels = 'labels.txt'
coreml_model = coremltools.converters.caffe.convert(caffe_model, class_labels=labels, image_input_names="data")
coreml_model.short_description = "Dogs Model v1.14"
coreml_model.save('DTDogs.mlmodel')
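To sanity-check the converted model outside the app, something like this can be run on macOS (coremltools can only execute predictions there). A sketch under assumptions: the output key name is the converter default, and the resize should match whatever input size deploy.prototxt declares:

import coremltools
from PIL import Image

# load the converted model and run it on the same test image (macOS only)
mlmodel = coremltools.models.MLModel('DTDogs.mlmodel')
img = Image.open('testImage.jpg').resize((256, 256))  # match the network input size
prediction = mlmodel.predict({'data': img})
print(prediction['classLabel'])  # top label; key name is the converter default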
Here is an example of using the DTDogs.mlmodel in the app. I'm using a regular image picker to pick the same image I used for the caffe classification test (swift):
func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [String : Any]) {
    picker.dismiss(animated: true)
    print("Analyzing Image…")

    // grab the picked image
    guard let uiImage = info[UIImagePickerControllerOriginalImage] as? UIImage
        else { print("no image from image picker"); return }
    guard let ciImage = CIImage(image: uiImage)
        else { print("can't create CIImage from UIImage"); return }

    imageView.image = uiImage

    do {
        // wrap the generated DTDogs Core ML model for use with Vision
        let model = try VNCoreMLModel(for: DTDogs().model)
        let classificationRequest = VNCoreMLRequest(model: model, completionHandler: self.handleClassification)
        // CGImagePropertyOrientation(_:) is the usual UIImage.Orientation bridging helper
        let orientation = CGImagePropertyOrientation(uiImage.imageOrientation)
        let handler = VNImageRequestHandler(ciImage: ciImage, orientation: Int32(orientation.rawValue))
        try handler.perform([classificationRequest])
    } catch {
        print(error)
    }
}
Upvotes: 3
Views: 1068
Reputation: 7892
Usually what happens in these cases is that the image Core ML is passing into the model is not in the right format.
In the case of Caffe models, you typically need to set is_bgr=True when you call caffe.convert(), and you'll typically have to pass in the RGB mean values that will be subtracted from the input image, and possibly a scaling value as well.
In other words, Core ML needs to do the same stuff that your transformer does in the Python script.
Something like this:
coreml_model = coremltools.converters.caffe.convert(
    caffe_model, class_labels=labels, image_input_names="data",
    is_bgr=True, image_scale=255.)
I'm not sure if the image_scale=255. is needed but it's worth a try. :-)
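If your training also subtracted a mean image (Caffe's mean.binaryproto or fixed per-channel means), that belongs in the conversion call too. A sketch with placeholder values - substitute the means you actually trained with:

# placeholder means (typical ImageNet values); the bias is added to each
# channel, so subtracting a mean corresponds to a negative bias
coreml_model = coremltools.converters.caffe.convert(
    caffe_model, class_labels=labels, image_input_names="data",
    is_bgr=True,
    red_bias=-123.0, green_bias=-117.0, blue_bias=-104.0)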
Upvotes: 4