Mando

Reputation: 11722

Image classification with iOS 11 mlmodel - conversion issue using coremltools and a trained .caffemodel

It seems that I have a conversion issue using coremltools and a trained .caffemodel. I was able to train and test a Caffe dogs model (120 categories, 20k images) and it passed my tests with direct Caffe classification. Unfortunately, after converting it to .mlmodel it doesn't give me a valid prediction on the same input.

Training the model

The model has been trained using Caffe and GoogLeNet on a set of 20k images over 120 categories, packed into an lmdb, with about 500k iterations. I've prepared the image database and all the rest, and put all the files together here

Classification with Caffe

This follows the classification example from Caffe. When I run a classification request against the trained .caffemodel, it works just great: high probability (80-99%) and the right results:


Classification with Apple iOS 11 CoreML

Unfortunately, when I try to pack this DTDogs.caffemodel & deploy.prototxt into an .mlmodel consumable by Apple iOS 11 Core ML, I get different prediction results. There are no errors loading and using the model, but I can't get valid classifications: all the predictions come back at 0-15% confidence and with wrong labels. To test it properly I'm using exactly the same images I used for the direct classification with Caffe:


I've also tried the pre-trained and pre-packed models from here with my iOS app, and they work just fine, so it seems to be an issue with the packing procedure.

What did I miss?


Here is the example of classification with Caffe: no issues, correct answers (Python):

import numpy as np
import sys
import caffe
import os
import urllib2
import matplotlib.pyplot as plt
%matplotlib inline

test_folder = '/home/<username>/Desktop/CaffeTest/'
test_image_path = "http://cdn.akc.org/content/hero/irish-terrier-Hero.jpg"

# init caffe net
model_def = test_folder + 'deploy.prototxt'
model_weights = test_folder + 'DTDogs.caffemodel'
# caffe.set_mode_gpu()
net = caffe.Net(model_def, model_weights, caffe.TEST) 

# prepare transformer
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2,0,1))  
transformer.set_raw_scale('data', 255)
transformer.set_channel_swap('data', (2,1,0))
net.blobs['data'].reshape(1, 3, 256, 256)  

# download the test image, preprocess it and feed it into the net
test_image = urllib2.urlopen(test_image_path)
with open(test_folder + 'testImage.jpg', 'wb') as output:
    output.write(test_image.read())

image = caffe.io.load_image(test_folder + 'testImage.jpg')
transformed_image = transformer.preprocess('data', image)
net.blobs['data'].data[...] = transformed_image

# classify
output = net.forward()
output_prob = output['prob'][0]
output_prob_val = output_prob.max() * 100
output_prob_ind = output_prob.argmax()
labels_file = test_folder + 'labels.txt'
labels = np.loadtxt(labels_file, str, delimiter='\t')

plt.imshow(image)
print 'predicted class is:', output_prob_ind
print 'predicted probability is:', output_prob_val
print 'output label:', labels[output_prob_ind]
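
For comparing against the Core ML confidences later, it also helps to look at the top-5 Caffe predictions rather than just the argmax. A small optional addition to the script above, using only the output_prob and labels arrays already defined there:

# optional: top-5 predictions, handy when comparing against the Core ML output
top5_idx = output_prob.argsort()[::-1][:5]
for i in top5_idx:
    print labels[i], '%.2f%%' % (output_prob[i] * 100)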

Here is the example of packing the DTDogs.mlmodel model using coremltools. I see that the resulting .mlmodel file is half the size of the original .caffemodel, but that's probably some kind of archiving or compression optimisation by coremltools (Python):

import coremltools

caffe_model = ('DTDogs.caffemodel', 'deploy.prototxt')
labels = 'labels.txt'
coreml_model = coremltools.converters.caffe.convert(caffe_model,
                                                    class_labels=labels,
                                                    image_input_names="data")
coreml_model.short_description = "Dogs Model v1.14"
coreml_model.save('DTDogs.mlmodel')
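
A quick way to sanity-check the converted model before taking it to Xcode is to run a prediction straight from coremltools (macOS only, since it uses Core ML under the hood). This is only a sketch: 'testImage.jpg' is the image downloaded in the Caffe test above, the 224x224 size is the standard GoogLeNet input and should be adjusted to whatever the deploy.prototxt declares, and the output key names ('classLabel', 'prob') may differ depending on the converter version:

from PIL import Image

# match the model's declared input size; adjust the path to the downloaded test image
img = Image.open('testImage.jpg').resize((224, 224))
prediction = coreml_model.predict({'data': img})
print prediction['classLabel']
# with class_labels supplied, the probability output is usually a {label: prob} dict
probs = prediction['prob']
print sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:5]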

Here is an example of using the DTDogs.mlmodel in the app. I'm using a regular image picker to pick the same image I used for the Caffe classification test (Swift):

func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [String : Any]) {
    picker.dismiss(animated: true)
    print("Analyzing Image…")

    guard let uiImage = info[UIImagePickerControllerOriginalImage] as? UIImage
        else { print("no image from image picker"); return }
    guard let ciImage = CIImage(image: uiImage)
        else { print("can't create CIImage from UIImage"); return }

    imageView.image = uiImage

    do {
        let model = try VNCoreMLModel(for: DTDogs().model)
        let classificationRequest = VNCoreMLRequest(model: model, completionHandler: self.handleClassification)
        // CGImagePropertyOrientation(_:) is the usual helper extension that maps
        // UIImage.Orientation to CGImagePropertyOrientation (from Apple's sample code)
        let orientation = CGImagePropertyOrientation(uiImage.imageOrientation)
        let handler = VNImageRequestHandler(ciImage: ciImage, orientation: orientation)
        try handler.perform([classificationRequest])
    } catch {
        print(error)
    }
}

Upvotes: 3

Views: 1068

Answers (1)

Matthijs Hollemans

Reputation: 7892

Usually what happens in these cases is that the image Core ML is passing into the model is not in the right format.

In the case of Caffe models, you typically need to set is_bgr=True when you call caffe.convert(), and you'll typically have to pass in RGB mean values that will be subtracted from the input image, and possibly a scaling value as well.

In other words, Core ML needs to do the same stuff that your transformer does in the Python script.

Something like this:

coreml_model = coremltools.converters.caffe.convert(
    caffe_model, class_labels=labels, image_input_names="data",
    is_bgr=True, image_scale=255.)

I'm not sure if image_scale=255. is needed, but it's worth a try. :-)
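
If the training recipe also subtracted per-channel means (most GoogLeNet setups do), the converter has bias parameters for that too. A sketch, using the commonly quoted ImageNet means as placeholders rather than values taken from this model; substitute the actual means from your mean.binaryproto or transform_param, and keep or drop image_scale as discussed above:

coreml_model = coremltools.converters.caffe.convert(
    caffe_model, class_labels=labels, image_input_names="data",
    is_bgr=True,
    red_bias=-123.0,    # placeholder ImageNet means; use your model's real values
    green_bias=-117.0,
    blue_bias=-104.0)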

Upvotes: 4
