Can CNNs be faster than classic descriptors?

Disclaimer: I don't know almost nothing on CNNs and I have no idea where I could ask this.

My research focuses on high-performance computer vision applications. We generate a code representing an image in less than 20 ms, on images with a largest dimension of 500 px.

This is done by combining SURF descriptors and VLAD codes, obtaining a vector representing an image that will be used in our object recognition application.

Can CNNs be faster? According to this benchmark (which is based on much smaller images), the time needed is longer: almost double, even though the images are half the size of ours.

Upvotes: 1

Answers (3)

Martin Thoma

Reputation: 136665

Answer to your question: Yes, they can. They can be slower and they can be faster than classic descriptors. For example, using only a single filter and several max-poolings will almost certainly be faster. But the results will also certainly be crappy.

You should ask a much more specific question. Relevant parts are:

Problem: Classification / Detection / Semantic Segmentation / Instance Segmentation / Face verification / ... ?
Constraints: Minimum accuracy / maximum speed / maximum latency?
Evaluation specifics:
- Which hardware is available (GPUs)?
- Do you evaluate on a single image? Often you can evaluate up to 512 images in about the same time as one image.

Also: The input image size should not be relevant. If CNNs achieve better results on smaller inputs than classic descriptors, why should you care?
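On the single-image versus batch point, a minimal timing sketch may make the difference concrete. Everything here is an assumption for illustration: `time_per_image` is a hypothetical helper, and `fake_encoder` is a toy stand-in (fixed per-call overhead plus a small per-image cost) for whatever you actually measure, be it a SURF+VLAD pipeline or a CNN forward pass.

```python
import time
from statistics import median


def time_per_image(fn, images, batch_size=1, repeats=3):
    """Return the median wall-clock seconds per image for fn at a given batch size.

    fn is any callable that takes a list of images (hypothetical stand-in
    for a descriptor pipeline or a CNN forward pass).
    """
    batches = [images[i:i + batch_size] for i in range(0, len(images), batch_size)]
    fn(batches[0])  # warm-up call, so one-time setup cost is not measured
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for batch in batches:
            fn(batch)
        timings.append(time.perf_counter() - start)
    return median(timings) / len(images)


def fake_encoder(batch):
    # Toy cost model (an assumption): fixed overhead per call plus a small cost per image
    time.sleep(0.002 + 0.0001 * len(batch))
    return [0] * len(batch)


images = list(range(64))
single = time_per_image(fake_encoder, images, batch_size=1)
batched = time_per_image(fake_encoder, images, batch_size=32)
print(f"batch=1: {single * 1e3:.2f} ms/image, batch=32: {batched * 1e3:.2f} ms/image")
```

With a real model, swap `fake_encoder` for your own function and report both numbers: latency for one image and per-image time in a batch answer different questions.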

Papers

Please note that CNNs are usually not tweaked towards speed, but towards accuracy.

Detection: Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks: 600px x ~800px in 200ms on a GPU
InverseFaceNet: Deep Single-Shot Inverse Face Rendering From A Single Image: 9.79ms with GeForce GTX Titan and AlexNet to get FC7 features
Semantic segmentation: Pixel-wise Segmentation of Street with Neural Networks: 20ms with GeForce GTX 980

Upvotes: 1

Dr. Snoopy

Reputation: 56377

Yes, they can be faster. The numbers you got are for networks trained for ImageNet classification: about 1 million images and 1000 classes. Unless your classification problem is similar, using an ImageNet network is overkill.

You should also remember that these networks have on the order of 10-100 million weights, so they are quite expensive to evaluate. But you probably don't need a really big network: you can design your own, with fewer layers and parameters, that is much cheaper to evaluate.

In my experience, I designed a network to classify 96x96 sonar image patches, and with around 4000 weights in total, it can get over 95% classification accuracy and run at 40 ms per frame on a RPi2.

A bigger network with 900K weights, same input size, takes 7 ms to evaluate on a Core i7. So this is surely possible, you just need to play with smaller network architectures. A good start is SqueezeNet, a network that reaches AlexNet-level accuracy on ImageNet with 50 times fewer weights, and it is of course much faster than the big networks.
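To get a feel for how layer sizes translate into weight counts and evaluation cost, here is a rough sketch. The architecture is hypothetical (three 3x3 convolutions with 2x2 pooling, global average pooling, then a small classifier) and is not the actual sonar network; it only shows that a network for 96x96 patches can stay in the range of a few thousand weights.

```python
def conv_layer(height, width, c_in, c_out, kernel=3, pool=2):
    """Return (weights, multiply_accumulates, out_height, out_width).

    Assumes 'same' padding and stride 1, followed by non-overlapping pooling.
    """
    weights = kernel * kernel * c_in * c_out + c_out  # filters plus biases
    macs = kernel * kernel * c_in * c_out * height * width
    return weights, macs, height // pool, width // pool


def dense_layer(n_in, n_out):
    """Return (weights, multiply_accumulates) for a fully connected layer."""
    return n_in * n_out + n_out, n_in * n_out


def tiny_cnn_cost(size=96, channels=(1, 8, 16, 16), n_classes=2):
    """Total weights and multiply-accumulates of a hypothetical tiny CNN."""
    h = w = size
    total_weights = total_macs = 0
    for c_in, c_out in zip(channels, channels[1:]):
        weights, macs, h, w = conv_layer(h, w, c_in, c_out)
        total_weights += weights
        total_macs += macs
    # Global average pooling leaves one value per channel before the classifier
    weights, macs = dense_layer(channels[-1], n_classes)
    return total_weights + weights, total_macs + macs


weights, macs = tiny_cnn_cost()
print(f"{weights} weights, {macs / 1e6:.1f} M multiply-accumulates per 96x96 patch")
```

Changing `channels` or `size` shows how quickly the cost grows; the multiply-accumulate count is only a proxy for run time, so measure on your target hardware before drawing conclusions.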

Upvotes: 2

duffymo

Reputation: 308968

I would be wary of benchmarks and blanket statements. It's important to know every detail that went into generating the quoted values. For example, would running the CNN on GPU hardware improve the quoted values?

20ms seems very fast to me; so does 40ms. I have no idea what your requirement is.

What other benefits could a CNN offer? Maybe it's more than just raw speed.

I don't believe that neural networks are the perfect technique for every problem. Regression, SVM, and other classification techniques are still viable.

There's a bias at work here. Your question reads as if you are looking only to confirm that your current research is best. You have a sunk cost that you're loath to throw away, but you're worried that there might be something better out there. If that's true, I don't think this is a good question for SO.

"I don't know almost nothing on CNNs" - if you're a true researcher, seeking the truth, I think you have an obligation to learn and answer for yourself. TensorFlow and Keras make this easy to do.

Upvotes: 1
