Reputation: 111
I have written multi-view face detection code using the OpenCV face detector. I run five detectors (each trained for a different pose angle) over an image and combine their weights to detect faces. I parallelized the code with TBB parallel_for, but that improved performance by only 1.7 times. Is there a better way to run the five detectors in parallel?
I am running my code on a cluster with 16 cores. I think the number of threads (five in my case) is too small to use the machine's full power.
Any suggestions?
Thanks,
Upvotes: 0
Views: 822
Reputation: 4049
Some possible problems to look into:
A profiler such as Intel(R) VTune(TM) Amplifier can sometimes help to track down these problems. Both commercial and non-commercial licenses exist for Amplifier. [Disclaimer: I work for Intel.]
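As a rough illustration of why five coarse tasks may not scale on 16 cores, the sketch below computes an upper bound on speedup for a set of independent tasks: the parallel run cannot finish sooner than the longest single task, nor sooner than the total work divided by the core count. The helper names `speedup_bound` and `split_tasks` and any timings passed to them are hypothetical, not taken from the question, and the split assumes each detector's work divides evenly (for example by image strip or scale), which may not hold in practice.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Upper bound on speedup for independent tasks spread over `cores` cores.
// The parallel run takes at least as long as the longest task, and at
// least total_work / cores.
double speedup_bound(const std::vector<double>& task_times, int cores) {
    assert(cores > 0 && !task_times.empty());
    const double total =
        std::accumulate(task_times.begin(), task_times.end(), 0.0);
    const double longest =
        *std::max_element(task_times.begin(), task_times.end());
    const double best_parallel = std::max(longest, total / cores);
    return total / best_parallel;
}

// Hypothetical refinement: split every task into `pieces` equal chunks,
// assuming the work is evenly divisible.
std::vector<double> split_tasks(const std::vector<double>& task_times,
                                int pieces) {
    assert(pieces > 0);
    std::vector<double> out;
    out.reserve(task_times.size() * pieces);
    for (double t : task_times)
        for (int i = 0; i < pieces; ++i)
            out.push_back(t / pieces);
    return out;
}
```

With made-up detector times of {4, 1, 1, 1, 1} and 16 cores, `speedup_bound` gives 2, because the slowest detector dominates; after `split_tasks(times, 4)` the bound rises to 8. Measured per-detector times would be needed to say whether this kind of imbalance is what limits the real code.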
Upvotes: 1