fufuzz

Reputation: 25

Is there any significant performance difference between using Mat's assignment operator and iterating over elements to modify a Mat?

I have a pre-trained network in TensorFlow and want to preprocess the input image (convert it to single-channel float32 and normalize it to [-1, 1]) before I pass it to the network.

    // initialize network
    dnn::Net net = readNetFromTensorflow(modelFile);
    assert(!net.empty());

    Mat frame = imread(imageFile, IMREAD_GRAYSCALE);

    cv::equalizeHist(frame, frame);   
    Mat procFrame(frame.size(), CV_32FC1);

Is there any performance difference between the following two ways of doing the preprocessing, and if so, which one is more efficient?

    // preprocess 1st way
    for (int i = 0;  i < frame.rows; i++) {
        for (int j = 0; j < frame.cols; j++){
            procFrame.at<float>(i, j) = frame.at<uint8_t>(i, j)*(2. / 255.) - 1.;
        }
    }

or

    // preprocess 2nd way
    procFrame = frame*(2./ 255.);
    procFrame -= 1.;

Upvotes: 0

Views: 170

Answers (2)

Ali

Reputation: 1043

Performance is not always the final goal. If you use the second approach, you rely on Mat's assignment operator. That means any exceptions during the operation are handled inside Mat& operator=(const Mat &m), which is written by professionals.

Anyway, I believe that Mat& operator=(const Mat &m) is the most efficient and least error-prone way to do the assignment. Do not try to write it yourself by iterating over the Mat elements.

Upvotes: 1

Alex

Reputation: 877

As far as copying goes, the first way is preferable: each pixel is read from frame, converted to float, and written to procFrame in a single pass, with nothing written back to the original variable.

As far as the arithmetic goes, the second way is VASTLY SUPERIOR: rolling your own loops in OpenCV is almost always slower than using the built-in arithmetic functions, because OpenCV is compiled to use SIMD vectorization and many other low-level optimizations.

However, both approaches in the question are inefficient with respect to memory allocation, which can cause significant slowdowns: the float matrix for the conversion is allocated and deallocated for every frame. If you know ahead of time what size your frames will be, you can avoid this.

Keep a preallocated float matrix in memory by declaring procFrame as static (static Mat) and constructing it at the proper size right away, as you are already doing.

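    // allocated once, then reused on every call
    static Mat procFrame(frame.size(), CV_32FC1);
    cv::equalizeHist(frame, frame);
    // convert to float first so the scaling is not done in 8-bit
    frame.convertTo(procFrame, CV_32FC1);
    // scale and shift in place: [0, 255] -> [-1, 1]
    procFrame *= (2. / 255.);
    procFrame -= 1.;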

If your frame size changes during runtime, don't declare it as static.

Upvotes: 1
