zhen lee

Reputation: 333

Finding entropy in opencv

I need a function like entropyfilt() in Matlab, which doesn't exist in OpenCV.

In Matlab, J = entropyfilt(I) returns the array J, where each output pixel contains the entropy value of the 9-by-9 neighborhood around the corresponding pixel in the input image I.

I wrote a function to implement it in C++. For each pixel, it gets the entropy like this:

  1. Use cvCalcHist with the mask parameter appropriately set to get the image ROI (that's a 9*9 rectangle).
  2. Normalize the histogram so the sum of its bins is equal to 1.
  3. Use the formula for (Shannon) entropy.
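
For reference, the three steps above can be sketched on a plain 8-bit buffer without any OpenCV calls. This is only a minimal sketch: windowEntropy and its parameters are hypothetical names, and the image is assumed to be a row-major grayscale buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Entropy (in bits) of the window of the given radius centred on (cx, cy).
// The window is clipped at the image border, like the ROI logic in the question.
// `img` is a row-major 8-bit grayscale buffer of size width * height.
double windowEntropy(const std::vector<std::uint8_t>& img, int width, int height,
                     int cx, int cy, int radius = 4) {
    assert(static_cast<int>(img.size()) == width * height);
    const int x0 = std::max(0, cx - radius), x1 = std::min(width - 1, cx + radius);
    const int y0 = std::max(0, cy - radius), y1 = std::min(height - 1, cy + radius);

    // Step 1: histogram of the window.
    int hist[256] = {0};
    for (int y = y0; y <= y1; ++y)
        for (int x = x0; x <= x1; ++x)
            ++hist[img[y * width + x]];

    // Steps 2 and 3: normalise each bin to a probability and sum -p * log2(p).
    const double total = static_cast<double>((x1 - x0 + 1) * (y1 - y0 + 1));
    double entropy = 0.0;
    for (int k = 0; k < 256; ++k) {
        if (hist[k] == 0) continue;  // empty bins contribute nothing
        const double p = hist[k] / total;
        entropy -= p * std::log2(p);
    }
    return entropy;
}
```

A constant window gives 0 bits, and a window split evenly between two gray levels gives 1 bit.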

I list the C++ code below:

void GetLocalEntroyImage(const IplImage* gray_src, IplImage* local_entroy_image) {
    int hist_size[] = {256};
    float gray_range[] = {0, 256};  // upper bound is exclusive
    float* ranges[] = {gray_range};
    CvHistogram* hist = cvCreateHist(1, hist_size, CV_HIST_SPARSE, ranges, 1);
    IplImage* gray_src_non_const = const_cast<IplImage*>(gray_src);
    for (int i = 0; i < gray_src->width; i++) {
        for (int j = 0; j < gray_src->height; j++) {
            // calculate entropy for pixel (i,j)
            // 1. set the ROI rect (9*9), clipped at the image edges
            CvRect roi;
            roi.x = Max(0, i - 4);
            roi.y = Max(0, j - 4);
            roi.width = Min(gray_src->width - 1, i + 4) - roi.x + 1;
            roi.height = Min(gray_src->height - 1, j + 4) - roi.y + 1;
            cvSetImageROI(gray_src_non_const, roi);

            // 2. calcHist; here I chose CV_HIST_SPARSE to speed up
            cvCalcHist(&gray_src_non_const, hist, 0, 0);
            cvNormalizeHist(hist, 1.0);
            float entroy = 0;

            // 3. get entropy
            CvSparseMatIterator it;
            for (CvSparseNode* node = cvInitSparseMatIterator((CvSparseMat*)hist->bins, &it); node != 0; node = cvGetNextSparseNode(&it)) {
                float gray_frequency = *(float*)CV_NODE_VAL((CvSparseMat*)hist->bins, node);
                entroy = entroy - gray_frequency * (log(gray_frequency) / log(2.0f));
            }
            ((float*)(local_entroy_image->imageData + j * local_entroy_image->widthStep))[i] = entroy;
        }
    }
    cvResetImageROI(gray_src_non_const);
    // release the histogram once, after all pixels are done
    cvReleaseHist(&hist);
}

However, the code is too slow. I tested it on a 600*1200 image and it takes 120s, while entropyfilt in Matlab takes only 5s.

Does anyone know how to speed it up, or know of any other good implementation?

Upvotes: 5

Views: 7116

Answers (3)

Cynichniy Bandera

Reputation: 6103

Nice one (already upvoted). Here are some changes and notes that help when using it. Mainly, I fixed the memory leaks and partly converted it to the C++ OpenCV API (though many more improvements could be made). It works fine on iOS too.

void getLocalEntropyImage(cv::Mat &gray, cv::Rect &roi, cv::Mat &entropy)
{
        using namespace cv;
        clock_t func_begin, func_end;
        func_begin = clock();
        //1.define nerghbood model,here it's 9*9
        int neighbood_dim = 2;
        int neighbood_size[] = {9, 9};

        //2.Pad gray_src
        Mat gray_src_mat(gray);
        Mat pad_mat;
        int left = (neighbood_size[0] - 1) / 2;
        int right = left;
        int top = (neighbood_size[1] - 1) / 2;
        int bottom = top;
        copyMakeBorder(gray_src_mat, pad_mat, top, bottom, left, right, BORDER_REPLICATE, 0);
        Mat *pad_src = &pad_mat;
        roi = cv::Rect(roi.x + top, roi.y + left, roi.width, roi.height);

        //3.initial neighbood object,reference to Matlab build-in neighbood object system
        //        int element_num = roi_rect.area();
        //here,implement a histogram by ourself ,each bin calcalate gray value frequence
        int hist_count[256] = {0};
        int neighbood_num = 1;
        for (int i = 0; i < neighbood_dim; i++)
                neighbood_num *= neighbood_size[i];

        //neighbood_corrds_array is a neighbors_num-by-neighbood_dim array containing relative offsets
        int *neighbood_corrds_array = (int *)malloc(sizeof(int)*neighbood_num * neighbood_dim);
        //Contains the cumulative product of the image_size array;used in the sub_to_ind and ind_to_sub calculations.
        int *cumprod = (int *)malloc(neighbood_dim * sizeof(*cumprod));
        cumprod[0] = 1;
        for (int i = 1; i < neighbood_dim; i++)
                cumprod[i] = cumprod[i - 1] * neighbood_size[i - 1];
        int *image_cumprod=(int*)malloc(2 * sizeof(*image_cumprod));
        image_cumprod[0] = 1;
        image_cumprod[1]= pad_src->cols;
        //initialize neighbood_corrds_array
        int p;
        int q;
        int *coords;
        for (p = 0; p < neighbood_num; p++){
                coords = neighbood_corrds_array+p * neighbood_dim;
                ind_to_sub(p, neighbood_dim, neighbood_size, cumprod, coords);
                for (q = 0; q < neighbood_dim; q++)
                        coords[q] -= (neighbood_size[q] - 1) / 2;
        }
        //initlalize neighbood_offset in use of neighbood_corrds_array
        int *neighbood_offset = (int *)malloc(sizeof(int) * neighbood_num);
        int *elem;
        for (int i = 0; i < neighbood_num; i++){
                elem = neighbood_corrds_array + i * neighbood_dim;
                neighbood_offset[i] = sub_to_ind(elem, image_cumprod, 2);
        }

        //4.calculate entroy for pixel
        uchar *array=(uchar *)pad_src->data;
        //here,use entroy_table to avoid frequency log function which cost losts of time
        float entroy_table[82];
        const float log2 = log(2.0f);
        entroy_table[0] = 0.0;
        float frequency = 0;
        for (int i = 1; i < 82; i++){
                frequency = (float)i / 81;
                entroy_table[i] = frequency * (log(frequency) / log2);
        }
        int neighbood_index;
        //        int max_index=pad_src->cols*pad_src->rows;
        float e;
        int current_index = 0;
        int current_index_in_origin = 0;
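        //iterate over the ROI in padded coordinates (roi was shifted by the 4-pixel border above)
        for (int y = roi.y; y < roi.y + roi.height; y++){
                current_index = y * pad_src->cols + roi.x;
                current_index_in_origin = (y - 4) * gray.cols + (roi.x - 4);
                for (int x = roi.x; x < roi.x + roi.width; x++, current_index++, current_index_in_origin++) {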
                        for (int j=0;j<neighbood_num;j++) {
                                neighbood_index = current_index+neighbood_offset[j];
                                hist_count[array[neighbood_index]]++;
                        }
                        //get entropy
                        e = 0;
                        for (int k = 0; k < 256; k++){
                                if (hist_count[k] != 0){
                                        //                                        int frequency=hist_count[k];
                                        e -= entroy_table[hist_count[k]];
                                        hist_count[k] = 0;
                                }
                        }
                        ((float *)entropy.data)[current_index_in_origin] = e;
                }
        }
        free(neighbood_offset);
        free(image_cumprod);
        free(cumprod);
        free(neighbood_corrds_array);

        func_end = clock();
        double func_time = (double)(func_end - func_begin) / CLOCKS_PER_SEC;
        std::cout << "func time" << func_time << std::endl;
}

Here are the missing helper functions as well.

static int32_t sub_to_ind(int32_t *coords, int32_t *cumprod, int32_t num_dims)
{
        int index = 0;
        int k;

        assert(coords != NULL);
        assert(cumprod != NULL);
        assert(num_dims > 0);

        for (k = 0; k < num_dims; k++)
        {
                index += coords[k] * cumprod[k];
        }

        return index;
}

static void ind_to_sub(int p, int num_dims, const int size[],
                       int *cumprod, int *coords)
{
        int j;

        assert(num_dims > 0);
        assert(coords != NULL);
        assert(cumprod != NULL);

        for (j = num_dims-1; j >= 0; j--)
        {
                coords[j] = p / cumprod[j];
                p = p % cumprod[j];
        }
}

And finally, here is an example of how to use it to see what the result looks like.

            cv::Rect roi(0, 0, gray.cols, gray.rows);
            cv::Mat dst(gray.rows, gray.cols, CV_32F);
            getLocalEntropyImage(gray, roi, dst);
            cv::normalize(dst, dst, 0, 255, cv::NORM_MINMAX);
            cv::Mat entropy;
            dst.convertTo(entropy, CV_8U);

Here, entropy is the image to display.

Example on a rather nasty car picture with a lot of natural noise.

Upvotes: 1

zhen lee

Reputation: 333

I checked the source code for entropyfilt, which is in "entropyfilt.m".

It first pads the source matrix and then calls entropyfiltmex.

We know entropyfiltmex is written in C++ (see MEX file, http://en.wikipedia.org/wiki/MEX_file), and the C++ source files can be found in the Matlab directory.

I have checked entropyfiltmex.cpp; the main logic is:

void local_entropy(_T *inBuf, double *outBuf){
  ......
  for (p = 0; p < numElements; p++)
        {           
            nhSetWalkerLocation(walker,p);

            // Get Idx into image
            while (nhGetNextInboundsNeighbor(walker, &n, NULL))
            {
                histCountPtr[(int) inBuf[n]]++;
            }

            // Calculate Entropy based on normalized histogram counts
            // (sum should equal one).
            for (k = 0; k < numBins;k++)
            {
                if (histCountPtr[k] != 0)
                {
                    temp = (double) histCountPtr[k] / numNeighbors;

                    // log base 2 (temp) = log(temp) / log(2)
                    entropy = temp * (log(temp)/log((double) 2));
                    outBuf[p] -= entropy;

                    //re-initialize for next neighborhood
                    histCountPtr[k] = 0;
                }
            }
        }
......
}

Here, nhSetWalkerLocation and nhGetNextInboundsNeighbor are Matlab neighbor operations.
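
The port in this answer replaces the walker with a precomputed table of linear index offsets. As a minimal stand-alone sketch of that precomputation (the function name neighborhoodOffsets is hypothetical, and a padded row-major image is assumed so that every offset stays in bounds):

```cpp
#include <cassert>
#include <vector>

// Linear offsets of a kw x kh neighbourhood relative to its centre pixel,
// for a row-major image whose rows are `stride` elements apart.
std::vector<int> neighborhoodOffsets(int kw, int kh, int stride) {
    assert(kw > 0 && kh > 0 && kw % 2 == 1 && kh % 2 == 1);
    std::vector<int> offsets;
    offsets.reserve(kw * kh);
    for (int dy = -(kh / 2); dy <= kh / 2; ++dy)
        for (int dx = -(kw / 2); dx <= kw / 2; ++dx)
            offsets.push_back(dy * stride + dx);
    return offsets;
}
```

Adding each offset to the linear index of the current pixel then visits the whole neighborhood without any per-pixel coordinate math.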

Based on the Matlab source code, and with great thanks to @B..., I implemented a new version which improves on these aspects:

  1. Pads the image first.
  2. Avoids invoking OpenCV's cvCalcHist(); it uses a plain hist[256] array to build the histogram.
  3. Reuses the Matlab neighborhood operations to make the index math fast.
  4. Uses a lookup table (entroy_table in the code) to cache the log() results, which really makes a big difference (40s down to 3s).

Here's the code:

    void ImageProcess::GetLocalEntroyImage( const IplImage*gray_src,CvRect roi_rect,IplImage*local_entroy_image,IplImage*mask){
        using namespace cv;
        clock_t func_begin,func_end;
        func_begin=clock();
        //1.define nerghbood model,here it's 9*9
        int neighbood_dim=2;
        int neighbood_size[]={9,9};

        //2.Pad gray_src
        Mat gray_src_mat(gray_src);
        Mat pad_mat;
        int left=(neighbood_size[0]-1)/2;
        int right=left;
        int top=(neighbood_size[1]-1)/2;
        int bottom=top;
        copyMakeBorder(gray_src_mat,pad_mat,top,bottom,left,right,BORDER_REPLICATE,0);
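        //keep the IplImage header in a named variable; taking the address of a temporary is not standard C++
        IplImage pad_ipl=pad_mat;
        IplImage*pad_src=&pad_ipl;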
        roi_rect=cvRect(roi_rect.x+top,roi_rect.y+left,roi_rect.width,roi_rect.height);

        //3.initial neighbood object,reference to Matlab build-in neighbood object system
        int element_num=roi_rect.width*roi_rect.height;
        //here,implement a histogram by ourself ,each bin calcalate gray value frequence
        int hist_count[256]={0};
        int neighbood_num=1;
        for(int i=0;i<neighbood_dim;i++)
            neighbood_num*=neighbood_size[i];
        //neighbood_corrds_array is a neighbors_num-by-neighbood_dim array containing relative offsets
        int*neighbood_corrds_array=(int*)malloc(sizeof(int)*neighbood_num*neighbood_dim);
        //Contains the cumulative product of the image_size array;used in the sub_to_ind and ind_to_sub calculations.
        int *cumprod;
        cumprod = (int *)malloc(neighbood_dim * sizeof(*cumprod));
        cumprod[0]=1;
        for(int i=1;i<neighbood_dim;i++){
            cumprod[i]=cumprod[i-1]*neighbood_size[i-1];
        }
        int*image_cumprod=(int*)malloc(2*sizeof(*image_cumprod));
        image_cumprod[0]=1;
        image_cumprod[1]=pad_src->width;
        //initialize neighbood_corrds_array
        int p;
        int q;
        int*coords;
        for(p=0;p<neighbood_num;p++){
            coords=neighbood_corrds_array+p*neighbood_dim;
            ind_to_sub(p, neighbood_dim, neighbood_size, cumprod, coords);
            for (q = 0; q < neighbood_dim; q++)
            {
                coords[q] -= (neighbood_size[q] - 1) / 2;
            }
        }
        //initlalize neighbood_offset in use of neighbood_corrds_array
        int*neighbood_offset=(int*)malloc(sizeof(int)*neighbood_num);
        int*elem;
        for(int i=0;i<neighbood_num;i++){
            elem=neighbood_corrds_array+i*neighbood_dim;
            neighbood_offset[i]=sub_to_ind(elem, image_cumprod,2);
        }

        //4.calculate entroy for pixel
        uchar*array=(uchar*)pad_src->imageData;
        //here,use entroy_table to avoid frequency log function which cost losts of time
        float entroy_table[82];
        const float log2=log(2.0f);
        entroy_table[0]=0.0;
        float frequency=0;
        for(int i=1;i<82;i++){
            frequency=(float)i/81;
            entroy_table[i]=frequency*(log(frequency)/log2);
        }
        int neighbood_index;
        int max_index=pad_src->width*pad_src->height;
        float temp;
        float entropy;
        int current_index=0;
        int current_index_in_origin=0;
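        //iterate over the ROI in padded coordinates (roi_rect was shifted by the 4-pixel border above)
        for(int y=roi_rect.y;y<roi_rect.y+roi_rect.height;y++){
            current_index=y*pad_src->width+roi_rect.x;
            current_index_in_origin=(y-4)*gray_src->width+(roi_rect.x-4);
            for(int x=roi_rect.x;x<roi_rect.x+roi_rect.width;x++,current_index++,current_index_in_origin++){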
                for(int j=0;j<neighbood_num;j++){
                    int offset=neighbood_offset[j];
                    neighbood_index=current_index+neighbood_offset[j];
                    hist_count[array[neighbood_index]]++;
                }
                //get entroy
                entropy=0;
                for(int k=0;k<256;k++){
                    if(hist_count[k]!=0){
                        int frequency=hist_count[k];
                        entropy -= entroy_table[hist_count[k]];
                        hist_count[k]=0;
                    }
                }
                ((float*)local_entroy_image->imageData)[current_index_in_origin]=entropy;
            }
        }
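        //release the helper arrays allocated with malloc above
        free(neighbood_offset);
        free(image_cumprod);
        free(cumprod);
        free(neighbood_corrds_array);
        func_end=clock();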
        double func_time=(double)(func_end-func_begin)/CLOCKS_PER_SEC;
        cout<<"func time"<<func_time<<endl;
    }

The new version is much faster now; it takes only about 3s on the same image.

Note:

  1. The neighborhood object in Matlab is really fancy. In fact, we could change this function's interface to allow different kernel sizes. I don't have time now, so this is just a quick reuse.
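
As a rough idea of what such an interface could look like, here is a minimal sketch on a plain row-major 8-bit buffer rather than an IplImage. The name localEntropy and the radius parameter r are assumptions for illustration; borders are handled by clamping coordinates, which gives the same values as the BORDER_REPLICATE padding used above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Local entropy with a (2r+1) x (2r+1) window on a row-major 8-bit image.
std::vector<float> localEntropy(const std::vector<std::uint8_t>& img,
                                int width, int height, int r) {
    assert(r >= 0 && static_cast<int>(img.size()) == width * height);
    const int n = (2 * r + 1) * (2 * r + 1);  // pixels per window

    // table[c] = (c/n) * log2(c/n); only n + 1 distinct counts are possible.
    std::vector<float> table(n + 1, 0.0f);
    for (int c = 1; c <= n; ++c) {
        const float p = static_cast<float>(c) / n;
        table[c] = p * std::log2(p);
    }

    std::vector<float> out(img.size());
    int hist[256] = {0};
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // Build the histogram of the window, clamping at the borders.
            for (int dy = -r; dy <= r; ++dy) {
                const int yy = std::min(std::max(y + dy, 0), height - 1);
                for (int dx = -r; dx <= r; ++dx) {
                    const int xx = std::min(std::max(x + dx, 0), width - 1);
                    ++hist[img[yy * width + xx]];
                }
            }
            // Sum the table entries and reset the bins for the next pixel.
            float e = 0.0f;
            for (int k = 0; k < 256; ++k) {
                if (hist[k] != 0) {
                    e -= table[hist[k]];
                    hist[k] = 0;
                }
            }
            out[y * width + x] = e;
        }
    }
    return out;
}
```

With r = 4 this corresponds to the 9*9 case; the table size follows the kernel size instead of being fixed at 82. The sketch favors clarity over speed, since it clamps for every neighbor instead of padding once.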

Reference:
[1] ftp://196.203.130.15/pub/logiciels/matlab2007/toolbox/images/images/private/entropyfiltmex.h
[2] ftp://196.203.130.15/pub/logiciels/matlab2007/toolbox/images/images/private/neighborhood.cpp

Upvotes: 4

Bull

Reputation: 11941

The big slowdown in your code is this: log(gray_frequency)/log(2.0f).

You should not call cvNormalizeHist(). You know the bins are going to sum to 81, so the normalization can be folded into a constant correction applied once per pixel: with raw counts c, the entropy is log2(81) - (1/81) * sum(c * log2(c)), and log2(81) and 1/81 are constants that are not calculated every time in your loop. If you don't normalize the histogram, its entries will be integers and you can use them to index a lookup table.

Since you have a 9x9 kernel, the maximum value of gray_frequency is 81 (as long as you don't normalize the histogram), and you can easily replace those two calls to log() with a single lookup in a precalculated table. This will make a huge difference. You can initialize a table like this:

    double entropy_table[82]; // 0 .. 81
    const double log2 = log(2.0);
    entropy_table[0] = 0.0;
    for(int i = 1; i < 82; i ++)
    {
        entropy_table[i] = i * log(double(i)) / log2;
    }

Then later it is just:

entroy -= entropy_table[gray_frequency];
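
To spell out the final correction: with raw counts c that sum to n, the entropy is -sum((c/n) * log2(c/n)) = log2(n) - (1/n) * sum(c * log2(c)). A minimal sketch of this, assuming a table of c * log2(c) as above (the names makeCountLogTable and entropyFromCounts are hypothetical):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// table[c] = c * log2(c) for c = 0..n, with table[0] = 0.
std::vector<double> makeCountLogTable(int n) {
    assert(n > 0);
    std::vector<double> table(n + 1, 0.0);
    for (int c = 1; c <= n; ++c)
        table[c] = c * std::log2(static_cast<double>(c));
    return table;
}

// Entropy in bits from 256 raw (unnormalised) histogram counts that sum to n.
double entropyFromCounts(const std::vector<int>& counts,
                         const std::vector<double>& table, int n) {
    assert(counts.size() == 256 && n > 0);
    double sum = 0.0;  // sum over bins of c * log2(c)
    for (int k = 0; k < 256; ++k)
        sum += table[counts[k]];
    // H = log2(n) - (1/n) * sum(c * log2(c))
    return std::log2(static_cast<double>(n)) - sum / n;
}
```

For a 9x9 window, n is 81, so log2(81) and the division by 81 are applied once per pixel after the table lookups.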

Also, you may find that implementing your own histogram code is a win. E.g. if you have a small kernel, you can keep track of which bins you are going to use and only clear those. But since you are using up to 81 of the 256 bins, this mightn't be worth it.
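
A minimal sketch of the bin-tracking idea, assuming the window's pixels have already been gathered into a small array (the function name entropyTrackingBins is hypothetical):

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Entropy in bits of `n` samples, touching only the histogram bins that are used.
// `hist` must be all zeros on entry and is all zeros again on return.
double entropyTrackingBins(const std::uint8_t* samples, int n, int* hist) {
    assert(n > 0);
    std::uint8_t used[256];  // gray levels seen in this window
    int usedCount = 0;
    for (int i = 0; i < n; ++i) {
        const std::uint8_t v = samples[i];
        if (hist[v]++ == 0) used[usedCount++] = v;  // first hit on this bin
    }
    double e = 0.0;
    for (int i = 0; i < usedCount; ++i) {
        const double p = static_cast<double>(hist[used[i]]) / n;
        e -= p * std::log2(p);
        hist[used[i]] = 0;  // clear only the bins that were touched
    }
    return e;
}
```

The second loop runs over at most as many bins as there are distinct gray levels in the window, instead of all 256.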

Another place you can get a speed-up is in border pixel handling. You are checking this for every pixel, but if you had separate loops for the border pixels and the inner pixels, a lot of calls to max and min could be avoided.

If that still isn't fast enough, you may consider using parallel_for on stripes. For a good example of how to do that, have a look at the source code for OpenCV's morphological filter.

Upvotes: 5
