Aly

Reputation: 16285

C++: OpenCV performance issues scanning images

I have about 20 colour-coded images. I want to scan each image and match every pixel to the label associated with its colour. I have written the code below, but it takes about 30 minutes to run this seemingly simple task. The images have a resolution of 960 x 720.

My Code:

void go_through_pixels(path &image_dir, string& ground_truth_suffix, string image_format, unordered_map<RGB, string> colors_for_labels){

    if(!exists(image_dir)){
        cerr << image_dir << " does not exist, prematurely returning" << endl;
        exit(-1);
    }

    unordered_map<string, set<path> > label_to_files_map;

    //initialise label_to_files_map
    for(unordered_map<RGB, string>::iterator it = colors_for_labels.begin(); it != colors_for_labels.end(); it++){
        label_to_files_map[it->second] = set<path>();
    }

    directory_iterator end_itr; //default construction provides an end reference

    for(directory_iterator itr(image_dir); itr != end_itr; itr++){

        path file = itr->path();
        string filename = file.filename().string();
        RGB rgb(0,0,0); //default rgb struct, values will be changed in the loop

        if(extension(file) == image_format && filename.find(ground_truth_suffix) != string::npos){
            //ground truth file
            Mat img = imread(file.string(), CV_LOAD_IMAGE_COLOR);

            for(int y = 0; y < img.rows; y++){
                for(int x = 0; x < img.cols; x++){
                    //gives data as bgr instead of rgb
                    Point3_<uchar>* pixel = img.ptr<Point3_<uchar> >(y,x);
                    rgb.red = (int)pixel->z;
                    rgb.green = (int)pixel->y;
                    rgb.blue =(int)pixel->x;
                    string label = colors_for_labels[rgb];
                    label_to_files_map[label].insert(file);
                    cout << label << endl;
                }
            }
        }
    }
}

I will be doing more with this data afterwards, but have simplified my code down to this just to try and find the performance issue.

I have found that label_to_files_map[label].insert(file) causes most of the delay: with it removed, scanning the images takes about 3 minutes. I still think this is too long, but I may be wrong.

Also, since the set insert takes a long time (it has to check for duplicates before insertion), can anyone suggest a better data structure to use here?

Essentially, a picture can have, let's say, 100 pixels corresponding to a building, 100 corresponding to a car, and so on. I just want to record in label_to_files_map that this file (the current image being scanned) has a building in it (denoted here by a particular RGB value).

Upvotes: 2

Views: 1950

Answers (3)

remi

Reputation: 3988

In addition to the code optimisations in the other answers, consider working on the image histogram. Many pixels in your image will have exactly the same colour, so compute the histogram first, then do your processing once for every distinct colour in the image. That should speed things up greatly.
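The idea of processing each distinct colour once can be sketched roughly as follows. This is plain C++ rather than an OpenCV histogram call, and it assumes the pixels are available as tightly packed 8-bit B,G,R triples (as in a continuous CV_8UC3 Mat). The names pack_bgr, labels_in_image and label_for_color are hypothetical and do not come from the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical helper: pack one B,G,R pixel into a single integer key.
inline std::uint32_t pack_bgr(std::uint8_t b, std::uint8_t g, std::uint8_t r) {
    return (static_cast<std::uint32_t>(r) << 16) |
           (static_cast<std::uint32_t>(g) << 8) |
           static_cast<std::uint32_t>(b);
}

// Assumption: `bgr` holds tightly packed 8-bit B,G,R triples.
// Returns the labels whose colour occurs at least once in the image.
std::set<std::string> labels_in_image(
        const std::vector<std::uint8_t>& bgr,
        const std::unordered_map<std::uint32_t, std::string>& label_for_color) {
    // Pass 1: collect the distinct colours (cheap integer inserts only).
    std::unordered_set<std::uint32_t> distinct;
    for (std::size_t i = 0; i + 2 < bgr.size(); i += 3) {
        distinct.insert(pack_bgr(bgr[i], bgr[i + 1], bgr[i + 2]));
    }

    // Pass 2: one label lookup per distinct colour, not per pixel.
    std::set<std::string> labels;
    for (std::uint32_t key : distinct) {
        auto it = label_for_color.find(key);
        if (it != label_for_color.end()) {
            labels.insert(it->second);
        }
    }
    return labels;
}
```

With a colour-coded ground truth image the number of distinct colours is usually tiny compared with the 960 x 720 pixels, so the string work happens only a handful of times per file.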

Upvotes: 0

Sam

Reputation: 20056

You are using the wrong data types and the wrong functions. Here is a suggestion on how to improve. I expect it to run in a few seconds.

Step 1 of your work is a lookup table from a 3-channel image to a single-channel image. You can use cv::LUT. However, you need a trick to make it fast.

Convert it to 4 bytes per pixel:

cv::Mat mat4bytes;
// add 8 bits to each pixel. the fill value is 255
cv::cvtColor(img, mat4bytes, CV_RGB2RGBA); 
// this is a nice hack to interpret 
// the RGBA pixels of the input image as integers 
// (OpenCV 2.x has no unsigned 32-bit depth, so CV_32SC1 is used here)
cv::Mat pseudoInteger(img.size(), CV_32SC1, mat4bytes.data);

Now, you can apply LUT.

cv::Mat colorCoded;
// you have to convert your colors_for_labels lookup table
// like this: 
lookupTable[i] = 
      ((unsigned int)colors_for_labels.first.x << 24 ) + 
      ((unsigned int)colors_for_labels.first.y << 16 ) +        
      ((unsigned int)colors_for_labels.first.z << 8  ) +        
      255; 
// make sure it is correct!!!
// and lookupTable data MUST be unsigned integer

cv::LUT(pseudoInteger, colorCoded, lookupTable);

EDIT At this point you have in lookupTable the values you calculate in label

The final step of your calculation is actually a histogram, so why not use the histogram functions from OpenCV? Check the docs for calcHist() and see how it best fits your algorithm. Note that calcHist() can compute the histogram of several images at once, so you may want to keep the colorCoded images in a vector, then extract the histogram of all of them in one call.
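The two steps (colour to label index, then a histogram over label indices) can be illustrated with a minimal sketch in plain C++. It is a stand-in that does not call cv::LUT or calcHist(), and it assumes tightly packed 8-bit B,G,R input. The names pack_bgra, label_histogram and label_id_for_color are hypothetical, and the byte layout of the packed key is just one possible choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical packing: three colour bytes plus a fill byte of 255,
// mirroring the 4-bytes-per-pixel trick described above.
inline std::uint32_t pack_bgra(std::uint8_t b, std::uint8_t g, std::uint8_t r) {
    return (static_cast<std::uint32_t>(b) << 24) |
           (static_cast<std::uint32_t>(g) << 16) |
           (static_cast<std::uint32_t>(r) << 8) | 255u;
}

// Counts how many pixels carry each label index.
// Assumption: `bgr` holds tightly packed 8-bit B,G,R triples, and
// label ids in `label_id_for_color` lie in [0, label_count).
std::vector<std::size_t> label_histogram(
        const std::vector<std::uint8_t>& bgr,
        const std::unordered_map<std::uint32_t, std::size_t>& label_id_for_color,
        std::size_t label_count) {
    std::vector<std::size_t> counts(label_count, 0);
    for (std::size_t i = 0; i + 2 < bgr.size(); i += 3) {
        // Step 1: lookup table from packed colour to label index.
        auto it = label_id_for_color.find(pack_bgra(bgr[i], bgr[i + 1], bgr[i + 2]));
        // Step 2: histogram over label indices; unknown colours are ignored.
        if (it != label_id_for_color.end() && it->second < label_count) {
            ++counts[it->second];
        }
    }
    return counts;
}
```

A label is present in a file exactly when its count is non-zero, so label_to_files_map needs at most one insert per label per file.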

Upvotes: 1

Louis Ricci

Reputation: 21106

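The performance issue is that you're doing too much work per pixel.

For each file (right before your nested for-loops start) make a copy of colors_for_labels, called copy_of_colors_for_labels below.

        // per-file copy: a label is cleared after its first hit in this image
        unordered_map<RGB, string> copy_of_colors_for_labels = colors_for_labels;
        Point3_<uchar> oldPixel;
        bool hasOldPixel = false; // so a first pixel of (0,0,0) is not skipped
        for(int y = 0; y < img.rows; y++){
            for(int x = 0; x < img.cols; x++){
                //gives data as bgr instead of rgb
                Point3_<uchar>* pixel = img.ptr<Point3_<uchar> >(y,x);
                if(hasOldPixel && *pixel == oldPixel) 
                    continue; // skip extra work
                oldPixel = *pixel;
                hasOldPixel = true;
                rgb.red = (int)pixel->z;
                rgb.green = (int)pixel->y;
                rgb.blue =(int)pixel->x;
                string label = copy_of_colors_for_labels[rgb];
                if(!label.empty()) {
                    label_to_files_map[label].insert(file);
                    copy_of_colors_for_labels[rgb] = ""; // already recorded for this file
                    cout << label << endl;
                }
            }
        }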

There might be syntax errors (because I re-wrote it in the browser and haven't coded in C++ in a number of years) but the above should cull away a lot of extra processing work.
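For reference, here is a self-contained sketch of the same culling idea without the OpenCV dependency, so it can be compiled and tried in isolation. The question does not show the RGB struct or its hash, so RGB, RGBHash, ColorLabels and record_labels below are hypothetical, and the pixels are assumed to be tightly packed 8-bit B,G,R triples.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the RGB struct used in the question.
struct RGB {
    int red, green, blue;
    bool operator==(const RGB& o) const {
        return red == o.red && green == o.green && blue == o.blue;
    }
};

// Hypothetical hash so RGB can be an unordered_map key.
struct RGBHash {
    std::size_t operator()(const RGB& c) const {
        return std::hash<int>()((c.red << 16) | (c.green << 8) | c.blue);
    }
};

typedef std::unordered_map<RGB, std::string, RGBHash> ColorLabels;

// `remaining` is taken by value on purpose: it is the per-file copy.
void record_labels(const std::vector<std::uint8_t>& bgr,
                   const std::string& file,
                   ColorLabels remaining,
                   std::unordered_map<std::string, std::set<std::string> >& label_to_files) {
    RGB previous = {-1, -1, -1};  // never equal to a real pixel
    // Stop early once every known colour has been seen in this file.
    for (std::size_t i = 0; i + 2 < bgr.size() && !remaining.empty(); i += 3) {
        RGB current = {bgr[i + 2], bgr[i + 1], bgr[i]};  // data is B,G,R
        if (current == previous) continue;  // same as the last pixel: skip
        previous = current;
        ColorLabels::iterator it = remaining.find(current);
        if (it == remaining.end()) continue;  // unknown or already recorded
        label_to_files[it->second].insert(file);  // at most once per label
        remaining.erase(it);
    }
}
```

Erasing the entry instead of blanking its label lets the loop exit as soon as the copy is empty, which is a small addition to the approach described above.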

Upvotes: 3
