Pey

Reputation: 35

What data structure is preferred instead of manipulating multiple vectors?

I have implemented a class that performs computations on images. The processing is done on a subset of the given images at a time (let's say 100 out of 1000), and each image takes a different number of iterations to finish. The processing uses GPUs, so it is not possible to process all the images at once. When the processing of an image is finished, that image is removed and another one is added. I am therefore using three different vectors, image_outcome, image_index and image_operation, to keep information about the images:

  1. image_outcome is a std::vector<float>; each element is a value used as the criterion to decide when the image is finished.
  2. image_index is a std::vector<int> that holds the index of the image in the original dataset.
  3. image_operation is a std::vector<MyEnumValue> that holds the operation used to update image_outcome. It is an enum type whose value is one of many possible operations.

There are also two functions, one to remove the finished images and one to add as many images as were removed (if there are still enough in the input).

  1. The remove_images() function takes all three vectors and the image matrix, and removes the finished elements using std::vector::erase().
  2. The add_images() function takes the same three vectors and the image matrix, and adds new images and their relevant information to the vectors.
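
For illustration, the removal step over three parallel vectors could look roughly like the sketch below. This is not the actual implementation: the MyEnumValue values, the threshold parameter and the "outcome >= threshold means finished" criterion are assumptions made only for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the MyEnumValue type used in the question.
enum class MyEnumValue { OpA, OpB };

// Erase every finished entry from all three vectors in lockstep and
// return how many entries were removed.
// Assumption: an image counts as finished when its outcome >= threshold.
std::size_t remove_images(std::vector<float>& image_outcome,
                          std::vector<int>& image_index,
                          std::vector<MyEnumValue>& image_operation,
                          float threshold) {
    std::size_t removed = 0;
    // Walk backwards so that erasing does not shift entries still to be visited.
    for (std::size_t i = image_outcome.size(); i-- > 0;) {
        if (image_outcome[i] >= threshold) {
            const auto off = static_cast<std::ptrdiff_t>(i);
            // The same offset has to be erased from every vector;
            // forgetting one of them silently misaligns the data.
            image_outcome.erase(image_outcome.begin() + off);
            image_index.erase(image_index.begin() + off);
            image_operation.erase(image_operation.begin() + off);
            ++removed;
        }
    }
    return removed;
}
```

The three erase() calls have to stay in sync by hand, which is the part I would like to get rid of.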

Because I am calling erase() on each vector with the same index (and adding elements in a similarly parallel way), I was thinking of one of the following:

  1. Use a private struct that has three vectors (nested struct).
  2. Use a private class that is implemented using three vectors (nested class).
  3. Use a data structure other than std::vector.
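
As a rough idea of what options 1 and 2 could look like, here is a minimal sketch of a struct that bundles the three vectors and keeps them aligned. The names ImageState, add() and erase_at() are hypothetical, as are the MyEnumValue values; in the real code the struct would be nested privately inside ComputationClass.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the MyEnumValue type used in the question.
enum class MyEnumValue { OpA, OpB };

// Bundles the three parallel vectors so they can only be changed together.
struct ImageState {
    std::vector<float> outcome;
    std::vector<int> index;
    std::vector<MyEnumValue> operation;

    std::size_t size() const { return outcome.size(); }

    // Append one entry to all three vectors at once.
    void add(float out, int idx, MyEnumValue op) {
        outcome.push_back(out);
        index.push_back(idx);
        operation.push_back(op);
    }

    // Erase position i from all three vectors at once.
    void erase_at(std::size_t i) {
        const auto off = static_cast<std::ptrdiff_t>(i);
        outcome.erase(outcome.begin() + off);
        index.erase(index.begin() + off);
        operation.erase(operation.begin() + off);
    }
};
```

This hides the bookkeeping behind two functions, but the data for one image is still spread over three containers.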

A high-level example of the code can be found below:

class ComputationClass {
  public:
    // the constructor initializes the member variables
    ComputationClass();
    void computation_algorithm(std::vector<cv::Mat> images);

  private:
    // member variables which define the algorithm's parameters
    // add_images() and remove_images() take more than these
    // arguments, but I only show the relevant ones here
    void add_images(std::vector<float>&, std::vector<int>&, std::vector<MyEnumValue>&);
    void remove_images(std::vector<float>&, std::vector<int>&, std::vector<MyEnumValue>&);
};

void ComputationClass::computation_algorithm(std::vector<cv::Mat> images) {
  std::vector<float> image_outcome;
  std::vector<int> image_index;
  std::vector<MyEnumValue> image_operation;

  // fill the first batch
  add_images(image_outcome, image_index, image_operation);

  while (there_are_still_images_to_process) {  // pseudocode condition
    // make computations by updating the image_outcome vector
    // check which images finished computing
    remove_images(image_outcome, image_index, image_operation);
    add_images(image_outcome, image_index, image_operation);
  }
}

Upvotes: 2

Views: 89

Answers (1)

Sgene9

Reputation: 186

I think, instead of a struct with 3 vectors, a single vector of user-defined objects would work better.

struct MyImage {
    Image OImage; // the actual image
    float fOutcome;
    int dIndex;
    MyEnumValue eOperation;
    bool getIsDone() const {
        return fOutcome > 0; // placeholder condition
    }
};

std::vector<MyImage> images; // declared after MyImage so the type is known

You can add to the vector, or erase from it based on a condition, while iterating:

// erase() invalidates `it`, so continue from the iterator it returns
if (it->getIsDone()) {
    it = images.erase(it);
} else {
    ++it;
}

In my opinion, maintaining 3 parallel vectors is error-prone and makes the code hard to modify.
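
To make the idea concrete, here is a minimal sketch of the remove and refill steps on a single vector. It is only an illustration under assumptions: the image member is left out so the example compiles without OpenCV, the "outcome > 0" criterion is a placeholder, and remove_finished(), refill(), next_index and the MyEnumValue values are hypothetical names. Removal uses the standard erase-remove idiom (std::remove_if followed by erase) rather than erasing one element at a time.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the MyEnumValue type used in the question.
enum class MyEnumValue { OpA, OpB };

// One record per image; the image itself is omitted in this sketch.
struct MyImage {
    float outcome;
    int index;
    MyEnumValue operation;
    // Placeholder criterion, assumed for illustration only.
    bool is_done() const { return outcome > 0.0f; }
};

// Remove all finished entries in one pass; returns how many were removed.
std::size_t remove_finished(std::vector<MyImage>& batch) {
    const auto first_done = std::remove_if(
        batch.begin(), batch.end(),
        [](const MyImage& img) { return img.is_done(); });
    const auto removed = static_cast<std::size_t>(batch.end() - first_done);
    batch.erase(first_done, batch.end());
    return removed;
}

// Top the batch back up to batch_size while the dataset still has images.
// next_index is the dataset index of the next unprocessed image.
void refill(std::vector<MyImage>& batch, std::size_t batch_size,
            int& next_index, int dataset_size) {
    while (batch.size() < batch_size && next_index < dataset_size) {
        // New entries start with a "not done" outcome under the placeholder rule.
        batch.push_back(MyImage{-1.0f, next_index, MyEnumValue::OpA});
        ++next_index;
    }
}
```

Each loop iteration would then update the outcomes, call remove_finished(), and call refill(). Since all fields of an image live in one element, a single erase keeps everything consistent.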

Upvotes: 2
