Reputation: 1062
Today I ran into the problem that accessing the vector elements slowed down with the size of the vector. As it is not my code, I cannot post it, so please bear with me. I will try to describe it as detailed as possible.
The functionality of the code is as follows:
1. a Dataset class, takes a .txt file, which contains file names. These point to standard png images which need to be loaded. This is done by an Image<T>
class. The images are loaded as Image<unsigned char>
and pushed back into a std::Vector
.
2. After the loading of the data has been done. I can access the vector in my dataset in order to work with it. So it looks something like this:
Dataset d;
d.init("filenames_list.txt"); //Loads the images
for(int i=0; i< d.getDatavector().size(); i++){
Image<unsigned char> current = d.getDatavector()[i];
//Do work on current image here.
}
Here getDatavector() will return a std::Vector<Image<unsigned char> >
. The images hold three ints, for width, height and the number of channels and furthermore a Boost shared pointer that points to the interleaved data.
For small testruns, I have a list of files which contains about 150 images. Running the program with this works fine and speed measurements tell me that
Image<unsigned char> current = d.getDatavector()[i];
takes about 10ms to be completed. If however I want to work on my full dataset of 1500 images, the above line takes about 500ms to complete. I've tried to do many different things to fix it, but I am somewhat limited by the general structure of the code and by the memory. Because if I do the following:
const std::Vector<Image<unsigned char> > data = d.getDatavector();
before the loop, it runs very fast, but I soon run out of memory.
I know my problem description is somewhat vague, and I am not hoping for the exact solution, but I am hoping for some tips on where to look. I searched for similar problems, but people only seem concerned with the general speed of vectors versus arrays. My problem is though, that the speed degrades with the length of the vector! If someone has seen this kind of problem, any suggestions are very much welcome!
So far I have tried accessing the content using the std::vector::iterator or using (d.getDatavector().data()) as a pointer. Nothing seems to improve the speed of it.
Upvotes: 1
Views: 375
Reputation: 710
As already mentioned, the root of the problem seems to be in that getDatavector()
returns a full copy of the vector, and the solution would be to return a reference (or a pointer instead).
You also have a similar problem with Image<unsigned char> current = ...
where a copy of an image is being made as well.
One solution to these problems would be to use a direct access to the image as:
Image<unsigned char>* getImage(int idx)
{
if (idx < _myVector.size())
{
return &_myVector[idx].Image;
}
return NULL;
}
Edit: version returning reference
Image<unsigned char>& getImage(int idx)
{
if (idx < _myVector.size())
{
return _myVector[idx].Image;
}
// throw exception here;
}
Obviously this will not work if you must have a copy of each image.
Upvotes: 1
Reputation: 109289
What does the signature of getDataVector()
look like? Is it
std::vector<Image<unsigned char>> getDataVector();
If so, the function is returning the vector
by value, and every time you write d.getDatavector()[i]
a copy of the vector
is made, the i
element is copied out of the vector
and then the vector
itself is destroyed.
If you can modify the Dataset
class change the function to
std::vector<Image<unsigned char>> const& getDataVector();
Now copies won't be made every time the function is called.
If you're unable to modify the class, make a single copy before entering the loop, and then use the local variable within the loop.
It is impossible for the problem to be the indexing since std::vector
's underlying data array is required to be contiguous and so accessing the ith element is as simple as adding i
to the pointer marking the start address of the data array and dereferencing the result.
Upvotes: 5
Reputation: 11031
The reason is that you return vector by value in cycle.
Make your getDatavector()
to return a std::Vector<Image<unsigned short> >&
or std::Vector<Image<unsigned short> > const&
not std::Vector<Image<unsigned short> >
Upvotes: 5
Reputation: 16700
Are you using C++11, or an earlier C++?
If an earlier C++11, and getDataVector returns a vector, then it may have to be copied. If you're using C++11, it can be moved into the return variable w/o copying
that could be the source of your slowdown.
Accessing an element of a vector is a constant-time operation.
Upvotes: 2