Reputation: 103
I have a decision-making problem here. In my application, I need to merge two vectors. I can't use the STL algorithms, since the order of the data is important (it must not be sorted).
Both vectors contain data that can sometimes be identical, or up to 75% different in the worst case.
Currently I am torn between two approaches:
Approach 1:
a. Take an element from the smaller vector.
b. Compare it with the elements of the bigger one.
c. If the element matches, skip it (I don't want duplicates).
d. If the element is not found in the bigger one, calculate the proper position to insert it.
e. Resize the bigger one to insert the element (multiple resizes may happen).
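A minimal sketch of those steps, assuming int elements. The `approach1` name is illustrative, and since the question doesn't say how the "proper position" is chosen, the placeholder here just appends unmatched elements at the end:

```cpp
#include <algorithm>
#include <vector>

// Approach 1, step by step. The "proper position" rule is application-
// specific and not given in the question, so as a placeholder this sketch
// appends unmatched elements at the end.
std::vector<int> approach1(std::vector<int> big, const std::vector<int>& small)
{
    for (int x : small) {                                   // a.
        // b./c. skip values the bigger vector already holds
        if (std::find(big.begin(), big.end(), x) != big.end())
            continue;
        // d./e. insert() grows the vector itself (no manual resize needed),
        // but each call may reallocate, and mid-vector inserts shift elements
        big.insert(big.end(), x);
    }
    return big;
}
```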
Approach 2:
a. Iterate through both vectors to find the positions of matched elements.
b. Resize the bigger one in a single step by calculating the total size required.
c. Take the smaller vector and go through its unmatched elements.
d. Insert each element at the appropriate position.
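Approach 2's single resize corresponds to one reserve() call after counting. A sketch with int elements (the `approach2` name is illustrative, and appending stands in for the unspecified position rule):

```cpp
#include <algorithm>
#include <vector>

// Approach 2: count the unmatched elements first, grow the capacity once,
// then insert. Appending stands in for the unspecified position rule.
std::vector<int> approach2(std::vector<int> big, const std::vector<int>& small)
{
    auto is_new = [&](int x) {
        return std::find(big.begin(), big.end(), x) == big.end();
    };
    // a./b. one pass to compute the total size, then a single capacity grow
    std::size_t extra = std::count_if(small.begin(), small.end(), is_new);
    big.reserve(big.size() + extra);
    // c./d. second pass: insert only the unmatched elements
    for (int x : small)
        if (is_new(x))
            big.push_back(x);
    return big;
}
```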
Kindly help me choose the proper one. And if there is a better approach, a simpler technique (like an STL algorithm), or a container easier to use than vector, please post it here. Thank you.
Upvotes: 0
Views: 507
Reputation: 59174
You shouldn't be focusing on the resizes. In Approach 1, you should use vector.insert() so you don't actually need to resize the vector yourself. This may cause reallocations of the underlying buffer to happen automatically, but std::vector is carefully implemented so that the total cost of these operations is small.
The real problem with your algorithm is the insert, and maybe the search (which you didn't detail). When you insert into a vector anywhere except at the end, all the elements after the insertion point must be moved up in memory, and this can be quite expensive.
If you want this to be fast, you should build a new vector from your two input vectors, by appending one element at a time, with no inserting in the middle.
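That append-only idea might look like the sketch below, assuming int elements (the `merge_append` name and the unordered_set duplicate check are illustrative additions, not part of the question):

```cpp
#include <unordered_set>
#include <vector>

// Build the result by appending only: copy the bigger vector, then append
// each element of the smaller one that hasn't been seen yet. The set makes
// the duplicate check O(1) on average instead of a scan per element.
std::vector<int> merge_append(const std::vector<int>& big,
                              const std::vector<int>& small)
{
    std::vector<int> out;
    out.reserve(big.size() + small.size());        // at most one allocation
    std::unordered_set<int> seen(big.begin(), big.end());
    out.insert(out.end(), big.begin(), big.end()); // append, never mid-vector
    for (int x : small)
        if (seen.insert(x).second)                 // true only for new values
            out.push_back(x);
    return out;
}
```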
Upvotes: 2
Reputation: 13
Depending on your actual setup (e.g. if you're adding object pointers to a vector instead of copying values into one), you might get significantly faster results using a std::list. std::list allows constant-time insertion, which is going to be a huge performance advantage here.
Doing the insertion might be a little awkward, but it is completely doable by changing only a few pointers (inexpensive), versus insertion into a vector, which moves every element out of the way to put the new one down.
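To illustrate that cost difference (the `insert_before` helper is hypothetical): finding the spot in a std::list is still a linear scan, but the insert itself only rewires a couple of pointers:

```cpp
#include <algorithm>
#include <list>

// Insert `value` in front of the first occurrence of `before` (or at the
// end if `before` is absent). The find is O(n), but the insert is O(1):
// no existing element is moved in memory.
std::list<int> insert_before(std::list<int> things, int before, int value)
{
    auto pos = std::find(things.begin(), things.end(), before);
    things.insert(pos, value); // only pointer rewiring happens here
    return things;
}
```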
If they need to end up as vectors, you can then convert the list to a vector with something like (untested):
#include <list>
#include <vector>
#include <iterator> // std::make_move_iterator

std::list<thing> things;
//efficiently combine the vectors into a list
//since list is MUCH better for inserts
//but we still need it as a vector anyway
std::vector<thing> things_vec;
things_vec.reserve(things.size()); //allocate all the memory once up front
//now move the elements into the vector
things_vec.insert(
    things_vec.begin(),
    std::make_move_iterator(things.begin()),
    std::make_move_iterator(things.end())
);
//things_vec now has the same content and order as the list with very little overhead
Upvotes: 0
Reputation: 71
It doesn't look like you can do this in better time complexity than O(n log n), because removing duplicates from an unsorted vector using comparisons takes O(n log n) time. So using a set to remove the duplicates might be the best you can do. Here n is the total number of elements in both vectors.
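A sketch of that set-based filtering with int elements (`append_unique` is a hypothetical helper; the std::set serves only for membership tests, so the original order is preserved):

```cpp
#include <set>
#include <vector>

// Append each element of src to dst unless it was already seen. Each set
// insert/lookup is O(log n), giving O(n log n) over both vectors, and the
// output keeps the vectors' original order.
void append_unique(std::vector<int>& dst, const std::vector<int>& src,
                   std::set<int>& seen)
{
    for (int x : src)
        if (seen.insert(x).second) // .second is false for duplicates
            dst.push_back(x);
}
```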
Upvotes: 0