memC
memC

Reputation: 1025

performance: sorting 'm' vectors with N/m elems Vs sorting single vector with N elements

Operation A

I have N vectors, each containing certain number of unique 3D points. For Example : std::vector<double*> vec1; and like that

I am performing sort operation on each of the vector like:

 std::sort(vec1.begin(), vec1.end(), sortCriteria());
 std::sort(vec2.begin(), vec2.end(), sortCriteria());
 std::sort(vec3.begin(), vec3.end(), sortCriteria());

Operation B

Suppose I have a vector called "all_point_vector" which holds the 3D points from vec1, vec2, vec3 ...

i.e. 3D points in all_point_vector = points_in_vec1 +.... +points_in_vector3.

and I am performing the sort operation:

std::sort(all_point_vec.begin(), all_point_vec.end(), sortCriteria());

My question is , which of the above methods (Operation A or B) will be faster in general? sorting a single vector (all_point_vector) or sorting individual vectors. I am just interested in the speed of execution of these two operations.

Thanks

Upvotes: 3

Views: 291

Answers (3)

Mike Dunlavey
Mike Dunlavey

Reputation: 40659

Sorting a single array of m elements is a different problem from sorting the same number of elements divided into N arrays, because in the divided-case, you still don't have a total order of all the elements.

Assuming m = 1024, in the singleton case, m log m = 1024*10 = 10240.

If N=2 you have 512*9 + 512*9 = 9216, but you still have to do a merge step of 1024 comparisons, and 9216 + 1024 = 10240, so it's the same.

[Actually, at each level of the sorting, the number of comparisons is 1 less than the number of items to merge, but the overall result is still O(n log n)]

ADDED: If, as you commented, you don't need to do the merge, then the divided case is faster. (Of course, in that case, you could divide the m items into N=m arrays and not even bother sorting ;-)

Upvotes: 1

sbk
sbk

Reputation: 9508

What avakar said, in theory sorting a few short vectors should be faster than sorting the whole, in practice - you should measure. I'd just like to show some more math:

Let there be k sequences and the i-th sequence has ni number of elements. Let the total number of elements be N = n1 + ... + nk. Sorting the individual sequences has complexity O(n1logn1 + ... + nklognk). Sorting the big sequence has complexity O(N logN) = O((n1 + ... + nk)logN) = O(n1logN + ... + nklogN). Now we have to compare

A = n1logn1 + ... + nklognk

B = n1logN + ... + nklogN

Since N > ni for all i, logN > logni for all i. Therefore, B is strictly larger than A, i.e. sorting the entire sequence will take more time.

Upvotes: 3

avakar
avakar

Reputation: 32635

Sorting is an O(n log n) operation. Sorting N vectors with m/N elements will become strictly faster than sorting a single vector of m elements as you increase m.

Which one is faster for any fixed m can only be determined by profiling.

Upvotes: 4

Related Questions