Reputation: 3379
I've taken a look at a few implementations of merge sort in java, and it seems like the splitting part is commonly done by creating 2 new arrays, left and right and copying the left and right parts to those arrays respectively.
My question is: This still gives us nlogn time right? Because now each split would have n+n time (n time for splitting the array, n time for merging the array) and there are logn splits, so we have 2nlogn which is still nlogn. Is this correct?
Upvotes: 1
Views: 307
Reputation: 48864
Which implementations were you looking at? The primary sorting algorithms used by the JDK are DualPivotQuicksort
(for primitive arrays), TimSort
and ComparableTimSort
(for Object
arrays/collections), and the incomprehensible wizardy in ArraysParallelSortHelpers
for concurrent sorts.
All of these classes provide highly-tuned implementations of QuickSort or MergeSort (or in the parallel case "CilkSort", which sounds like a concurrency-friendly MergeSort). These algorithms are all generally O(n log(n)) - it's impossible to do better than O(n log(n)) without additional constraints on your inputs.
As to whether these algorithms generally spend O(n) time copying values back-and-forth between temporary storage, it's very much a case-by-case question. Where possible these implementations will certainly try to avoid using O(n) extra memory and time to do linear copying, but oftentimes such work is not actually the bottleneck. For example CilkSort uses a secondary "workspace array" to copy values back and forth from in successive stages, which makes it easier to enforce invariants since the "prior" stage is always in a reliable state. Even so the implementations try to avoid unnecessary array-copies whenever possible.
At the public API level we'll notice that Collections.sort()
calls .toArray()
, sorts the array, and then copies the values back into the original List
. This is similarly just a practical optimization - it's faster to sort an array and do two linear copies than to deal with all the method invocations (and potentially inefficient List
implementations, such as LinkedList
) of modifying the List
in-place.
Upvotes: 1