Reputation:

When to use merge sort and when to use quick sort?

Both articles have excellent visualizations.

Both have n*log(n) complexity.

So obviously the distribution of the data will effect the speed of the sort. My guess would be that since a comparison can just as quickly compare any two values, no matter their spread, the range of data values does not matter.

More importantly one should consider the lateral distribution (x direction ) with respect to ordering (magnitude removed).

A good test case to consider would be if the test data had some level of sorting...

Upvotes: 18

Answers (6)

pschang

Reputation: 2618

Mergesort is quicker when dealing with linked lists. This is because pointers can easily be changed when merging lists. It only requires one pass (O(n)) through the list.

Quicksort's in-place algorithm requires the movement (swapping) of data. While this can be very efficient for in-memory dataset, it can be much more expensive if your dataset doesn't fit in memory. The result would be lots of I/O.

These days, there is a lot of parallelization that occurs. Parallelizing Mergesort is simpler than Quicksort (in-place). If not using the in-place algorithm, then the space complexity for quicksort is O(n) which is the same are mergesort.

So, to generalize, quicksort is probably more effective for datasets that fit in memory. For stuff that's larger, it's better to use mergesort.

The other general time to use mergesort over quicksort is if the data is very similar (that is, not close to being uniform). Quicksort relies on using a pivot. In the case where all the values are the similar, quicksort hits a worst case of O(n^2). If the values of the data are very similar, then it's more likely that a poor pivot will be chosen leading to very unbalanced partitions leading to an O(n^2) runtime. The most straightforward example is if all the values in the list are the same.

Upvotes: 12

DarthVader

Reputation: 55022

While Java 6 and earlier versions use merge sort as the sorting algorithms, C# uses QuickSort as the sorting algorithm.

QuickSort performs better than merge sort even though they are both O(nlogn). QuickSort's has a smaller constant than merge sort.

Upvotes: 5

Steve Jessop

Reputation: 279225

Of the two, use merge sort when you need a stable sort. You can use a modified quicksort (such as introsort) when you don't, since it tends to be faster and it uses less memory.

Plain old Quicksort as described by Hoare is quite sensitive to performance-killing special cases that make it Theta(n^2), so you normally do need a modified version. That's where the data-distribution comes in, since merge sort doesn't have bad cases. Once you start modifying quicksort you can go on with all sorts of different tweaks, and introsort is one of the more effective ones. It detects on the fly whether it's in a killer case, and if so switches to heapsort.

In fact, Hoare's most basic Quicksort fails worst for already-sorted data, and so your "good test cases" with some level of sorting will kill it to some level. That fact is for curiosity only, though, since it only takes a very small tweak to avoid that, nothing like as complicated as going all the way to introsort. So it's simplistic to even bother analyzing the version that's killed by sorted data.

In practice, in C++ you'd generally use std::stable_sort and std::sort rather than worrying too much about the exact algorithm.

Upvotes: 5

James Kanze

Reputation: 153909

It typically depends on the data structures involved. Quick sort is typically the fastest, but it doesn't guarantee O(n*log(n)); there are degenerate cases where it becomes O(n^2). Heap sort is the usual alternative; it guarantees O(n*log(n)), regardless of the initial order, but it has a much higher constant factor. It's usually used when you need a hard upper limit to the time taken. Some more recent algorithms use quick sort, but attempt to recognize when it starts to degenerate, and switch to heap sort then. Merge sort is used when the data structure doesn't support random access, since it works with pure sequential access (forward iterators, rather than random access iterators). It's used in std::list<>::sort, for example. It's also widely used for external sorting, where random access can be very, very expensive compared to sequential access. (When sorting a file which doesn't fit into memory, you might break it into chunks which fit into memory, sort these using quicksort, writing each out to a file, then merge sort the generated files.)

Upvotes: 18

NPE

Reputation: 500247

There is a real-world sorting algorithm -- called Timsort -- that does exploit the idea that data encountered in the wild is often partially sorted.

The algorithm is derived from merge sort and insertion sort, and is used in CPython, Java 7 and Android.

See the Wikipedia article for more details.

Upvotes: 6

Peter

Reputation: 29837

Remember in practice, unless you have a very large data set and/or are executing the sort many many times, it probably won't matter at all. That being said, quicksort is generally considered the 'fastest' n*log(n) sorter. See this question already asked: Quick Sort Vs Merge Sort

Upvotes: 1

When to use merge sort and when to use quick sort?

Answers (6)

Related Questions