Sorting algorithm times using sorting methods

Question

So I just learned about sorting algorithm s bubble, merge, insertion, sort etc. they all seem to be very similar in their methods of sorting with what seems to me minimal changes in their approach. So why do they produce such different sorting times ie O(n^2) vs O(nlogn) as an example

Alex Martelli · Accepted Answer

The "similarity" (?!) that you see is completely illusory.

The elementary, O(N squared), approaches, repeat their workings over and over, without taking any advantage, for the "next step", of any work done on the "previous step". So the first step takes time proportional to N, the second one to N-1, and so on -- and the resulting sum of integers from 1 to N is proportional to N squared.

For example, in selection sort, you are looking each time for the smallest element in the I:N section, where I is at first 0, then 1, etc. This is (and must be) done by inspecting all those elements, because no care was previously taken to afford any lesser amount of work on subsequent passes by taking any advantage of previous ones. Once you've found that smallest element, you swap it with the I-th element, increment I, and continue. O(N squared) of course.

The advanced, O(N log N), approaches, are cleverly structured to take advantage in following steps of work done in previous steps. That difference, compared to the elementary approaches, is so pervasive and deep, that, if one cannot perceive it, that speaks chiefly about the acuity of one's perception, not about the approaches themselves:-).

For example, in merge sort, you logically split the array into two sections, 0 to half-length and half-length to length. Once each half is sorted (recursively by the same means, until the length gets short enough), the two halves are merged, which itself is a linear sub-step.

Since you're halving every time, you clearly need a number of steps proportional to log N, and, as each step is O(N), obviously you get the very desirable O(N log N) as a result.

Python's "timsort" is a "natural mergesort", i.e, a variant of mergesort tuned to take advantage of already-sorted (or reverse-sorted) parts of the array, which it recognizes rapidly and avoids spending any further work on. This doesn't change big-O because that's about worst-case time -- but expected time crashes much further down because in so many real-life cases some partial sortedness is present.

(Note that, going by the rigid definition of big-O, quicksort isn't quick at all -- it's worst-case proportional to N squared, when you just happen to pick a terrible pivot each and every time... expected-time wise it's fine, though nowhere as good as timsort, because in real life the situations where you repeatedly pick a disaster pivot are exceedingly rare... but, worst-case, they might happen!-).

timsort is so good as to blow away even very experienced programmers. I don't count because I'm a friend of the inventor, Tim Peters, and a Python fanatic, so my bias is obvious. But, consider...

...I remember a "tech talk" at Google where timsort was being presented. Sitting next to me in the front row was Josh Bloch, then also a Googler, and Java expert extraordinaire. Less than mid-way through the talk he couldn't resist any more - he opened his laptop and started hacking to see if it could possibly be as good as the excellent, sharp technical presentation seemed to show it would be.

As a result, timsort is now also the sorting algorithm in recent releases of the Java Virtual Machine (JVM), though only for user-defined objects (arrays of primitives are still sorted the old way, quickersort [*] I believe -- I don't know which Java peculiarities determined this "split" design choice, my Java-fu being rather weak:-).

[*] that's essentially quicksort plus some hacks for pivot choice to try and avoid the poison cases -- and it's also what Python used to use before Tim Peters gave this one immortal contribution out of the many important ones he's made over the decades.

The results are sometimes surprising to people with CS background (like Tim, I have the luck of having a far-ago academic background, not in CS, but in EE, which helps a lot:-). Say, for example, that you must maintain an ever-growing array that is always sorted at any point in time, as new incoming data points must get added to the array.

The classic approach would use bisection, O(log N), to find the proper insertion point for each new incoming data point -- but then, to put the new data in the right place, you need to shift what comes after it by one slot, that's O(N).

With timsort, you just append the new data point to the array, then sort the array -- that's O(N) for timsort in this case (as it's so awesome in exploiting the already-sorted nature of the first N-1 items!-).

You can think of timsort as pushing the "take advantage of work previously done" to a new extreme -- where not only work previously done by the algorithm itself, but also other influences by other aspects of real-life data processing (causing segments to be sorted in advance), are all exploited to the hilt.

Then we could move into bucket sort and radix sort, which change the plane of discourse -- which in traditional sorting limits one to being able to compare two items -- by exploiting the items' internal structure.

Or a similar example -- presented by Bentley in his immortal book "Programming Pearls" -- of needing to sort an array of several million unique positive integers, each constrained to be 24 bits long.

He solved it with an auxiliary array of 16M bits -- just 2M bytes after all -- initially all zeroes: one pass through the input array to set the corresponding bits in the auxiliary array, then one pass through the auxiliary array to form the required integers again where 1s are found -- and bang, O(N) [and very speedy:-)] sorting for this special but important case!-)

Sorting algorithm times using sorting methods

Answers (1)

Related Questions