Reputation: 3797
I would like to understand Timsort in full detail, so an answer like "just call .sort()
in Java or in Python" is not what I'm looking for.
I've already read the Wikipedia article, the Java implementation, the CPython implementation, the listsort.txt that these sources mention, and the publication that listsort.txt references: Nearly-Optimal Mergesorts: Fast, Practical Sorting Methods That Optimally Adapt to Existing Runs. I've also skimmed through some of the references of Nearly-Optimal Mergesorts, namely Scaling and Related Techniques for Geometry Problems, A Unifying Look at Data Structures, and Fast Algorithms for Finding Nearest Common Ancestors; I wouldn't say I understood the latter three in full depth, but I'm quite sure they don't answer my question.
I grok most of the sub-algorithms: run counting and run reversal, binary insertion sort, merging (saving the head/tail part), and galloping. Please do not try to explain these; I know what they do, how they do it, and why.
What I'm missing is the merge pattern. I do understand how it works mechanically: it maintains a stack of runs and decides when to merge by repeatedly checking invariants. I also see the goals: adapting to natural runs, achieving sorting stability, balancing merges, exploiting temporal locality, and keeping the algorithm simple. But I don't get why reestablishing these invariants results in an efficient algorithm.
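To fix terminology: by "merge pattern" I mean the policy implemented by java.util.TimSort's mergeCollapse() (CPython's merge_collapse()). Sketched after the classic, pre-powersort Java version, where mergeAt(n) merges stack entries n and n + 1 in place:

// Restore the invariants on the top of the run stack after each push:
//   runLen[n - 1] > runLen[n] + runLen[n + 1]  and  runLen[n] > runLen[n + 1].
private void mergeCollapse() {
    while (stackSize > 1) {
        int n = stackSize - 2;                  // second run from the top
        if (n > 0 && runLen[n - 1] <= runLen[n] + runLen[n + 1]) {
            if (runLen[n - 1] < runLen[n + 1])  // merge toward the smaller neighbor
                n--;
            mergeAt(n);
        } else if (runLen[n] <= runLen[n + 1]) {
            mergeAt(n);
        } else {
            break;                              // both invariants hold again
        }
    }
}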
The file listsort.txt states (line 346):
The code now uses the "powersort" merge strategy from: "Nearly-Optimal Mergesorts: Fast, Practical Sorting Methods That Optimally Adapt to Existing Runs" J. Ian Munro and Sebastian Wild
I understand how Munro and Wild's powersort works, and I find their explanation involving "nearly-optimal alphabetic trees" with the "bisection heuristic" and "movement along the edges of Cartesian trees" sufficient.
What I can't grasp is the link between powersort's and Timsort's merge pattern.
Clearly, they are different.
I see that a higher node power usually lies between shorter runs, but I cannot deduce one algorithm from the other. Please explain to me what the connection between the two is.
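For reference, here is the node-power computation, transliterated to Java from CPython's powerloop() in listobject.c (variable names are mine):

// Node power of the boundary between run A = a[s1 .. s1+n1) and the run B of
// length n2 that follows it, in an array of total length n. Compare the binary
// expansions of the runs' normalized midpoints one bit at a time; the power is
// the position of the first bit where they differ. (We track 2*a and 2*b to
// stay in integers; that only shifts the expansions by one bit.)
static int nodePower(long s1, long n1, long n2, long n) {
    long a = 2 * s1 + n1;        // 2 * midpoint of A
    long b = a + n1 + n2;        // 2 * midpoint of B
    int power = 0;
    for (;;) {
        ++power;
        if (a >= n) {            // both next bits are 1
            a -= n;
            b -= n;
        } else if (b >= n) {     // A's bit is 0, B's bit is 1: first difference
            break;
        }                        // else both next bits are 0
        a <<= 1;
        b <<= 1;
    }
    return power;
}

(Two short adjacent runs have midpoints that lie close together, so the expansions share a longer common prefix and the boundary gets a higher power, i.e., a deeper node in the merge tree.)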
Once I understand this, I hope I will also find the justification for the stack capacities:
/*
* Allocate runs-to-be-merged stack (which cannot be expanded). The
* stack length requirements are described in listsort.txt. The C
* version always uses the same stack length (85), but this was
* measured to be too expensive when sorting "mid-sized" arrays (e.g.,
* 100 elements) in Java. Therefore, we use smaller (but sufficiently
* large) stack lengths for smaller arrays. The "magic numbers" in the
* computation below must be changed if MIN_MERGE is decreased. See
* the MIN_MERGE declaration above for more information.
*/
int stackLen = (len <    120  ?  5 :
                len <   1542  ? 10 :
                len < 119151  ? 19 : 40);
How were these numbers calculated? Why is it certain that the stack won't grow beyond these limits? Why doesn't the code check whether there is a free slot? If there isn't, then (a) expanding the stack or (b) merging at the top would easily prevent an ArrayIndexOutOfBoundsException; but I didn't find anything like that in the code.
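For what it's worth, my best attempt at reconstructing the derivation — a sketch assuming the invariant runLen[i - 2] > runLen[i - 1] + runLen[i] holds for every entry on the stack and that run lengths are bounded below by MIN_MERGE / 2 = 16:

// Hypothetical reconstruction of the magic numbers. Under the assumption
// above, the shortest runs a stack of depth d can hold are
//   b(1) = 16, b(2) = 17, b(i) = b(i-1) + b(i-2) + 1,
// and an array of length < 1 + b(1) + ... + b(d) never needs depth > d.
public class StackBounds {
    public static void main(String[] args) {
        long shorter = 16, longer = 17;       // two shortest admissible runs
        long total = shorter + longer;
        for (int depth = 2; depth <= 19; depth++) {
            System.out.printf("depth %2d suffices for len < %d%n",
                              depth, total + 1);
            long next = shorter + longer + 1; // strictly exceeds the sum
            shorter = longer;
            longer = next;
            total += longer;
        }
        // Prints, among others: depth  4 suffices for len < 120,
        // depth  9 for len < 1542, depth 18 for len < 119151 -- exactly
        // the thresholds above (stackLen = depth + 1, apparently leaving
        // one slot for the run pushed before the collapse).
    }
}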
Upvotes: 5
Views: 363
Reputation: 65498
Please explain to me what the connection between the two is.
They instantiate a common framework: split the input list into runs, then merge adjacent runs until only one remains. The different ways of doing this can be summarized by binary trees whose leaves are the runs. Timsort, powersort, and peeksort (from the same paper as powersort) represent three different ways of constructing that tree.
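To make the framework concrete, here is a toy, self-contained rendering (all names are mine; run detection and the merge policy are deliberately simplified, so this illustrates the skeleton, not any real implementation):

import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class RunMergeFramework {

    // A pending run is a pair {start, length}; the policy restores its
    // invariants on the stack after each push. The algorithms differ only
    // in this policy, i.e., in the shape of the merge tree it builds.
    interface MergePolicy {
        void collapse(Deque<long[]> pending, int[] a);
    }

    static int nextAscendingRunLength(int[] a, int start) {
        int i = start + 1;
        while (i < a.length && a[i - 1] <= a[i]) i++;
        return i - start;
    }

    // Stable two-way merge of a[start .. start+len1) with the run after it.
    static void mergeAdjacent(int[] a, int start, int len1, int len2) {
        int[] merged = new int[len1 + len2];
        int i = start, j = start + len1, k = 0;
        int mid = start + len1, end = start + len1 + len2;
        while (i < mid && j < end)
            merged[k++] = (a[i] <= a[j]) ? a[i++] : a[j++]; // left wins ties
        while (i < mid) merged[k++] = a[i++];
        while (j < end) merged[k++] = a[j++];
        System.arraycopy(merged, 0, a, start, merged.length);
    }

    static void sort(int[] a, MergePolicy policy) {
        Deque<long[]> pending = new ArrayDeque<>();
        for (int i = 0; i < a.length; ) {
            int len = nextAscendingRunLength(a, i);
            pending.push(new long[]{i, len});
            policy.collapse(pending, a);   // the ONLY algorithm-specific step
            i += len;
        }
        while (pending.size() > 1) {       // final collapse, right to left
            long[] top = pending.pop(), below = pending.pop();
            mergeAdjacent(a, (int) below[0], (int) below[1], (int) top[1]);
            pending.push(new long[]{below[0], below[1] + top[1]});
        }
    }

    public static void main(String[] args) {
        int[] a = {3, 4, 9, 1, 2, 8, 5, 6, 7, 0};
        // A Timsort-flavored toy policy enforcing just runLen[n] > runLen[n+1].
        sort(a, (pending, arr) -> {
            while (pending.size() > 1) {
                long[] top = pending.pop(), below = pending.pop();
                if (below[1] > top[1]) {   // invariant holds; stop merging
                    pending.push(below);
                    pending.push(top);
                    break;
                }
                mergeAdjacent(arr, (int) below[0], (int) below[1], (int) top[1]);
                pending.push(new long[]{below[0], below[1] + top[1]});
            }
        });
        System.out.println(Arrays.toString(a)); // [0, 1, 2, ..., 9]
    }
}

Plug in a different collapse policy — Timsort's two invariants, powersort's node powers, or peeksort's recursive bisection — and only the shape of the merge tree changes, not the skeleton.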
How were these numbers calculated?
Wrongly. (In case the link dies, that's de Gouw, Rot, de Boer, Bubel, and Hähnle: "OpenJDK's java.utils.Collection.sort() is broken: The good, the bad and the worst case".) In short: the bounds were derived as if the invariants held for every run on the stack, but mergeCollapse only reestablishes them near the top, so deeper entries can end up violating them and the stack can grow beyond the precomputed limit. OpenJDK's response was to enlarge stackLen, while CPython instead fixed merge_collapse to recheck the invariant one level deeper.
Upvotes: 5