TheIE100
TheIE100

Reputation: 77

Time complexity of built-in 'sort' in Clojure

I want to know complexity (Big-O notation) of the built-in 'sort' function that the Clojure programming language provides, I did my search about it on the clojuredocs page but I did not find anything about it.

Thanks in advance.

Upvotes: 1

Views: 798

Answers (2)

Carcigenicate
Carcigenicate

Reputation: 45726

Looking at the source, you can see that it just delegates to java.util.Arrays.sort:

(defn sort
  ([coll]
   (sort compare coll))
  ([^java.util.Comparator comp coll]
   (if (seq coll)
     (let [a (to-array coll)]
       (. java.util.Arrays (sort a comp)) ; Here
       (seq a))
     ())))

Apparently java.util.Arrays.sort uses a Timsort, which according to Wikipedia has a worst case runtime of O(n log n).


To verify, following the ctrl+B rabbit hole in IntelliJ, I eventually end up at the sort with the signature:

public static <T> void sort(T[] a, Comparator<? super T> c)

And in the body of that function is this bit:

...
if (LegacyMergeSort.userRequested)
    legacyMergeSort(a, c);
else
    TimSort.sort(a, 0, a.length, c, null, 0, 0);
...

So it does appear to use a Timsort unless the user has requested it to use a legacy Merge Sort implementation instead.

Upvotes: 5

coredump
coredump

Reputation: 38789

The sort built-in actually calls java.util.Arrays.sort:

(defn sort
  "Returns a sorted sequence of the items in coll. If no comparator is
  supplied, uses compare.  comparator must implement
  java.util.Comparator.  Guaranteed to be stable: equal elements will
  not be reordered.  If coll is a Java array, it will be modified.  To
  avoid this, sort a copy of the array."
  {:added "1.0"
   :static true}
  ([coll]
   (sort compare coll))
  ([^java.util.Comparator comp coll]
   (if (seq coll)
     (let [a (to-array coll)]
       (. java.util.Arrays (sort a comp))
       (seq a))
     ())))

The Java sort on generic Object values has the following comment (emphasis added):

Implementation note: This implementation is a stable, adaptive, iterative mergesort that requires far fewer than n lg(n) comparisons when the input array is partially sorted, while offering the performance of a traditional mergesort when the input array is randomly ordered. If the input array is nearly sorted, the implementation requires approximately n comparisons. Temporary storage requirements vary from a small constant for nearly sorted input arrays to n/2 object references for randomly ordered input arrays.

The implementation takes equal advantage of ascending and descending order in its input array, and can take advantage of ascending and descending order in different parts of the the same input array. It is well-suited to merging two or more sorted arrays: simply concatenate the arrays and sort the resulting array.

The implementation was adapted from Tim Peters's list sort for Python (TimSort). It uses techniques from Peter McIlroy's "Optimistic Sorting and Information Theoretic Complexity", in Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pp 467-474, January 1993.

However, there are other overloaded methods, for primitive types, like for int[], for which the following holds:

The sorting algorithm is a Dual-Pivot Quicksort by Vladimir Yaroslavskiy, Jon Bentley, and Joshua Bloch. This algorithm offers O(n log(n)) performance on many data sets that cause other quicksorts to degrade to quadratic performance, and is typically faster than traditional (one-pivot) Quicksort implementations.

Complexity also depends on the complexity of the custom comparison algorithm.

Upvotes: 7

Related Questions