Abhishek Anand
Abhishek Anand

Reputation: 3838

fastest sorting in functional style

As an undergrad, I studied O(n log n) algorithms for sorting and proofs that we cannot do better for the general case when all we can do is compare 2 numbers. That was for a random access memory model of computation. I want to know if such theoretic lowerbounds exists for functional style(referentially transparent) programs. Let us assume that each beta reduction counts as one step, and each comparison counts as one step.

Upvotes: 3

Views: 508

Answers (2)

jJ'
jJ'

Reputation: 3078

So let us assume that you agree with the proof that we cannot do better than (n*log n) in a general case where the only thing we have is a comparison.

So the question is whether we can show that we can do the same without side effects.

One way to do that uses the idea that if you can build an immutable binary search tree in O(n*log n) and then infix-traverse it (can be done in O(n)), then we would have a sorting algorithm.

If we can go through all items and add every item to a balanced immutable (persistent) tree in O(log n) it would give us O(n*log n) algorithm.

Can we add to a persistent binary tree in O(log n)? Sure, there are immutable variants of balanced binary search trees with O(log n) insert in every reasonable implementation of persistent data structure library.

To get an idea why it is possible, imagine a standard balanced binary search tree as e.g. red-black tree. You can make immutable version of it by following the same algorithm as for the mutable one, except whenever pointers or color change, you need to allocate a new node and consequently all of its parents to the root (while simultaneously transforming them too if necessary). Side branches that do not change get reused. There are at most O(log n) affected nodes, so at most O(log n) operations (including allocations) per insertion. If you know red-black, you can see that there are no other multipliers to this except for constants (for rotations you can get a few extra allocations for affected siblings, but that still remains a constant factor).

This - quite informal - demonstration can give you an idea that a proof for O(n*log n) for sorting without side effects exists. However, there are a few more things that I left out. E.g. allocation is considered to be O(1) here, which may not be a case always, but that would get too complex.

Upvotes: 1

Jonathan Schneider
Jonathan Schneider

Reputation: 27727

I think modern functional programming implementations (at least Clojure, since that's the only one I know) do have immutable memory, but that doesn't mean that altering lists and such result in copying of the whole original list. As such, I don't believe there are order-of-magnitude level computational differences between implementing sorting algorithms with imperative or functional idioms.

More on how immutable lists can be modified without memory copying:

For an example of how this might work, consider this snippet from the Clojure reference at: http://clojure-doc.org/articles/tutorials/introduction.html

...In Clojure, all scalars and core data structures are like this. They are values. They are immutable.

The map {:name "John" :hit-points 200 :super-power :resourcefulness} is a value. If you want to "change" John's hit-points, you don't change anything per se, but rather, you just conjure up a whole new hashmap value.

But wait: If you've done any imperative style programming in C-like languages, this sounds crazy wasteful. However, the yin to this immutability yang is that --- behind the scenes --- Clojure shares data structures. It keeps track of all their pieces and re-uses them pervasively. For example, if you have a 1,000,000-item list and want to tack on one more item, you just tell Clojure, "give me a new one but with this item added" --- and Clojure dutifully gives you back a 1,000,001-item list in no time flat. Unbeknownced to you it's re-using the original list.

So why the fuss about immutability?

I don't know the complete history of functional programming, but it seems to me that the immutability characteristic of functional languages essentially abstracts away the intricacies of shared memory.

While this is cool, without a language-managed shared data structure underpinning the whole mechanism, it would be impractically slow for many use cases.

Upvotes: 0

Related Questions