stackman

Reputation: 360

Performance related to "imperative" algorithms in Haskell

I have some training in the Lisp family of languages and I am now learning a bit of Haskell for my own good. In Lisp, functional style is fine, but there are a few cases where imperative style is necessary to get decent performance, e.g. appending to a list, sorting, or taking the length of a list.

My question is: how do we deal with these problems in Haskell?

Upvotes: 3

Views: 502

Answers (2)

Thomas M. DuBuisson

Reputation: 64740

As delnan said, these problems come from using the wrong data structure, such as a linked list when you want a vector.

Your particular operations

  • append: cons is O(1), so I suspect you are referring to the act of allocating a cons cell, pattern matching on the cell, and eventual GC of the cell. This is rather unavoidable unless you optimize away the list, which does happen in certain situations.

  • sort: I suggest you look at the ST monad, which allows you to use algorithms that require mutation inside a pure context. See the vsort implementation for an example.

  • length: Sure, the only way to get the length of a linked list is to traverse the list. If you need the length often then consider using a Set, Map, or Vector - all of which record a total size that can be accessed in O(1).
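To make the last point concrete, here is a small sketch (the names listLen and mapSize are mine) contrasting O(n) length on a list with the O(1) size that Data.Map records:

```haskell
import qualified Data.Map.Strict as Map

-- length must walk every cons cell: O(n).
listLen :: Int
listLen = length [1 .. 1000 :: Int]

-- Data.Map caches its element count, so Map.size is O(1).
mapSize :: Int
mapSize = Map.size (Map.fromList [(k, ()) | k <- [1 .. 1000 :: Int]])

main :: IO ()
main = print (listLen, mapSize)  -- prints (1000,1000)
```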

In General

  • Look into ST for mutability.
  • Use the right data structure for the right problem. Learn what structures Hackage has to offer (vectors, arrays, hash maps, hash tables, Bloom filters, and more).
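As an illustration of the ST point above, here is a minimal sketch of an in-place bubble sort on a mutable unboxed array; runST seals off the mutation, so the function stays pure. (The helper name sortST is mine, and a real program would reach for a library sort such as the one in vector-algorithms rather than hand-rolling bubble sort.)

```haskell
import Control.Monad (forM_, when)
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, getElems, newListArray, readArray, writeArray)

-- Pure interface, mutable guts: runST guarantees the mutation
-- cannot leak out of this function.
sortST :: [Int] -> [Int]
sortST xs = runST $ do
  let n = length xs
  arr <- newListArray (0, n - 1) xs :: ST s (STUArray s Int Int)
  -- Classic in-place bubble sort over the mutable array.
  forM_ [0 .. n - 2] $ \i ->
    forM_ [0 .. n - 2 - i] $ \j -> do
      a <- readArray arr j
      b <- readArray arr (j + 1)
      when (a > b) $ do
        writeArray arr j b
        writeArray arr (j + 1) a
  getElems arr

main :: IO ()
main = print (sortST [3, 1, 4, 1, 5])  -- prints [1,1,3,4,5]
```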

Upvotes: 11

J. Abrahamson

Reputation: 74364

"append" isn't a universal thing. Appending onto a double-ended queue is different from appending onto a cons list. Appending destructively is different from copy-on-append. Depending on your criteria, you may optimize for speed or thread safety or simplicity, and your definition of "problem" will change.

As delnan and Thomas DuBuisson mention, Haskell has all of these options if you pick the right data type.
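For example (a sketch using Data.Sequence from the containers package, which neither answer names): consing onto a list is O(1) but appending at the right end via ++ is O(n), whereas Data.Sequence supports O(1) extension at both ends.

```haskell
import Data.Foldable (toList)
import Data.Sequence ((<|), (|>))
import qualified Data.Sequence as Seq

-- Extend a sequence at both ends; each operation is O(1),
-- unlike xs ++ [5], which copies the whole spine.
extended :: [Int]
extended = toList ((1 <| Seq.fromList [2, 3, 4]) |> 5)

main :: IO ()
main = print extended  -- prints [1,2,3,4,5]
```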

So if you have a specific algorithm you'd like to optimize, there is likely either a way to translate it to a fast non-destructive version, a way to simulate the destructive updates at (usually) a multiplicative log(n) slowdown, or a way to run referentially transparent destructive updates at an additive constant slowdown.
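One way to simulate destructive updates at that log(n) cost, sketched here with Data.IntMap standing in for a mutable array (the type alias and helper name are mine): each "write" is O(log n) and, the structure being persistent, every old version stays readable.

```haskell
import qualified Data.IntMap.Strict as IM

-- A persistent "array": writes cost O(log n) instead of O(1),
-- but old versions survive untouched (referential transparency).
type PArray a = IM.IntMap a

writeAt :: Int -> a -> PArray a -> PArray a
writeAt = IM.insert

main :: IO ()
main = do
  let v0 = IM.fromList (zip [0 ..] "haskell") :: PArray Char
      v1 = writeAt 0 'H' v0
  putStrLn (IM.elems v0)  -- prints haskell  (the old version is intact)
  putStrLn (IM.elems v1)  -- prints Haskell
```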

For good examples of trouble with this translation look at algorithms for persistent union-find or the Functional Graph Library.

Upvotes: 3
