Internal representation of Haskell lists?

Question

Haskell supports some basic operations for recursing through lists, like head, tail, init and last. I'm wondering, internally, how Haskell is representing its list data? If it's a singly-linked list, then init and last operations could become costly as the list grows. If it's a doubly-linked list, all four operations could be made O(1) quite easily, albeit at the expense of some memory. Either way, it's important for me to know, so I can write appropriate code. (although, the ethos of functional programming seems to be one of "ask what it does, not how it does it").

Don Stewart · Accepted Answer

Lists are represented as ... singly linked lists. The definition is given by:

data [] a = [] | a : [a]

which you could write as:

data List a = Empty | Cons a (List a)

The memory layout is entirely defined by this.

Constructors are heap allocated
Internal polymorphic fields are pointers to other allocated nodes
The spine is lazy

So you end up with something like this:

enter image description here

So head is O(1) on this structure, while last or (++) is O(n)

There is no magic to data structures in Haskell - their straight-forward definition makes entirely clear what the complexity will be (modulo laziness). If you need different complexity, use a different structure (such as IntMap, Sequence, HashMap, Vector etc)...

Internal representation of Haskell lists?

Answers (2)

Related Questions