thor
thor

Reputation: 22520

does python have a list constructor?

Does python have a list constructor like the cons in OCaml (::) (or lisp) which takes a head element and tail list, and returns a new list head::tail?

I searched for python list constructors, and ended up finding something else about __init__. see e.g. Creating a list in Python- something sneaky going on?

To clarify, what I am looking for is the inverse of the following list decomposition in Python found in this question:

head, *tail = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

This gives:

>>>head
1
>>> tail
[1, 2, 3, 5, 8, 13, 21, 34, 55]

I am looking for a list constructor e.g. cons or :: such that

head :: tail  =>   original list

Upvotes: 9

Views: 12837

Answers (4)

Salomé
Salomé

Reputation: 1978

To answer your question, there is no direct equivalent to the cons that you would usually find in what are called "functional" languages (lisp, OCaml, Haskell, etc.)

That's because there are two competing models to represent lists of elements in programming languages.

Linked lists

The one you seem to be familiar with is called a linked list.

A linked list is composed of cons-cells that each contain two references :

  • the first one toward an element of the list and is called head
  • the other towards the next cons-cell in the list and is the tail

Because lists are rarely infinite, the last const cell would usually point towards a special value, the empty list, sometimes called nil.

If you wanted to save the list in a variable for future reference, you would keep a reference towards the first cons-cell.

Here's a visual representation from Wikipedia.

In this model, every list is necessarily constructed by adding elements to the front by creating a new cons-cell, pointing to the new element as its head and to the previously constructed sublist as its tail. This is why the cons operator is sometimes called the list constructor.

Arrays

This is the model usually preferred by imperative languages such as Python. In this model, a list is just a reference to a range in memory.

Let's suppose you create a list just like so :

l = [1, 2, 3]

Whenever you create a list, Python will assign to it a small range of memory in which to store the elements, with a bit of extra space just in case you want to add elements later. To store it, you would simply store a reference to the first element, and to the size of the memory range, just like so :

l  <-- your variable
|     ___ ___ ___ ___ ___ ___ ___ ___ ___
|->  |   |   |   |   |   |   |   |   |   |
     | 1 | 2 | 3 |   |   |   |   |   |   |
     |___|___|___|___|___|___|___|___|___|

If you decide to add an element at the end of the list, you can use append

l.append(4)

Resulting in the following list :

 ___ ___ ___ ___ ___ ___ ___ ___ ___
|   |   |   |   |   |   |   |   |   |
| 1 | 2 | 3 | 4 |   |   |   |   |   |
|___|___|___|___|___|___|___|___|___|

Now, let's say you forgot an initial 0, and you now wish to add it to the front. You could really well use the insert method (with an insert position of 0) :

l.insert(0, 0)

But there's no space at the beginning of the list ! Python has no choice but to take each and every element, and copy them one at a time in the spot on the direct right :

 ___ ___ ___ ___ ___ ___ ___ ___ ___
|   |   |   |   |   |   |   |   |   |
| 1 | 2 | 3 | 4 |   |   |   |   |   |
|___|___|___|___|___|___|___|___|___|
  |   |   |__ |___
  |   |___   |    |  First, Python has to copy the four elements
  |___    |  |    |  one space to the right 
 ___ _\/ _\/ \/_ _\/ ___ ___ ___ ___
|   |   |   |   |   |   |   |   |   |
|   | 1 | 2 | 3 | 4 |   |   |   |   |
|___|___|___|___|___|___|___|___|___|

Only then can it insert the 0 at the beginning

 ___ ___ ___ ___ ___ ___ ___ ___ ___
|   |   |   |   |   |   |   |   |   |
| 0 | 1 | 2 | 3 |   |   |   |   |   |
|___|___|___|___|___|___|___|___|___|

It may not seem like much for such a small array, but imagine your array is much bigger, and you repeat this operation many times : you would spend so much time building your list !

That is why you won't find a list constructor in languages using arrays for their lists like Python.

Diving further : why the two different models?

You may now be wondering why different languages would prefer different list models, and if one of the two models is superior.

It's because these two data structures have different performance in different contexts. Two examples :

Accessing an element in the middle

Let's suppose you want to get the fifth element of the list.

In the linked list you would need to get :

  • the first cons cell
  • then the tail of this cell to get the second element
  • then the tail of this cell to get the third element
  • then the tail of this cell to get the fourth element
  • and finally the tail of this cell to get the fifth element

You would thus have to go through 5 references !

With an array, that's much simpler : you know the reference of the first element. You just need to access the reference 4 spots on the right, given the elements are all in a contiguous memory range !

If you need to access random elements of a very large list a great many times, arrays are thus much better.

Inserting an element in the middle

Imagine you now want to insert an element in the middle.

With a linked list :

  • you locate the cons cell corresponding to the last element before the insertion point.
  • you create a new cons cell, with a head pointing to the element you want to add and the same tail as the cons cell you just located.
  • you can now change the tail of this cell to point to the newly created cell.

With an array, just as when adding an element in the middle, you will need to copy every element to the right of the insertion point one space to the right !

In this situation, that's the linked list that is clearly superior.

Upvotes: 15

mgilson
mgilson

Reputation: 310049

Some are suggesting that you do:

a = 1
b = [2, 3, 4, 5]
c = [a] + b

Which will work provided that b is of type list. However, often you'll find yourself working with an object that is either a list or a tuple or (insert your favorite iterable object here). In that case, it might be more worth your while to do:

a = 1
b = (2, 3, 4, 5)
c = [a]
c.extend(b)

This makes the whole thing agnostic to the type of b (this allows your code to be more "ducky", which can be kind of nice). . .


Of course, there are other options as well. . . For example, You could choose to reach for itertools:

import itertools
a = 1
b = [2, 3, 4, 5]
lazy_c = itertools.chain([a], b)

Again we have the benefit of not caring what type of iterable b is, and we pick up a side benefit of iterating lazily -- We won't make a new list or tuple. We'll just make an object that will yield the same thing when you're iterating over it. This can save memory (and sometimes cycles for your CPU), but can also have un-intended consequences if b is also a sort of generator-like object that is being iterated over elsewhere at the same time (which doesn't happen by accident very often).

Upvotes: 7

hobbs
hobbs

Reputation: 240254

Python lists are mutable, vs. OCaml's, which have immutable value semantics, but if a is an item and b is a list, you can get the result you want using [a] + b which will return a new list containing a followed by the items in b, and not modify either of a or b. A more common pattern for building up a list, though, would be to append or extend it in-place, though.

Upvotes: 6

ebracho
ebracho

Reputation: 11

You can use list concatenation

head = 1
tail = [2,3,4]
lst = [head] + tail # [1,2,3,4]

Upvotes: 1

Related Questions