Why does the position of arguments matter in cons?

Question

Simple code:

> (cons null (cons 1 2))
'(() 1 . 2)
> (cons (cons 1 2)  null)
'((1 . 2))

Initially, I'd expect the result to be the same. I can think of some vague explanations, but would also like to hear a strong point from someone knowledgeable.

Why is the result different?

ad absurdum · Accepted Answer

In the parlance of mathematics, some operations are commutative. Addition is commutative, so (+ 1 2) and (+ 2 1) both have the same result. Division is not commutative; (/ 1 2) and (/ 2 1) do not have the same result.

The OP expectation that (cons null (cons 1 2)) and (cons null (cons 1 2)) should have the same result is in fact the expectation that cons is a commutative procedure. It is not. If cons were commutative, then (cons 1 2) and (cons 2 1) would be equivalent. But:

> (cons 1 2)
(1 . 2)
> (cons 2 1)
(2 . 1)

The cons procedure creates a cons cell, which has two components traditionally called the car and the cdr; it creates the cons cell from two arguments. The first argument is placed in the car of the cell, and the second argument is placed in the cdr of the cell. With (cons 1 2), 1 is placed in the car, and 2 is placed in the cdr; but with (cons 2 1) the 2 is placed in the car, while the 1 is placed in the cdr. You can see that (cons 1 2) and (cons 2 1) must necessarily be different since the car and cdr of the cons cell are distinct.

OP example (cons null (cons 1 2)) places the empty list in the car of a cons cell, and places the cell (1 . 2) (itself created by another call to cons) in the cdr of that cell. This is the dotted list (described below) (() . (1 . 2)); such a dotted list is usually represented in the REPL as (() 1 . 2).

But with (cons (cons 1 2) null) the cell (1 . 2) is instead placed in the car of a cons cell, with the empty list place in the cdr of that cell: ((1 . 2) . ()).

Now, in Lisps a list is a chain of cons cells terminating with the empty list. A chain like (1 . (2 . 3)) is sometimes called a dotted list or an improper list; this is a chain of cons cells that does not terminate with the empty list. In the REPL you will often see (1 . (2 . 3)) represented as (1 2 . 3). On the other hand, a chain like (1 . (2 . ())) is then called a proper list; this is a chain of cons cells that does terminate with the empty list. You would see this represented in the REPL as (1 2). Usually when someone uses the unqualified term list, they mean proper list.

So, OP code ((1 . 2) . ()) is equivalent to the proper list ((1 . 2)), which contains the single member (1 . 2), and this is certainly different from the dotted list (() 1 . 2).

There is another procedure, append, which would behave as OP expected. Both (append null (append '(1) '(2))) and (append (append '(1) '(2)) null) evaluate to '(1 2). These last two expressions evaluate to the same result because the empty list is the identity element for the operation append. That is, joining a nonempty list of items with the empty list gives back a copy of the nonempty list. Similarly, joining an empty list with a nonempty list gives back a copy of the nonempty list. But append is not commutative; append returns a list containing the elements of the first list followed by the elements of the second list. So, (append '(1 2) '(3 4)) --> '(1 2 3 4), but (append '(3 4) '(1 2)) --> '(3 4 1 2).

There is another relevant mathematical property: associativity. Addition is associative, i.e., (+ 1 (+ 2 3)) and (+ (+ 1 2) 3) both evaluate to the same result. Using infix notation, 1 + (2 + 3) and (1 + 2) + 3 both evaluate to the same result. This has an interesting consequence. Since the grouping is not important, it makes sense to simply write the above infix expressions as 1 + 2 + 3. The corresponding prefix expression is (+ 1 2 3), and Lisps do support this way of expressing addition for multiple operands.

cons is not associative. Since cons creates a pair from its arguments, it can't be associative. With (cons 1 2) the pair (1 . 2) is created. With (cons 1 (cons 2 3)) the pair (2 . 3) is placed in the cdr of another pair, and the result is (1 . (2 . 3)) (probably represented in the REPL as (1 2 . 3)). But with (cons (cons 1 2) 3) the pair (1 . 2) is created and placed in the car of another pair, resulting in ((1 . 2) . 3). Note that, since cons is not associative, an expression like (cons 1 2 3) does not make sense; the grouping of operations is unclear.

But append is associative. With (append '(1) '(2)) the list (1 2) is created. With (append '(1) (append '(2) '(3))) the list (2 3) is created, and the list (1) is then joined with (2 3), resulting in (1 2 3). With (append (append '(1) '(2)) '(3)) the list (1 2) is created and then joined with the list (3), also resulting in the list (1 2 3). Since grouping is not important here, it makes sense to simply write (append '(1) '(2) '(3)), and again, as with addition, Lisps do support this way of expressing append operations.

Why does the position of arguments matter in cons?

Answers (2)

Related Questions