kek_mek
kek_mek

Reputation: 125

Why does the position of arguments matter in cons?

Simple code:

> (cons null (cons 1 2))
'(() 1 . 2)
> (cons (cons 1 2)  null)
'((1 . 2))

Initially, I'd expect the result to be the same. I can think of some vague explanations, but would also like to hear a strong point from someone knowledgeable.

Why is the result different?

Upvotes: 4

Views: 193

Answers (2)

ad absurdum
ad absurdum

Reputation: 21361

In the parlance of mathematics, some operations are commutative. Addition is commutative, so (+ 1 2) and (+ 2 1) both have the same result. Division is not commutative; (/ 1 2) and (/ 2 1) do not have the same result.

The OP expectation that (cons null (cons 1 2)) and (cons null (cons 1 2)) should have the same result is in fact the expectation that cons is a commutative procedure. It is not. If cons were commutative, then (cons 1 2) and (cons 2 1) would be equivalent. But:

> (cons 1 2)
(1 . 2)
> (cons 2 1)
(2 . 1)

The cons procedure creates a cons cell, which has two components traditionally called the car and the cdr; it creates the cons cell from two arguments. The first argument is placed in the car of the cell, and the second argument is placed in the cdr of the cell. With (cons 1 2), 1 is placed in the car, and 2 is placed in the cdr; but with (cons 2 1) the 2 is placed in the car, while the 1 is placed in the cdr. You can see that (cons 1 2) and (cons 2 1) must necessarily be different since the car and cdr of the cons cell are distinct.

OP example (cons null (cons 1 2)) places the empty list in the car of a cons cell, and places the cell (1 . 2) (itself created by another call to cons) in the cdr of that cell. This is the dotted list (described below) (() . (1 . 2)); such a dotted list is usually represented in the REPL as (() 1 . 2).

But with (cons (cons 1 2) null) the cell (1 . 2) is instead placed in the car of a cons cell, with the empty list place in the cdr of that cell: ((1 . 2) . ()).

Now, in Lisps a list is a chain of cons cells terminating with the empty list. A chain like (1 . (2 . 3)) is sometimes called a dotted list or an improper list; this is a chain of cons cells that does not terminate with the empty list. In the REPL you will often see (1 . (2 . 3)) represented as (1 2 . 3). On the other hand, a chain like (1 . (2 . ())) is then called a proper list; this is a chain of cons cells that does terminate with the empty list. You would see this represented in the REPL as (1 2). Usually when someone uses the unqualified term list, they mean proper list.

So, OP code ((1 . 2) . ()) is equivalent to the proper list ((1 . 2)), which contains the single member (1 . 2), and this is certainly different from the dotted list (() 1 . 2).

There is another procedure, append, which would behave as OP expected. Both (append null (append '(1) '(2))) and (append (append '(1) '(2)) null) evaluate to '(1 2). These last two expressions evaluate to the same result because the empty list is the identity element for the operation append. That is, joining a nonempty list of items with the empty list gives back a copy of the nonempty list. Similarly, joining an empty list with a nonempty list gives back a copy of the nonempty list. But append is not commutative; append returns a list containing the elements of the first list followed by the elements of the second list. So, (append '(1 2) '(3 4)) --> '(1 2 3 4), but (append '(3 4) '(1 2)) --> '(3 4 1 2).

There is another relevant mathematical property: associativity. Addition is associative, i.e., (+ 1 (+ 2 3)) and (+ (+ 1 2) 3) both evaluate to the same result. Using infix notation, 1 + (2 + 3) and (1 + 2) + 3 both evaluate to the same result. This has an interesting consequence. Since the grouping is not important, it makes sense to simply write the above infix expressions as 1 + 2 + 3. The corresponding prefix expression is (+ 1 2 3), and Lisps do support this way of expressing addition for multiple operands.

cons is not associative. Since cons creates a pair from its arguments, it can't be associative. With (cons 1 2) the pair (1 . 2) is created. With (cons 1 (cons 2 3)) the pair (2 . 3) is placed in the cdr of another pair, and the result is (1 . (2 . 3)) (probably represented in the REPL as (1 2 . 3)). But with (cons (cons 1 2) 3) the pair (1 . 2) is created and placed in the car of another pair, resulting in ((1 . 2) . 3). Note that, since cons is not associative, an expression like (cons 1 2 3) does not make sense; the grouping of operations is unclear.

But append is associative. With (append '(1) '(2)) the list (1 2) is created. With (append '(1) (append '(2) '(3))) the list (2 3) is created, and the list (1) is then joined with (2 3), resulting in (1 2 3). With (append (append '(1) '(2)) '(3)) the list (1 2) is created and then joined with the list (3), also resulting in the list (1 2 3). Since grouping is not important here, it makes sense to simply write (append '(1) '(2) '(3)), and again, as with addition, Lisps do support this way of expressing append operations.

Upvotes: 7

Atharva Shukla
Atharva Shukla

Reputation: 2137

See the "box and pointer" diagram in SICP here. The cons is a constructor of pairs - it just puts two things together. If we generalize this notion of pairing things, we can build trees. In scheme/racket, lists are just a special case of trees (with each "left" branch holding an element of the list and each right branch holding the rest of the list).

The shorthand: Since it's cumbersome writing (cons 1 (cons 2 (cons 3 null))), we simplify: '(1 2 3). Note that the last null (or '()) is omitted in the shorthand. If it weren't a null and it were a 4, we would get the following shorthand version: '(1 2 3 . 4). The dot suggests that the right hand side of the construction is not a list.

For the first example, null is just an element of the list - occurring on a left branch. For the second example, null is the rightmost element - suggesting the end of the list.

    / \          Null is displayed as () in
   /   \         the shorthand version. Whenever
 null            a pair does not have a list
       / \       on the right, we see a dot.
      /   \ 
     1     2


       / \       Note here that the null
      /   \      is on the last right-most
          null   branch. This null is not
    / \          shown in the shorthand.
   /   \
  1     2

Upvotes: 4

Related Questions