Reputation: 51
I'm working through a class in Haskell and we are using list comprehensions for equivalence relations, which have been somewhat confusing. For example I can't wrap my head around what exactly is happening in this case.
exmp2 :: Reln -> [(Int, Int)]
exmp2 rs = [ (j, i) | (i,j) <- rs, (j,i) <- rs ]
vs
exmpl1 :: Reln -> [(Int, Int)]
exmpl1 rs = [ (j, i) | (i,j) <- rs ]
If I use the above...
exmpl1 [(1, 2), (2, 7), (1,9)]
outputs
[(2, 1), (7,2), (9,1)]
exmpl2 [(1, 2), (2, 7), (1,9)]
outputs
[(1,2), (2,7), (1,9), (1,2), (2,7), (1,9), (1,2), (2,7), (1,9)]
I've gone through list comprehensions tutorials etc but I can't for the life of me figure out why exactly exmpl2
outputs what is does. Thanks
Upvotes: 1
Views: 207
Reputation: 43309
The list-comprehension syntax can also equivalently be expressed using the "do" notation, which is less cryptic and easier to explain.
E.g., the following comprehension:
[ (j, i) | (i,j) <- rs, (j,i) <- rs ]
is equivalent to the following expression:
do
(i, j) <- rs
(j, i) <- rs
return (j, i)
So what happens here is that you first extract a value from the rs
monad (in this case, List) and desctructure and assign it to the i
and j
variables. Then you extract the value again and assign it to the same variables in the reverse order. Thus you shadow the first assignment. After that you just produce the value for the output monad (List).
Since the first assignment is shadowed, we can simply bypass it, getting the following equivalent expression:
do
_ <- rs
(j, i) <- rs
return (j, i)
or even:
do
rs
(j, i) <- rs
return (j, i)
The List monad is defined such that you walk through all the possible combinations of all the extractions you do there.
In this case you do extraction twice. As the result you get double traversal of the rs
input list.
In an imperative language it would look something like this:
function exmpl2(rs) {
var results = [];
for (a in rs) {
for (b in rs) {
results.push(b);
}
}
return results;
}
Upvotes: 2
Reputation: 71400
I feel this can be more easily understood by approaching it from a simpler example.
λ let chars = ['a', 'b', 'c']
λ let bools = [True, False]
λ [(c, b) | c <- chars, b <- bools]
[('a',True),('a',False),('b',True),('b',False),('c',True),('c',False)]
It should be clear that we've produced every possible combination of taking a single c
from chars
, and a single b
from bools
.
The next step is to understand that I could have used any expression where I put (c, b)
. That part has zero impact on the number of items in the resulting list; whatever I put there I'll still have one item for every combination of c
and b
from chars
and bools
. I don't even need to actually use c
and b
(though obviously you usually would in real code).
As an example:
λ [ 7 | c <- chars, b <- bools]
[7,7,7,7,7,7]
For every combination of c
in chars
and b
in bools
, I produce 7
. So that's six 7
items in my list.
I can also refer to just one of c
and b
, and it still doesn't change the number of items I put out:
λ [ b | c <- chars, b <- bools ]
[True,False,True,False,True,False]
The only values I'm using are the True
and False
from bools
, but the shape of the list (having six items) is determined by the "every combination of chars
and bools
" logic.
Now a slightly evil trick I can pull (that is being used in your exmp2
) is that because I never use c
it doesn't matter what variable name I use to extract from chars
. And because I'm later binding b
when I extract from bools
, I could use b
there. The second binding of b
in b <- bools
shadows the first; it's in effect a totally separate variable, but because they have the same name I can now only refer to one of them. Since I'm only using one of them, this doesn't affect what I'm trying to do (but it does make the code more difficult to understand, since you have to know more to figure out which b
binding is the one affecting the values that get put out into my list):
λ [ b | b <- chars, b <- bools ]
[True,False,True,False,True,False]
So now we can look at your example:
exmp2 :: Reln -> [(Int, Int)]
exmp2 rs = [ (j, i) | (i,j) <- rs, (j,i) <- rs ]
This is taking every combination of (i, j)
from rs
and (j, i)
from rs
. Ignore the i
and j
for a moment; we're taking every combination of things we get by choosing two things from the same list. So we'll end up with the square of the number of items in rs
in the output list.
The first (i, j) <- rs
case binds the variables i
and j
. But the second case just binds the same variables again, so it's totally irrelevant what they were. We can just read this as if it was:
exmp2 rs = [ (j, i) | _ <- rs, (j,i) <- rs ]
So, for every possible combination of one pair from rs
we don't care about, and another pair (j, i)
from rs
, produce the pair (j, i)
.
That's why in exmpl2 [(1, 2), (2, 7), (1,9)]
you get the output containing 3 copies of the sequence (1, 2), (2, 7), (1,9)
; once for the first pair (1, 2)
, once for the second pair (2, 7)
, and once for the third pair (1, 9)
.
Upvotes: 3