Reputation: 75356
In order to refresh my 20 year old experience with Haskell I am walking through https://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Adding_Variables_and_Assignment and at one point the following line is introduced to apply op
to all the params. This is to implement e.g. (+ 1 2 3 4)
numericBinop op params = mapM unpackNum params >>= return . Number . foldl1 op
I do not understand the syntax, and the explanation in the text is a bit vague.
I understand what foldl1
does and how to dot functions (unpackNum
is a helper function), but using Monads and the >>=
operator leaves me a bit confused.
How is this to be read?
Upvotes: 9
Views: 217
Reputation: 16635
Your question is about syntax, so I'll just talk about how to parse that expression. Haskell's syntax is pretty simple. Informally:
>>=
, or .
) are infix (i.e. their first argument is to the left of the identifier)infix...
declaration)So only knowing this, if I see:
mapM unpackNum params >>= return . Number . foldl1 op
To begin with I know that it must be parse like
(mapM unpackNum params) >>= return . Number . (foldl1 op)
To go further we need to inspect the fixity/precedence of the two operators we see in this expression:
Prelude> :info (.)
(.) :: (b -> c) -> (a -> b) -> a -> c -- Defined in ‘GHC.Base’
infixr 9 .
Prelude> :info (>>=)
class Applicative m => Monad (m :: * -> *) where
(>>=) :: m a -> (a -> m b) -> m b
...
-- Defined in ‘GHC.Base’
infixl 1 >>=
(.)
has a higher precedence (9
vs 1
for >>=
), so its arguments will bind more tightly (i.e. we parenthesize them first). But how do we know which of these is correct?
(mapM unpackNum params) >>= ((return . Number) . (foldl1 op))
(mapM unpackNum params) >>= (return . (Number . (foldl1 op)))
...? Because (.)
was declared infixr
it associates to the right, meaning the second parse above is correct.
As Will Ness points out in the comments, (.)
is associative (like e.g. addition) so both of these happen to be semantically equivalent.
With a little experience with a library (or the Prelude
in this case) you learn to parse expressions with operators correctly without thinking too much.
If after doing this exercise you want to understand what a function does or how it works, then you can click through to the source of the functions you're interested in and replace occurrences of left-hand sides with right-hand sides (i.e. inline the bodies of the functions and terms). Obviously you can do this in your head or in an editor.
Upvotes: 5
Reputation: 116139
Essentially,
mapM unpackNum params >>= return . Number . foldl1 op
is made of two parts.
mapM unpackNum params
means: take the list of parameters params
. On each item, apply unpackNum
: this will produce an Integer
wrapped inside the ThrowsError
monad. So, it's not exactly a plain Integer
, since it has a chance to error out. Anyway, using unpackNum
on each item either successfully produces all Integers
, or throws some error. In the first case, we produce a new list of type [Integer]
, in the second one we (unsurprisingly) throw the error. So, the resulting type for this part is ThrowsError [Integer]
.
The second part is ... >>= return . Number . foldl1 op
. Here >>=
means: if the first part threw some error, the whole expression throws that error as well. If the part succeeded in producing [Integer]
then proceed with foldl1 op
, wrap the result as a Number
, and finally use return
to inject this value as a successful computation.
Overall there are monadic computations, but you should not think about those too much. The monadic stuff here is only propagating the errors around, or store plain values if the computation is successful. With a bit of experience, one can concentrate on the successful values only, and let mapM,>>=,return
handle the error cases.
By the way, note that while the book uses code like action >>= return . f
, this is arguably a bad style. One can use, to the same effect, fmap f action
or f <$> action
, which is a more direct way to express the same computation. E.g.
Number . foldl1 op <$> mapM unpackNum params
which is very close to a non-monadic code which ignores the error cases
-- this would work if there was no monad around, and errors did not have to be handled
Number . foldl1 op $ map unpackNum params
Upvotes: 8
Reputation: 71065
First, syntax. Whitespace is application, semantically:
f x = f $ x -- "call" f with an argument x
so your expression is actually
numericBinop op params = ((mapM unpackNum) params) >>= return . Number . (foldl1 op)
Next, the operators are built from non-alphanumerical characters, without any whitespace. Here, there's .
and >>=
. Running :i (.)
and :i (>>=)
at GHCi reveals their fixity specs are infixl 9 .
and infixr 1 >>=
. 9
is above 1
so .
is stronger than >>=
; thus
= ((mapM unpackNum) params) >>= (return . Number . (foldl1 op))
infixl 9 .
means .
associates to the right, thus, finally, it is
= ((mapM unpackNum) params) >>= (return . (Number . (foldl1 op)))
The (.)
is defined as (f . g) x = f (g x)
, thus (f . (g . h)) x = f ((g . h) x) = f (g (h x)) = (f . g) (h x) = ((f . g) . h) x
; by eta-reduction we have
(f . (g . h)) = ((f . g) . h)
thus (.)
is associative, and so parenthesization is optional. We'll drop the explicit parens with the "whitespace" application from now on as well. Thus we have
numericBinop op params = (mapM unpackNum params) >>=
(\ x -> return (Number (foldl1 op x))) -- '\' is for '/\'ambda
Monadic sequences are easier written with do
, and the above is equivalent to
= do
{ x <- mapM unpackNum params -- { ; } are optional, IF all 'do'
; return (Number (foldl1 op x))) -- lines are indented at the same level
}
Next, mapM
can be defined as
mapM f [] = return []
mapM f (x:xs) = do { x <- f x ;
xs <- mapM f xs ;
return (x : xs) }
and the Monad Laws demand that
do { r <- do { x ; === do { x ;
y } ; r <- y ;
foo r foo r
} }
(you can find an overview of do
notation e.g. in this recent answer of mine); thus,
numericBinop op [a, b, ..., z] =
do {
a <- unpackNum a ;
b <- unpackNum b ;
...........
z <- unpackNum z ;
return (Number (foldl1 op [a, b, ..., z]))
}
(you may have noticed my use of x <- x
bindings -- we can use the same variable name on both sides of <-
, because monadic bindings are not recursive -- thus introducing shadowing.)
This is now clearer, hopefully.
But, I said "first, syntax". So now, the meaning of it. By same Monad Laws,
numericBinop op [a, b, ..., y, z] =
do {
xs <- do { a <- unpackNum a ;
b <- unpackNum b ;
...........
y <- unpackNum y ;
return [a, b, ..., y] } ;
z <- unpackNum z ;
return (Number (op (foldl1 op xs) z))
}
thus, we need only understand the sequencing of two "computations", c
and d
,
do { a <- c ; b <- d ; return (foo a b) }
=
c >>= (\ a ->
d >>= (\ b ->
return (foo a b) ))
for a particular monad involved, which is determined by the bind (>>=
) operator's implementation for a given monad.
Monads are EDSLs for generalized function composition. The sequencing of computations involves not only the explicit expressions appearing in the do
sequence, but also the implicit effects peculiar to the particular monad in question, performed in principled and consistent manner behind the scenes. Which is the whole point to having monads in the first place (well, one of the main points, at least).
Here the monad involved appears to concern itself with the possibility of failure, and early bail-outs in the event that failure indeed happens.
So, with the do
code we write the essence of what we intend to happen, and the possibility of intermittent failure is automatically taken care of, for us, behind the scenes.
In other words, if one of unpackNum
computations fails, so will the whole of the combined computation fail, without attempting any of the subsequent unpackNum
sub-computations. But if all of them succeed, so will the combined computation.
Upvotes: 1
Reputation: 643
You could "sugar this up" with a more beginner-friendly syntax, with the do notation. Your function, numericBinop op params = mapM unpackNum params >>= return . Number . foldl1 op
would become:
numericBinop op params = do
x <- mapM unpackNum params -- "<-" translates to ">>=", the bind operator
return . Number $ foldl1 op x
Well now the most mysterious is the mapM
function, that is sequence . fmap
, and it simply takes a function, fmaps it over the container, and flips the type, in this case (I presume) from [Number Integer]
to ThrowsError [Integer]
, while preserving any errors (side effects) that may arise during the flipping, or in other words, if the 'flipping' caused any error, it would be represented in the result.
Not the simplest example, and you probably would be better off seeing how mapM (fmap (+1)) [Just 2, Just 3]
differs from mapM (fmap (+1)) [Just 2, Nothing]
. For more insights look into Traversable
typeclass.
After that, you bind the [Integer]
out of the ThrowsError
monad and feed it to the function that takes care of doing the computation on the list, resulting in a single Integer
, which in turn you need to re-embed into the ThrowsError
monad with the return
function after you wrap it into a Number
.
If you still have trouble understanding monads, I suggest you take a look at the still relevant LYAH chapter that will gently introduce you to monads
Upvotes: 4
Reputation: 11940
>>=
builds a computation that may fail at either end: its left argument can be an empty monad, in which case it does not even happen, otherwise its result can be empty as well. It has type
>>= :: m a -> (a -> m b) -> m b
See, its arguments are: value(s) immersed into monad and a function that accepts a pure value and returns an immersed result. This operator is a monadic version of what is known as flatMap
in Scala, for instance; in Haskell, its particular implementation for lists is known as concatMap
. If you have a list l
, then l >>= f
works as follows: for each element of l
, f
is applied to this element and returns a list; and all those resultant lists are concatenated to produce result.
Consider a code in Java:
try {
function1();
function2();
}
catch(Exception e) {
}
What happens when function2
is called? See, after the call to function1
the program is probably in a valid state, so function2()
is an operator that transforms this current state into some different next one. But the call to function1()
could result in an exception thrown, so the control would immediately transfer to the catch
-block—this can be regarded as null state, so there's nothing to apply function2
to. In other words, we have the following possible control paths:
[S0] --- function1() --> [S1] --- function2() --> [S2]
[S0] --- function1() --> [] --> catch
(For simplicity, exceptions thrown from function2
are not considered in this diagram.)
So, either [S1]
is a (non-empty) valid machine state, and function2
transforms it further to a (non-empty) valid [S2]
, or it is empty, and thus function2()
is a no-op and never run. This can be summarized in pseudo-code as
S2 <- [S0] >>= function1 >>= function2
Upvotes: 2