Understanding this assignment?

In order to refresh my 20-year-old experience with Haskell I am working through https://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Adding_Variables_and_Assignment, and at one point the following line is introduced to apply op to all the params, e.g. to implement (+ 1 2 3 4):

numericBinop op params = mapM unpackNum params >>= return . Number . foldl1 op

I do not understand the syntax, and the explanation in the text is a bit vague.

I understand what foldl1 does and how to dot functions (unpackNum is a helper function), but using Monads and the >>= operator leaves me a bit confused.

How is this to be read?

Upvotes: 9

Views: 217

Answers (5)

jberryman

Reputation: 16635

Your question is about syntax, so I'll just talk about how to parse that expression. Haskell's syntax is pretty simple. Informally:

  • identifiers separated by spaces are function application (the first identifier applied to the rest)
  • except identifiers that use non-alphanumeric characters (e.g. >>=, or .) are infix (i.e. their first argument is to the left of the identifier)
  • the first type of function application above (non-infix) binds more tightly than the second
  • operators can associate either to the left or right, and have different precedence (defined with an infix... declaration)
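
For example (a made-up one-liner, nothing from the book), the first and third rules alone tell you where the parentheses go:

negate 2 + abs 3   -- parses as (negate 2) + (abs 3), i.e. 1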

So only knowing this, if I see:

mapM unpackNum params >>= return . Number . foldl1 op

To begin with, I know that it must parse as

(mapM unpackNum params) >>= return . Number . (foldl1 op)

To go further we need to inspect the fixity/precedence of the two operators we see in this expression:

Prelude> :info (.)
(.) :: (b -> c) -> (a -> b) -> a -> c   -- Defined in ‘GHC.Base’
infixr 9 .
Prelude> :info (>>=)
class Applicative m => Monad (m :: * -> *) where
  (>>=) :: m a -> (a -> m b) -> m b
  ...
    -- Defined in ‘GHC.Base’
infixl 1 >>=

(.) has a higher precedence (9 vs 1 for >>=), so its arguments will bind more tightly (i.e. we parenthesize them first). But how do we know which of these is correct?

(mapM unpackNum params) >>= ((return . Number) . (foldl1 op))
(mapM unpackNum params) >>= (return . (Number . (foldl1 op)))

...? Because (.) was declared infixr it associates to the right, meaning the second parse above is correct.

As Will Ness points out in the comments, (.) is associative (like e.g. addition) so both of these happen to be semantically equivalent.
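
You can check that associativity quickly in GHCi with any three functions (these particular ones are made up):

Prelude> let f = (+1); g = (*2); h = subtract 3
Prelude> ((f . g) . h) 10 == (f . (g . h)) 10
True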

With a little experience with a library (or the Prelude in this case) you learn to parse expressions with operators correctly without thinking too much.

If after doing this exercise you want to understand what a function does or how it works, then you can click through to the source of the functions you're interested in and replace occurrences of left-hand sides with right-hand sides (i.e. inline the bodies of the functions and terms). Obviously you can do this in your head or in an editor.
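
For instance, here is that substitution done by hand for the right-hand side of the bind, with a hypothetical op = (+) and an already-unpacked list:

   (return . Number . foldl1 (+)) [1, 2, 3]
 = return (Number (foldl1 (+) [1, 2, 3]))   -- inline (.) twice: (f . g) x = f (g x)
 = return (Number ((1 + 2) + 3))            -- inline foldl1
 = return (Number 6)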

Upvotes: 5

chi

Reputation: 116139

Essentially,

mapM unpackNum params >>= return . Number . foldl1 op

is made of two parts.

mapM unpackNum params means: take the list of parameters params. On each item, apply unpackNum: this will produce an Integer wrapped inside the ThrowsError monad. So, it's not exactly a plain Integer, since it has a chance to error out. Anyway, using unpackNum on each item either successfully produces all Integers, or throws some error. In the first case, we produce a new list of type [Integer], in the second one we (unsurprisingly) throw the error. So, the resulting type for this part is ThrowsError [Integer].

The second part is ... >>= return . Number . foldl1 op. Here >>= means: if the first part threw some error, the whole expression throws that error as well. If the part succeeded in producing [Integer] then proceed with foldl1 op, wrap the result as a Number, and finally use return to inject this value as a successful computation.

Overall there are monadic computations here, but you should not think about them too much. The monadic stuff is only propagating the errors around, or carrying plain values if the computation is successful. With a bit of experience, one can concentrate on the successful values only, and let mapM, >>= and return handle the error cases.

By the way, note that while the book uses code like action >>= return . f, this is arguably bad style. One can use, to the same effect, fmap f action or f <$> action, which is a more direct way to express the same computation. E.g.

Number . foldl1 op <$> mapM unpackNum params

which is very close to a non-monadic code which ignores the error cases

-- this would work if there was no monad around, and errors did not have to be handled
Number . foldl1 op $ map unpackNum params
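
If you want to watch the error propagation concretely, here is a small self-contained sketch with Either String standing in for the book's ThrowsError and a toy Number wrapper (these definitions are assumptions for illustration, not the book's):

type ThrowsError' = Either String                -- stand-in for ThrowsError
newtype Number' = Number' Integer deriving Show  -- stand-in for the Number constructor

unpackNum' :: Integer -> ThrowsError' Integer
unpackNum' n | n >= 0    = Right n
             | otherwise = Left "not a number"

ok, bad :: ThrowsError' Number'
ok  = Number' . foldl1 (+) <$> mapM unpackNum' [1, 2, 3]   -- Right (Number' 6)
bad = Number' . foldl1 (+) <$> mapM unpackNum' [1, -2, 3]  -- Left "not a number"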

Upvotes: 8

Will Ness

Reputation: 71065

First, syntax. Whitespace is application, semantically:

f x = f $ x  -- "call" f with an argument x

so your expression is actually

numericBinop op params = ((mapM unpackNum) params) >>= return . Number . (foldl1 op)

Next, the operators are built from non-alphanumeric characters, without any whitespace. Here, there's . and >>=. Running :i (.) and :i (>>=) at GHCi reveals their fixity specs are infixr 9 . and infixl 1 >>=. 9 is above 1 so . binds more tightly than >>=; thus

                       = ((mapM unpackNum) params) >>= (return . Number . (foldl1 op))

infixr 9 . means . associates to the right, thus, finally, it is

                       = ((mapM unpackNum) params) >>= (return . (Number . (foldl1 op)))

The (.) is defined as (f . g) x = f (g x), thus (f . (g . h)) x = f ((g . h) x) = f (g (h x)) = (f . g) (h x) = ((f . g) . h) x; by eta-reduction we have

(f . (g . h)) = ((f . g) . h) 

thus (.) is associative, and so parenthesization is optional. We'll drop the explicit parens with the "whitespace" application from now on as well. Thus we have

numericBinop op params = (mapM unpackNum params)  >>= 
          (\ x -> return (Number (foldl1 op x)))   -- '\' stands for λ (lambda)

Monadic sequences are easier written with do, and the above is equivalent to

      = do 
          {  x <- mapM unpackNum params        -- { ; } are optional, IF all 'do'
          ;  return (Number (foldl1 op x))     --   lines are indented at the same level
          }

Next, mapM can be defined as

    mapM f [] = return []
    mapM f (x:xs) = do { x  <- f x ;
                         xs <- mapM f xs ;
                         return (x : xs) }

and the Monad Laws demand that

      do { r <- do { x ;       ===      do { x ;
                     y } ;                   r <- y ;  
           foo r                             foo r 
         }                                 }

(you can find an overview of do notation e.g. in this recent answer of mine); thus,

numericBinop op [a, b, ..., z] =
   do {
         a <- unpackNum a ;
         b <- unpackNum b ;
         ...........
         z <- unpackNum z ;
         return (Number (foldl1 op [a, b, ..., z]))
      }

(you may have noticed my use of x <- x bindings -- we can use the same variable name on both sides of <-, because monadic bindings are not recursive -- thus introducing shadowing.)

This is now clearer, hopefully.

But, I said "first, syntax". So now, the meaning of it. By same Monad Laws,

numericBinop op [a, b, ..., y, z] =
   do {
         xs <- do { a <- unpackNum a ;
                    b <- unpackNum b ;
                    ...........
                    y <- unpackNum y ;
                    return [a, b, ..., y] } ;
         z <- unpackNum z ;
         return (Number (op (foldl1 op xs) z))
      }

thus, we need only understand the sequencing of two "computations", c and d,

do { a <- c ; b <- d ; return (foo a b) }
=
          c >>= (\ a ->
                   d >>= (\ b ->     
                       return (foo a b) ))

for a particular monad involved, which is determined by the bind (>>=) operator's implementation for a given monad.
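
For instance, with Maybe as the monad (a made-up example, not from the book), the do block and its desugared form are:

safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

demo :: Maybe Int
demo = do { a <- safeDiv 10 2 ; b <- safeDiv 9 3 ; return (a + b) }
    -- = safeDiv 10 2 >>= (\ a -> safeDiv 9 3 >>= (\ b -> return (a + b)))
    -- = Just 8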

Monads are EDSLs for generalized function composition. The sequencing of computations involves not only the explicit expressions appearing in the do sequence, but also the implicit effects peculiar to the particular monad in question, performed in principled and consistent manner behind the scenes. Which is the whole point to having monads in the first place (well, one of the main points, at least).

Here the monad involved appears to concern itself with the possibility of failure, and early bail-outs in the event that failure indeed happens.

So, with the do code we write the essence of what we intend to happen, and the possibility of intermittent failure is automatically taken care of, for us, behind the scenes.

In other words, if one of unpackNum computations fails, so will the whole of the combined computation fail, without attempting any of the subsequent unpackNum sub-computations. But if all of them succeed, so will the combined computation.
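
A quick way to see that bail-out behaviour, again with Either standing in for the error monad (a made-up helper):

half :: Int -> Either String Int
half x | even x    = Right (x `div` 2)
       | otherwise = Left ("odd: " ++ show x)

-- mapM half [4, 6, 8]  ==  Right [2, 3, 4]
-- mapM half [4, 5, 8]  ==  Left "odd: 5"    -- 8 is never even looked at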

Upvotes: 1

Welperooni

Reputation: 643

You could "sugar this up" with a more beginner-friendly syntax, with the do notation. Your function, numericBinop op params = mapM unpackNum params >>= return . Number . foldl1 op would become:

numericBinop op params = do
    x <- mapM unpackNum params -- "<-" translates to ">>=", the bind operator
    return . Number $ foldl1 op x

Now the most mysterious part is the mapM function, which is mapM f = sequence . fmap f: it maps the function over the container and then flips the type inside out, in this case from [ThrowsError Integer] to ThrowsError [Integer], while preserving any errors (effects) that arise along the way. In other words, if the 'flipping' hits an error, that error is represented in the result.

Not the simplest example; you would probably be better off seeing how mapM (fmap (+1)) [Just 2, Just 3] differs from mapM (fmap (+1)) [Just 2, Nothing], as shown below. For more insight, look into the Traversable typeclass.
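
In GHCi the comparison looks like this (Maybe playing the role that ThrowsError plays in the book):

Prelude> mapM (fmap (+1)) [Just 2, Just 3]
Just [3,4]
Prelude> mapM (fmap (+1)) [Just 2, Nothing]
Nothing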

After that, you bind the [Integer] out of the ThrowsError monad and feed it to the function that takes care of doing the computation on the list, resulting in a single Integer, which in turn you need to re-embed into the ThrowsError monad with the return function after you wrap it into a Number.

If you still have trouble understanding monads, I suggest you take a look at the still-relevant LYAH chapter, which will gently introduce you to monads.

Upvotes: 4

bipll

Reputation: 11940

>>= builds a computation that may fail at either end: its left argument can be an "empty" (failed) monadic value, in which case the rest of the computation does not even happen; otherwise its own result can be empty as well. It has the type

(>>=) :: m a -> (a -> m b) -> m b

See, its arguments are: value(s) immersed into the monad, and a function that accepts a pure value and returns an immersed result. This operator is the monadic version of what is known as flatMap in Scala, for instance; in Haskell, its particular implementation for lists is (a flipped) concatMap. If you have a list l, then l >>= f works as follows: f is applied to each element of l and returns a list, and all those resulting lists are concatenated to produce the result.
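
For example, with lists (not the monad used in the book, but the easiest one to see the concatenation in):

Prelude> [1,2,3] >>= \x -> [x, x*10]
[1,10,2,20,3,30]
Prelude> concatMap (\x -> [x, x*10]) [1,2,3]
[1,10,2,20,3,30]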

Consider a code in Java:

try {
    function1();
    function2();
}
catch(Exception e) {
}

What happens when function2 is called? See, after the call to function1 the program is probably in a valid state, so function2() is an operator that transforms this current state into some different next one. But the call to function1() could result in an exception being thrown, so control would immediately transfer to the catch block; this can be regarded as a null state, so there's nothing to apply function2 to. In other words, we have the following possible control paths:

[S0]  --- function1() -->  [S1]  --- function2() -->  [S2]
[S0]  --- function1() -->  []    --> catch

(For simplicity, exceptions thrown from function2 are not considered in this diagram.)

So, either [S1] is a (non-empty) valid machine state, and function2 transforms it further to a (non-empty) valid [S2], or it is empty, and thus function2() is a no-op and never run. This can be summarized in pseudo-code as

S2 <- [S0] >>= function1 >>= function2
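
Translated into Haskell with Maybe as the "valid state or null" monad (function1 and function2 here are hypothetical stand-ins for the Java calls above):

function1, function2 :: Int -> Maybe Int
function1 s = if s < 0 then Nothing else Just (s + 1)  -- may "throw"
function2 s = Just (s * 2)                             -- always succeeds

run :: Int -> Maybe Int
run s0 = Just s0 >>= function1 >>= function2
-- run 1    == Just 4
-- run (-1) == Nothing    (function2 is never applied)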

Upvotes: 2
