Bradley Nowacki
Bradley Nowacki

Reputation: 261

Monoids and Num in Haskell

I have been learning Haskell over the past few months and I came across an example of Monoids that has me puzzled.

Given these definitions:

data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show, Read, Eq) 

instance F.Foldable Tree where  
    foldMap f Empty = mempty  
    foldMap f (Node x l r) = F.foldMap f l `mappend`  
                             f x           `mappend`  
                             F.foldMap f r  

And this Tree:

testTree = Node 5  
        (Node 3  
            (Node 1 Empty Empty)  
            (Node 6 Empty Empty)  
        )  
        (Node 9  
            (Node 8 Empty Empty)  
            (Node 10 Empty Empty)  
        )  

If I run:

ghci> F.foldl (+) 0 testTree  
42  
ghci> F.foldl (*) 1 testTree  
64800  

How does GHCi know what Monoid to use for the mappend when it's folding? Because by default the numbers in the tree are just of the type Num, and we never explicitly said they where of some Monoid such as Sum or Product.

So how does GHCi infer the correct Monoid to use? Or am I totally off at this point?

Example source: http://learnyouahaskell.com/functors-applicative-functors-and-monoids#monoids

Upvotes: 18

Views: 1488

Answers (3)

Will Ness
Will Ness

Reputation: 71070

It doesn't need to. foldl is translated into foldr which translates into foldMap over Endo which means function composition which means simple nesting of the function you have supplied.

Or something. Meaning, foldl could be translated into foldMap over Dual . Endo which composes left-to-right, etc..

update: yes, the docs says:

Foldable instances are expected to satisfy the following laws:

foldr f z t = appEndo (foldMap (Endo . f) t ) z
foldl f z t = appEndo (getDual (foldMap (Dual . Endo . flip f) t)) z  -- << --
fold = foldMap id

Dual (Endo f) <> Dual (Endo g) = Dual (Endo g <> Endo f) = Dual (Endo (g . f)). So when appEndo strikes, the chain of functions that's been built, i.e.

    ((+10) . (+9) . (+8) . (+5) . ... . (+1))

or equivalent (here, shown for the (+) case), is applied to the user-supplied value -- in your case,

                                                 0

Another thing to notice is that Endo and Dual are newtypes, so all these machinations will be done by compiler, and be gone by run time.

Upvotes: 8

willeM_ Van Onsem
willeM_ Van Onsem

Reputation: 476557

Short answer: it is a type constraint in the signature of foldMap.

If we look to the source code of Foldable (more specifically foldMap), we see:

class Foldable (t :: * -> *) where
  ...
  foldMap :: Monoid m => (a -> m) -> t a -> m

So that means that if we declare Tree a member of Foldable (not that Tree has kind * -> *), it means that a foldMap is defined over that tree, such that: foldMap :: Monoid m => (a -> m) -> Tree a -> m. So this thus means that the result type (and the result of the function passed to foldMap) m must be a Monoid.

Haskell is statically typed: after compile time, Haskell knows exactly the types that are passed in an out of every function instance. So that means it knows for instance what the output type will be, and thus how to handle it.

Now Int is not an instance of Monoid. You here use F.foldl (+) 0 testTree, so that means that you more or less have constructed an "ad hoc" monoid. This works if we look at the source code of foldl:

foldl :: (b -> a -> b) -> b -> t a -> b
foldl f z t = appEndo (getDual (foldMap (Dual . Endo . flip f) t)) z

This has a lot of logic surrounding the parameters f, z and t. So let us break that down first.

Let us first take a look at Dual . Endo . flip f. This is short for:

helper = \x -> Dual (Endo (\y -> f y x))

Dual and Endo are types with each one constructor that takes one parameter. So we wrap the outcome of f y x in the Dual (Endo ...) constructors.

We will use this as the function of foldMap. If our f has type a -> b -> a, then this function has type b -> Dual (Endo a). So the output type of the function passed to foldMap has output type Dual (Endo a). Now if we inspect the source code, we see two intersting things:

instance Monoid (Endo a) where
        mempty = Endo id
        Endo f `mappend` Endo g = Endo (f . g)

instance Monoid a => Monoid (Dual a) where
        mempty = Dual mempty
        Dual x `mappend` Dual y = Dual (y `mappend` x)

(note that it is y `mappend` x, not x `mappend` y).

So what happens here is that the mempty that is used in the foldMap is mempty = Dual mempty = Dual (Endo id). So a Dual (Endo ...) that encapsulates the identity function.

Furthermore the mappend of two duals comes down to function composition of the values in Endo. So:

mempty = Dual (Endo id)
mappend (Dual (Endo f)) (Dual (Endo g)) = Dual (Endo (g . f))

So that means that if we fold over the tree, in case we see an Empty (a leaf), we will return mempty, and in case we see a Node x l r, we will perform the mappend as described above. So a "specialized" foldMap will look like:

-- specialized foldMap
foldMap f Empty = Dual (Endo id)  
foldMap f (Node x l r) =  Dual (Endo (c . b . a))
        where Dual (Endo a) = foldMap f l
              Dual (Endo b) = helper x
              Dual (Endo c) = foldMap f l

So for every Node we make a function composition right-to-left over the the children and item of the node. a and c can be compositions of the tree as well (since these are recursive calls). In case of a Leaf, we do nothing (we return id, but the composition over an id is a no-op).

So that means that if we have a tree:

5
|- 3
|  |- 1
|  `- 6
`- 9
   |- 8
   `- 10

This will result in a function:

(Dual (Endo ( (\x -> f x 10) .
              (\x -> f x 9) .
              (\x -> f x 8) .
              (\x -> f x 5) .
              (\x -> f x 6) .
              (\x -> f x 3) .
              (\x -> f x 1)
            )
      )
)

(omitted the identities, to make it a cleaner). This is the outcome of getDual (foldMap (Dual . Endo . flip f)). But now we need to post process this result. With getDual, we get the content wrapped in the Dual constructor. So now we have:

Endo ( (\x -> f x 10) .
       (\x -> f x 9) .
       (\x -> f x 8) .
       (\x -> f x 5) .
       (\x -> f x 6) .
       (\x -> f x 3) .
       (\x -> f x 1)
     )

and with appEndo, we obtain the function wrapped in Endo, so:

( (\x -> f x 10) .
  (\x -> f x 9) .
  (\x -> f x 8) .
  (\x -> f x 5) .
  (\x -> f x 6) .
  (\x -> f x 3) .
  (\x -> f x 1)
)

and then we apply this to z the "initial" value. So that means that we will process the chain starting with z (the initial element), and apply it like:

f (f (f (f (f (f (f z 1) 3) 6) 5) 8) 9) 10

So we have constructed some sort of Monoid, where mappend is replaced by f, and mempty as a no-op (the identity function).

Upvotes: 19

Asad Saeeduddin
Asad Saeeduddin

Reputation: 46628

There is (implicitly, if not explicitly), a monoid instance for ordinary functions of the form a -> a, where mappend corresponds to function composition, and mempty corresponds to the id function.

What is (+)? It is a function (Num a) => a -> a -> a. If you foldMap over your foldable full of numbers with +, you turn each one into the partially applied (+ <some number), which is a a -> a. Lo and behold, you've found the magic f that will turn everything in your foldable into a monoid!

Assuming there was a direct monoid instance for functions, you'd be able to do:

foldMap (+) [1, 2, 3, 4]

, which would produce an ultimate (Num a) => a -> a that you could apply to 0 to get 10.

There is no such direct instance however, so you need to use the builtin newtype wrapper Endo and the corresponding unwrapper appEndo, which implement monoid for a -> a functions. Here's what it looks like:

Prelude Data.Monoid> (appEndo (foldMap (Endo . (+)) [1, 2, 3, 4])) 0
10

Here Endo . is just our annoying need to lift plain a -> as so they have their natural Monoid instance. After the foldMap is done reducing our foldable by turning everything into a -> as and chaining them together with composition, we extract the final a -> a using appEndo, and finally apply it to 0.

Upvotes: 4

Related Questions