Typeclasses: function with default implementation vs separate function

Question

When defining a typeclass, how do you decide whether to include a function in the typeclass definition or to define it separately? For example, what are the differences between these two cases:

class Graph g where
    ...

    insertNode :: g -> Node -> g
    insertNode graph node = ...

vs

class Graph g where
    ...

insertNode :: (Graph g) => g -> Node -> g
insertNode graph node = ...

Luis Casillas · Accepted Answer

I think there are a few elements in tension here. There's the general idea that type class definitions ought to be minimal, and only contain independent functions. As bhelkir's answer explains, if your class supports functions a, b and c, but c can be implemented in terms of a and b, that's an argument for defining c outside of the class.
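As a minimal sketch of that "keep the class minimal" argument: the class, its methods, and the derived function below are all hypothetical names invented for illustration (they are not the asker's Graph class). The derived function works for every instance, but no instance can replace it with a faster version.

```haskell
-- Hypothetical class holding only the independent operations.
class Container f where
    cempty  :: f a
    cinsert :: a -> f a -> f a
    toL     :: f a -> [a]

-- Derived function defined outside the class: it is written once,
-- in terms of the methods, and instances cannot override it.
fromL :: Container f => [a] -> f a
fromL = foldr cinsert cempty

-- A trivial instance, so the sketch can be tried in GHCi.
instance Container [] where
    cempty  = []
    cinsert = (:)
    toL     = id
```

For example, `toL (fromL "abc" :: String)` goes through the list instance and gives back the same elements in the same order.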

But this general idea runs into a few other conflicting issues.

First, there is often more than one minimal set of operations that can equivalently define the same class. The classic definition of Monad in Haskell is this (cleaned up a bit):

class Monad m where
    return :: a -> m a
    (>>=) :: m a -> (a -> m b) -> m b

But it's well known that there are alternative definitions, like this one:

class Applicative m => Monad m where
    join :: m (m a) -> m a

return and >>= are sufficient to implement join, but fmap and join are also sufficient to implement >>= (with pure taking the place of return).
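Both directions can be sketched in a couple of lines. The names joinViaBind and bindViaJoin are made up for this example, and the library function join from Control.Monad stands in for the class method of the alternative definition.

```haskell
import Control.Monad (join)

-- join written using only (>>=).
joinViaBind :: Monad m => m (m a) -> m a
joinViaBind mma = mma >>= id

-- (>>=) written using only fmap and join.
-- Assumption: Control.Monad's join plays the role of the class method here.
bindViaJoin :: Monad m => m a -> (a -> m b) -> m b
bindViaJoin ma f = join (fmap f ma)
```

Either pair of operations can therefore serve as the "primitive" set, with the other defined as an ordinary function.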

The same is true of Applicative. This is the canonical Haskell definition:

class Functor f => Applicative f where
    pure  :: a -> f a
    (<*>) :: f (a -> b) -> f a -> f b

But any of the following is equivalent:

class Functor f => Applicative f where
    unit  :: f ()
    (<*>) :: f (a -> b) -> f a -> f b

class Functor f => Applicative f where
    pure  :: a -> f a
    fpair :: f a -> f b -> f (a, b)

class Functor f => Applicative f where
    unit  :: f ()
    fpair :: f a -> f b -> f (a, b)

class Functor f => Applicative f where
    unit  :: f ()
    liftA2 :: (a -> b -> c) -> f a -> f b -> f c

Given any of these class definitions, you can write any of the methods in any of the others as a derived function outside the class. Why was the first one picked? I can't answer authoritatively, but I think it brings us to the second issue: performance considerations. The fpair operation in several of those definitions combines the f a and f b values by creating tuples, but for most uses of the Applicative class we don't actually want those tuples; we just want to combine values drawn from f a and f b. The canonical definition lets us choose which function does the combining.
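To make the tuple point concrete, here is a rough sketch that takes the unit/fpair formulation as primitive and derives the other operations from it. The class name Monoidal and the derived names pureM, apM and liftA2M are assumptions chosen for this example; only unit and fpair come from the definitions above.

```haskell
-- Hypothetical stand-alone class using the unit/fpair formulation.
class Functor f => Monoidal f where
    unit  :: f ()
    fpair :: f a -> f b -> f (a, b)

-- pure recovered from unit and fmap.
pureM :: Monoidal f => a -> f a
pureM a = fmap (const a) unit

-- (<*>) recovered from fpair: a tuple is built, then taken apart at once.
apM :: Monoidal f => f (a -> b) -> f a -> f b
apM ff fa = fmap (\(f, a) -> f a) (fpair ff fa)

-- liftA2 recovered from fpair, again via an intermediate tuple.
liftA2M :: Monoidal f => (a -> b -> c) -> f a -> f b -> f c
liftA2M g fa fb = fmap (uncurry g) (fpair fa fb)

-- A small instance so the derived functions can be tried out.
instance Monoidal Maybe where
    unit = Just ()
    fpair (Just a) (Just b) = Just (a, b)
    fpair _        _        = Nothing
```

Every call to apM or liftA2M allocates a pair only to destructure it immediately, which is the kind of overhead the canonical pure/<*> formulation avoids.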

Another performance consideration is that even if some methods in a class may be definable in terms of others, these generic definitions may not be optimal for all instances of the class. If we take Foldable as an example, foldMap and foldr are interdefinable, but some types support one more efficiently than the other. So often we have non-minimal class definitions to allow the instances to provide optimized implementations of methods.
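A small sketch of such an optimized override, using the real Foldable class, where length is a method with a default implementation. The Counted type and fromListC are hypothetical, invented for this example: the type caches its element count so the instance can answer length without walking the list.

```haskell
import Data.Foldable (Foldable (..))

-- Hypothetical container: a list paired with its cached element count.
data Counted a = Counted Int [a]

-- Smart constructor that keeps the cached count consistent.
fromListC :: [a] -> Counted a
fromListC xs = Counted (length xs) xs

instance Foldable Counted where
    -- foldr alone is a sufficient minimal definition; the remaining
    -- methods fall back to their defaults.
    foldr f z (Counted _ xs) = foldr f z xs
    -- Override of the default length: read the cached count in O(1)
    -- instead of traversing the whole structure.
    length (Counted n _) = n
```

Here sum, elem and the rest still come from the defaults built on foldr, while length uses the cached field. Had length been an ordinary function outside the class, this instance would have had no way to supply the faster version.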
