Tim Diels
Tim Diels

Reputation: 3416

Typeclasses: function with default implementation vs separate function

When defining a typeclass, how do you decide between including/excluding a function in the typeclass' definition? For example, what are the differences between these 2 cases:

class Graph g where
    ...

    insertNode :: g -> Node -> g
    insertNode graph node = ...

vs

class Graph g where
    ...

insertNode :: (Graph g) => g -> Node -> g
insertNode graph node = ...

Upvotes: 16

Views: 2746

Answers (2)

Luis Casillas
Luis Casillas

Reputation: 30227

I think there are a few elements in tension here. There's the general idea that type class definitions ought to be minimal, and only contain independent functions. As bhelkir's answer explains, if your class supports functions a, b and c, but c can be implemented in terms of a and b, that's an argument for defining c outside of the class.

But this general idea runs into a few other conflicting issues.

First, there is often more than one minimal set of operations that can equivalently define the same class. The classic definition of Monad in Haskell is this (cleaned up a bit):

class Monad m where
    return :: a -> m a
    (>>=) :: m a -> (a -> m b) -> m b

But it's well known that there are alternative definitions, like this one:

class Applicative m => Monad m where
    join :: m (m a) -> m a

return and >>= are sufficient to implement join, but fmap, pure and join are also sufficient to implement >>=.

A similar thing with Applicative. This is the canonical Haskell definition:

class Functor f => Applicative f where
    pure  :: a -> f a
    (<*>) :: f (a -> b) -> f a -> f b

But any of the following is equivalent:

class Functor f => Applicative f where
    unit  :: f ()
    (<*>) :: f (a -> b) -> f a -> f b

class Functor f => Applicative f where
    pure  :: a -> f a
    fpair :: f a -> f b -> f (a, b)

class Functor f => Applicative f where
    unit  :: f ()
    fpair :: f a -> f b -> f (a, b)

class Functor f => Applicative f where
    unit  :: f ()
    liftA2 :: (a -> b -> c) -> f a -> f b -> f c

Given any of these class definitions, you can write any of the methods in any of the others as a derived function outside the class. Why was the first one picked? I can't answer authoritatively, but I think it brings us to the third point: performance considerations. The fpair operation in many of those combines the f a and f b values by creating tuples, but for most uses of the Applicative class we don't actually want those tuples, we just want to combine values drawn from f a and f b; the canonical definition allows us to choose what function to do this combination with.

Another performance consideration is that even if some methods in a class may be definable in terms of others, these generic definitions may not be optimal for all instances of the class. If we take Foldable as an example, foldMap and foldr are interdefinable, but some types support one more efficiently than the other. So often we have non-minimal class definitions to allow the instances to provide optimized implementations of methods.

Upvotes: 12

bheklilr
bheklilr

Reputation: 54058

Including a function in the definition of a typeclass means that it can be overridden. In this case, you would need for it to be inside the Graph typeclass since it returns a Graph g => g, and each specific instance of Graph would need to know how to construct that value. Alternatively, you could specify a function in the typeclass for the purpose of constructing values of type Graph g => g and then insertNode could use that function in its result.

Keeping a function outside of the typeclass means that it can't be modified, but also that it doesn't clutter up the class. Consider as an example the mapM function. There's no need for this to be in the Monad class, and you probably don't want people writing their own implementations of mapM, it should do the same thing in all contexts. As another example, consider the function

-- f(x) = 1 + 3x^2 - 5x^3 + 10x^4
aPoly :: Num a => a -> a
aPoly x = 1 + 3 * x * x - 5 * x * x * x + 10 * x * x * x * x

Obviously aPoly shouldn't be part of the Num typeclass, it's just a random function that happens to use Num methods. It has nothing to do with what it means to be a Num.

Really, it comes down to design. Functions are usually specified within a typeclass if they are integral to what it means to be an instance of that typeclass. Sometimes functions are included in a typeclass but with a default definition so that a specific type can overload it to make it more efficient, but for the most part it makes sense to keep class members at a minimum. One way to look at it is by asking the question "Can this function be implemented with only the constraint of the class?" If the answer is no, it should be in the class. If the answer is yes, the vast majority of the time it means that the function should be moved outside of the class. Only when there is value gained from being able to overload it should it be moved into the class. If overloading that function can break other code that expects it to behave a particular way, then it shouldn't be overloaded.

Another case to consider is when you have functions in your typeclass that have sane defaults, but those defaults are mutually dependent. Taking the Num class as an example, you have

class Num a where
    (+) :: a -> a -> a
    (*) :: a -> a -> a
    (-) :: a -> a -> a
    a - b = a + negate b
    negate :: a -> a
    negate a = 0 - a
    abs :: a -> a
    signum :: a -> a
    fromInteger :: Integer -> a

Notice that (-) and negate are both implemented in terms of each other. If you make your own numeric type then you would need to implement either or both of (-) and negate, since otherwise you'll have an infinite loop on your hands. These are useful functions to overload, though, so they both stay inside the typeclass.

Upvotes: 13

Related Questions