flupe
flupe

Reputation: 183

Haskell not using the more specific instance of a typeclass

I've been having trouble the past few days figuring out whether something I'm trying to do is actually feasible in Haskell.

Here is some context: I am trying to code a little markup language (akin to ReST) where the syntax already enables custom extensions through directives. For users to implement new directives, they should be able to add new semantic constructs inside the document datatype. For exemple if one wants to add a directive for displaying math, they might want to have a MathBlock String constructor inside the ast.

Obviously data types are not extensible, and a solution where there is a generic constructor DirectiveBlock String containing the name of the directive (here, "math") is undesirable as we would like to have in our ast only well-formed constructs (so only directives with well-formed arguments).

Using type families, I prototyped something like:

{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE FlexibleInstances #-}

-- Arguments for custom directives.
data family Args :: * -> *

data DocumentBlock
    = Paragraph String
    | forall a. Block (Args a)

Sure enough, if someone wishes to define a new directive for math display, they can do it as such:

data Math
-- The expected arguments for the math directive.
data instance Args Math = MathArgs String

doc :: [DocumentBlock]
doc =
    [ Paragraph "some text"
    , Block (MathArgs "x_{n+1} = x_{n} + 3")
    ]

So far so good, we can only construct documents where directive blocks receive the correct arguments.

The problem arises when one user wants to convert the internal representation of a document to some custom output, say, String. The user needs to provide a default output for all directives, since there will be many and some of them cannot be converted to the target. Furthermore, the user may wish to provide a more specific output for some directives:

class StringWriter a where
    write :: Args a -> String

-- User defined generic conversion for all directives.
instance StringWriter a where
   write _ = "Directive"

-- Custom way of showing the math directive.
instance StringWriter Math where
    write (MathArgs raw) = "Math(" ++ raw ++ ")"

-- Then to display a DocumentBlock
writeBlock :: DocumentBlock -> String
writeBlock (Paragraph t) = "Paragraph(" ++ t ++ ")"
writeBlock (Block args)  = write args

main :: IO ()
main = putStrLn $ writeBlock (Block (MathArgs "a + b"))

With this example, the output is Block and not Math(a+b), so the generic instance for StringWriter is always chosen. Even when playing with {-# OVERLAPPABLE #-}, nothing succeeds.

Is the kind of behavior I'm describing possible at all in Haskell?


When trying to include a generic Writer inside the Block definition, it also fails to compile.

-- ...

class Writer a o where
    write :: Args a -> o

data DocumentBlock
    = Paragraph String
    | forall a o. Writer a o => Block (Args a)

instance {-# OVERLAPPABLE #-} Writer a String where
   write _ = "Directive"

instance {-# OVERLAPS #-} Writer Math String where
    write (MathArgs raw) = "Math(" ++ raw ++ ")"

-- ...

Upvotes: 5

Views: 298

Answers (1)

chi
chi

Reputation: 116139

Your code does not compile, since Block something has type DocumentBlock, while write expects an Args a argument, and the two types are different. Did you mean writeBlock instead? I'll assume so.

What you might want to try is to add a constraint in your existential type, e.g.:

data DocumentBlock
    = Paragraph String
    | forall a. StringWriter a => Block (Args a)
             -- ^^^^^^^^^^^^^^ --

This has the following effect. Operationally, every time Block something is used, the instance is remembered (a pointer is implicitly stored along the Args a value). That will be a pointer to the catch-all instance, or to the specific one, whichever is the best fit.

When the constructor is then pattern-matched later on, the instance can then be used. Full working code:

{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE FlexibleInstances #-}

-- Arguments for custom directives.
data family Args :: * -> *

data DocumentBlock
    = Paragraph String
    | forall a. StringWriter a => Block (Args a)

data Math
-- The expected arguments for the math directive.
data instance Args Math = MathArgs String

doc :: [DocumentBlock]
doc =
    [ Paragraph "some text"
    , Block (MathArgs "x_{n+1} = x_{n} + 3")
    ]

class StringWriter a where
    write :: Args a -> String

-- User defined generic conversion for all directives.
instance {-# OVERLAPPABLE #-} StringWriter a where
   write _ = "Directive"

-- Custom way of showing the math directive.
instance StringWriter Math where
    write (MathArgs raw) = "Math(" ++ raw ++ ")"

-- Then to display a DocumentBlock
writeBlock :: DocumentBlock -> String
writeBlock (Paragraph t) = "Paragraph(" ++ t ++ ")"
writeBlock (Block args)  = write args

main :: IO ()
main = putStrLn $ writeBlock (Block (MathArgs "a + b"))

This prints Math(a + b).

A final note: for this to work it is crucial that all the relevant instances are in scope when Block is used. Otherwise, GHC might choose the wrong instance, causing some unintended output. This is the main limitation, making overlapping instances a bit fragile in general. As long as there are no orphan instances, this should work.

Also note that, if using other existential types, a user can (intentionally or accidentally) cause GHC to pick the wrong instance anyway. For instance, if we use

data SomeArgs = forall a. SomeArgs (Args a)

toGenericInstance :: DocumentBlock -> DocumentBlock
toGenericInstance (Block a) = case SomeArgs a of
   SomeArgs a' -> Block a'  -- this will always pick the generic instance
toGenericInstance db = db

then, writeBlock (toGenericInstance (Block (MathArgs "a + b"))) will produce Directive instead.

Upvotes: 2

Related Questions