Reputation: 621
I'm having trouble grappling with how to turn a sum type into a string without doing a whole bunch of pattern matching. A [simplified] example might be something like this:
data Shape = Rectangle Int Float Float |
Circle Int Float |
Ellipse Int Float Float
deriving (Show, Eq)
What I want is a function renderShape :: Shape -> String
that takes a Shape
and gives me a string that represents the argument in a certain way (I'm generating lines in an input file for a simulation program). Note that I would like this to be distinct from show
, as it is doing a bit of extra formatting beyond just turning the data structure into a string. What I actually need it to do is kind of complicated, but for example purposes (because if you can explain generally how to do this I think I can work out the rest), let's just say I want to stick a semicolon on the end and separate with a comma e.g.
renderShape $ Rectangle 1 2 2
>> "Rectangle,1,2,2;"
Specifically, what I'm struggling with is that the different constructors for the type have varying number of arguments. I could do something with pattern matching, like
renderShape (Rectangle i x y) = (intercalate "," ["Rectangle", show i, show x, show y]) ++ ";"
renderShape (Circle i x) = ...
...
but I'd prefer not to as in reality I have a ton of these types of shapes so it'd be awfully tedious.
I've found something that "works" by abusing the derived show
instance and simply transforming the string, but it seems really ugly:
{-# LANGUAGE OverloadedStrings #-}
import Data.List (intercalate)
renderShape :: Shape -> String
renderShape s = intercalate "," theWords ++ ";"
where theWords = (words . show) s
So my question is, what's the "nice" way to do this? It seems to me like there has to be a simple and clean way to go about it, but I can't figure it out for the life of me. I'm also not hugely experienced with Haskell, so if my whole approach is fundamentally wrong or non-idiomatic then I'd welcome alternatives.
Upvotes: 1
Views: 262
Reputation: 10228
This is an ideal usage of generics. Our strategy will be to convert the datatype into its generic representation (from :: (Generic a) => a -> Rep a
) and then recurse into Rep a
. Rep a
is actually a type function, so let's see what it looks like:
λ> :info Shape
[lots of garbage]
type instance Rep Shape
= D1
Main.D1Shape
(C1
Main.C1_0Shape
(S1 NoSelector (Rec0 Int)
:*: (S1 NoSelector (Rec0 Float) :*: S1 NoSelector (Rec0 Float)))
:+: (C1
Main.C1_1Shape
(S1 NoSelector (Rec0 Int) :*: S1 NoSelector (Rec0 Float))
:+: C1
Main.C1_2Shape
(S1 NoSelector (Rec0 Int)
:*: (S1 NoSelector (Rec0 Float) :*: S1 NoSelector (Rec0 Float)))))
Oof. We're in higher-kinded territory here. To traverse this data structure we'll need to recurse into the type itself. Whereas value recursion works by repeatedly calling a function on smaller and smaller parts of a data structure, we will create a one-method class called Jason
and then instantiate the method on smaller and smaller parts of the type.
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeOperators #-}
import GHC.Generics
data Shape = Rectangle Int Float Float |
Circle Int Float |
Ellipse Int Float Float
deriving (Show, Eq, Generic)
class Jason a where
jasonShow :: a -> String
-- Integers can use the regular show.
instance Jason Float where jasonShow = show
-- So can floats.
instance Jason Int where jasonShow = show
-- Constant values are easy.
instance (Jason c) => Jason (K1 i c p) where jasonShow = jasonShow . unK1
-- Use generics to pattern match into constructors.
instance (Jason (f p), Constructor c) => Jason (M1 C c f p) where
jasonShow constructor@(M1 x) = conName constructor ++ "," ++ jasonShow x
-- We don't care about selectors.
instance (Jason (f p)) => Jason (M1 S c f p) where
jasonShow (M1 x) = jasonShow x
-- Or whether something is a datatype or not.
instance (Jason (f p)) => Jason (M1 D c f p) where
jasonShow (M1 x) = jasonShow x
-- We don't care about the index into the disjoint union (a.k.a. "|") of a datatype.
instance (Jason (f p), Jason (g p)) => Jason ((f :+: g) p) where
jasonShow (L1 x) = jasonShow x
jasonShow (R1 x) = jasonShow x
-- We want to insert a comma when we encounter a product in the datatype.
instance (Jason (f p), Jason (g p)) => Jason ((f :*: g) p) where
jasonShow (x :*: y) = jasonShow x ++ "," ++ jasonShow y
renderShape :: Shape -> String
renderShape = (++ ";") . jasonShow . from
To output commas for products and the name of the constructor, we special cased those instances:
M1 C c f p
is a type for the meta-information (M
) for a constructor (C
) wrapped around a value f p
. Calling conName
on a value of this type will helpfully return a string representing the name of the constructor.
(f :*: g) p
is a type for the product of two values of type f p
and g p
. Earlier, when we queried for the type of our rectangle, you can see this appear in a couple of places. For example M1 S NoSelector (Rec0 Float) :*: M1 S NoSelector (Rec0 Float)
(I expanded out the type S1 = M1 S
alias). You can think of :*:
as the glue between the numbers in Rectangle 1 2 2
.
Our base cases in the recursion are Int
and Float
. We can keep defining more base cases, but our shape is just ints and floats.
This is pretty weird code! You don't see so so many esoteric datatypes with weird two-character names everyday. But we did arrive at this nice result:
λ> renderShape (Rectangle 1 2 2)
"Rectangle,1,2.0,2.0;"
λ> renderShape (Circle 1 2)
"Circle,1,2.0;"
λ> renderShape (Ellipse 1 2 2)
"Ellipse,1,2.0,2.0;"
It's also a lot of code. But it's generic code, which means you can reuse it with other, non-Shape datatypes.
DeriveGeneric: allows derive (Generic)
, which automatically instances Generic
for our datatype Shape
.
FlexibleContexts: without this, we can't say instance (Jason (f p), Constructor c) => Jason (M1 C c f p)
. The Haskell spec disallows constraints like Jason (f p)
. Jason p
is OK, but Jason (f p)
is no go. Fortunately, GHC is flexible.
FlexibleInstances: allows destruct M1 i c f p
into three different instances: M1 C c f p
, M1 D c f p
, and M1 S c f p
. Normally disallowed by the spec.
TypeOperators: allows the usage of infix type functions :*:
and :+:
.
This implementation of Show (just regular Show) with generics
You could've invented GHC.Generics – a great tutorial
Upvotes: 3
Reputation: 91867
You could derive Data.Data
for your ADT, which exposes a generic descriptor for any custom data type. Then you can write a function that renders any instance of Data
in the not-quite-like-Show fashion you need. It's certainly overkill for a single smallish type like Shape, but you could reuse it for multiple, larger types.
I'm not totally familiar with the API for doing so, but http://chrisdone.com/posts/data-typeable ends with an example of writing a generic function with a signature close to gshow :: Data d => d -> String
.
Upvotes: 2