Reputation: 9008
In my project I have created a data type, that can hold one of a few types of values:
data PhpValue = VoidValue | IntValue Integer | BoolValue Bool
What I wanted to do now, is to have a simple way of checking if two values of the PhpValue
type are of the same constructor (correct me if I'm confused with the terminology here, but basically what I want to check if both are, for example, are IntValue
, without caring about the particular value).
Here is a function I wrote for that:
sameConstructor :: PhpValue -> PhpValue -> Bool
sameConstructor VoidValue VoidValue = True
sameConstructor (IntValue _) (IntValue _) = True
sameConstructor (BoolValue _) (BoolValue _) = True
sameConstructor _ _ = False
This works as it should, but I don't really like it: if I add more constructors (like FloatValue Float
) I am going to have to rewrite the function, and it will get bigger as my data definition gets bigger.
The Question: Is there a way of writing such a function, so that its implementation doesn't change when I add more constructors?
For the record: I don't want to change the data
definition, I have enough Monads in the rest of my code as it is ;)
Upvotes: 27
Views: 3504
Reputation: 41
Since the definition follows a regular format, you can use Template Haskell to automatically derive such a function for any datatype. I went ahead and wrote a simple package for this since I wasn't fully satisfied with the existing solutions.
First, we define a class
class EqC a where
eqConstr :: a -> a -> Bool
default eqConstr :: Data a => a -> a -> Bool
eqConstr = (==) `on` toConstr
and then a function deriveEqC :: Name -> DecsQ
which will automatically generate instances for us.
The default
is a default signature, and means that when the type is an instance of Data
we can omit the definition of eqConstr
, and fall back to Tikhon's implementation.
The benefit of Template Haskell is that it produces a more efficient function. We can write $(deriveEqC ''PhpValue)
and get an instance that is exactly what we'd write by hand. Take a look at the generated core:
$fEqCPhpValue_$ceqConstr =
\ ds ds1 ->
case ds of _ {
VoidValue ->
case ds1 of _ {
__DEFAULT -> False;
VoidValue -> True
};
IntValue ds2 ->
case ds1 of _ {
__DEFAULT -> False;
IntValue ds3 -> True
};
BoolValue ds2 ->
case ds1 of _ {
__DEFAULT -> False;
BoolValue ds3 -> True
}
}
In contrast, using Data
introduces a good deal of extra indirection by reifying an explicit Constr
for each argument before comparing them for equality:
eqConstrDefault =
\ @ a $dData eta eta1 ->
let {
f
f = toConstr $dData } in
case f eta of _ { Constr ds ds1 ds2 ds3 ds4 ->
case f eta1 of _ { Constr ds5 ds6 ds7 ds8 ds9 ->
$fEqConstr_$c==1 ds ds5
}
}
(There's a lot of other bloat involved in computing toConstr
that's not worth showing)
In practice this leads to the Template Haskell implementation being about twice as fast:
benchmarking EqC/TH
time 6.906 ns (6.896 ns .. 6.915 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 6.903 ns (6.891 ns .. 6.919 ns)
std dev 45.20 ps (32.80 ps .. 63.00 ps)
benchmarking EqC/Data
time 14.80 ns (14.77 ns .. 14.82 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 14.79 ns (14.77 ns .. 14.81 ns)
std dev 60.17 ps (43.12 ps .. 93.73 ps)
Upvotes: 2
Reputation: 48581
If you don't want to use any of the reasonable ways in the other answers, you can use a completely unsupported way that is guaranteed to be fast but not actually guaranteed to give correct results or even not to crash. Note that this will even be happy to try to compare functions, for which it will give utterly bogus results.
{-# language MagicHash, BangPatterns #-}
module DangerZone where
import GHC.Exts (Int (..), dataToTag#)
import Data.Function (on)
{-# INLINE getTag #-}
getTag :: a -> Int
getTag !a = I# (dataToTag a)
sameConstr :: a -> a -> Bool
sameConstr = (==) `on` getTag
One other problem (arguably) is that this peers through newtypes. So if you have
newtype Foo a = Foo (Maybe a)
then
sameConstr (Foo (Just 3)) (Foo Nothing) == False
even though they're built with the Foo
constructor. You can work around this by using a bit of the machinery in GHC.Generics
, but without the runtime cost associated with using unoptimized generics. This gets pretty hairy!
{-# language MagicHash, BangPatterns, TypeFamilies, DataKinds,
ScopedTypeVariables, DefaultSignatures #-}
import Data.Proxy (Proxy (..))
import GHC.Generics
import Data.Function (on)
import GHC.Exts (Int (..), dataToTag#)
--Define getTag as above
class EqC a where
eqConstr :: a -> a -> Bool
default eqConstr :: forall i q r s nt f.
( Generic a
, Rep a ~ M1 i ('MetaData q r s nt) f
, GNT nt)
=> a -> a -> Bool
eqConstr = genEqConstr
-- This is separated out to work around a bug in GHC 8.0
genEqConstr :: forall a i q r s nt f.
( Generic a
, Rep a ~ M1 i ('MetaData q r s nt) f
, GNT nt)
=> a -> a -> Bool
genEqConstr = (==) `on` modGetTag (Proxy :: Proxy nt)
class GNT (x :: Bool) where
modGetTag :: proxy x -> a -> Int
instance GNT 'True where
modGetTag _ _ = 0
instance GNT 'False where
modGetTag _ a = getTag a
The key idea here is that we look at the type-level metadata associated with the generic representation of the type to determine whether or not it's a newtype. If it is, we report its "tag" as 0
; otherwise we use its actual tag.
Upvotes: 0
Reputation: 48581
One popular alternative to Data
is Generic
. I think Data
probably makes more sense in this context, but I figured it would make sense to add this just for completeness.
{-# LANGUAGE DefaultSignatures, TypeOperators, FlexibleContexts #-}
module SameConstr where
import GHC.Generics
import Data.Function (on)
class EqC a where
eqConstr :: a -> a -> Bool
default eqConstr :: (Generic a, GEqC (Rep a)) => a -> a -> Bool
eqConstr = geqConstr `on` from
class GEqC f where
geqConstr :: f p -> f p -> Bool
{-# INLINE geqConstr #-}
geqConstr _ _ = True
instance GEqC f => GEqC (M1 i c f) where
{-# INLINE geqConstr #-}
geqConstr (M1 x) (M1 y) = geqConstr x y
instance GEqC (K1 i c)
instance GEqC (f :*: g)
instance GEqC U1
instance GEqC V1
instance (GEqC f, GEqC g) => GEqC (f :+: g) where
{-# INLINE geqConstr #-}
geqConstr (L1 x) (L1 y) = geqConstr x y
geqConstr (R1 x) (R1 y) = geqConstr x y
geqConstr _ _ = False
Upvotes: 2
Reputation: 54574
In your special case you can use the Show
magic of the compiler:
data PhpValue = VoidValue | IntValue Integer | BoolValue Bool deriving Show
sameConstructor v1 v2 = cs v1 == cs v2 where
cs = takeWhile (/= ' ') . show
Of course depending on the string representation generated by the compiler is very close to a hack...
Upvotes: -1
Reputation: 68152
Take a look at Data.Data
and its toConstr
function. This returns a representation of the constructor which can be compared for equality.
With an extension (you can put {-# LANGUAGE DeriveDataTypeable #-}
at the top of your module), you can have a Data
instance derived for you automatically:
data PhpValue = VoidValue | IntValue Integer | BoolValue Bool
deriving (Typeable, Data)
You should then be able to use the toConstr
function to compare by constructor.
Now the following will be true:
toConstr (BoolValue True) == toConstr (BoolValue False)
Using on
from Data.Function
you can now rewrite sameConstructor
to:
sameConstructor = (==) `on` toConstr
This is the same as
sameConstructor l r = toConstr l == toConstr r
I think the version using on
is easier to read at a glance.
Upvotes: 29
Reputation: 61369
This is known as the expression problem in Haskell and ML-family languages; there are a number of unsatisfactory solutions (including using Data.Typeable
and abusing typeclasses, in Haskell) but no nice solutions.
Upvotes: 6