Reputation: 3080
In Haskell id
function is defined on type level as id :: a -> a
and implemented as just returning its argument without any modification, but if we have some type introspection with TypeApplications
we can try to modify values without breaking type signature:
{-# LANGUAGE AllowAmbiguousTypes, ScopedTypeVariables, TypeApplications #-}
module Main where
class TypeOf a where
typeOf :: String
instance TypeOf Bool where
typeOf = "Bool"
instance TypeOf Char where
typeOf = "Char"
instance TypeOf Int where
typeOf = "Int"
tweakId :: forall a. TypeOf a => a -> a
tweakId x
| typeOf @a == "Bool" = not x
| typeOf @a == "Int" = x+1
| typeOf @a == "Char" = x
| otherwise = x
This fail with error:
"Couldn't match expected type ‘a’ with actual type ‘Bool’"
But I don't see any problems here (type signature satisfied):
My question is:
id
function must not to do any modifications on term level. Or can we have many implementations of id :: a -> a
function (I see that in practice we can, I can implement such a function in Python for example, but what the theory behind Haskell says to this?)Upvotes: 7
Views: 255
Reputation: 71485
The other answers are both very good for covering the ways you can do something like this in Haskell. But I thought it was worth adding something speaking more to this part of the question:
- If we can't, that is theoretical\philosophical etc reasons for this?
Actually Haskellers do generally rely quite strongly on the theory that forbids something like your tweakId
from existing with type forall a. a -> a
. (Even though there are ways to cheat, using things like unsafeCoerce
; this is usually considered bad style if you haven't done something like in leftaroundabout's answer, where a class with functional dependencies ensures the unsafe coerce is always valid)
Haskell uses parametric polymorphism1. That means we can write code that works on multiple types because it will treat them all the same; the code only uses operations that will work regardless of the specific type it is invoked on. This is expressed in Haskell types by using type variables; a function with a variable in its type can be used with any type at all substituted for the variable, because every single operation in the function definition will work regardless of what type is chosen.
About the simplest example is indeed the function id
, which might be defined like this:
id :: forall a. a -> a
id x = x
Because it's parametrically polymorphic, we can simply choose any type at all we like and use id
as if it was defined on that type. For example as if it were any of the following:
id :: Bool -> Bool
id x = x
id :: Int -> Int
id x = x
id :: Maybe (Int -> [IO Bool]) -> Maybe (Int -> [IO Bool])
id x = x
But to ensure that the definition does work for any type, the compiler has to check a very strong restriction. Our id
function can only use operations that don't depend on any property of any specific type at all. We can't call not x
because the x
might not be a Bool
, we can't call x + 1
because the x
might not be a number, can't check whether x
is equal to anything because it might not be a type that supports equality, etc, etc. In fact there is almost nothing you can do with x
in the body of id
. We can't even ignore x
and return some other value of type a
; this would require us to write an expression for a value that can be of any type at all and the only things that can do that are things like undefined
that don't evaluate to a value at all (because they throw exceptions). It's often said that in fact there is only one valid function with type forall a. a -> a
(and that is id
)2.
This restriction on what you can do with values whose type contains variables isn't just a restriction for the sake of being picky, it's actually a huge part of what makes Haskell types useful. It means that just looking at the type of a function can often tell you quite a bit about what it can possibly do, and once you get used to it Haskellers rely on this kind of thinking all the time. For example, consider this function signature:
map :: forall a b. (a -> b) -> [a] -> [b]
Just from this type (and the assumption that the code doesn't do anything dumb like add in extra undefined
elements of the list) I can tell:
map
will have no other way of producing values of type b
to put in the list (except undefined
, etc).map
will have no way of getting any a
values to feed to the function (except undefined
, etc)map
ultimately has no way of testing any property of the a
and b
values to decide which order they should go in. For example it might leave out the third element, or swap the 2nd and 76th elements if there are at least 100 elements, etc. But if it applies rules like that it will have to always apply them, regardless of the actual items in the list. It cannot e.g. drop the 4th element if it is less than the 5th element, or only keep outputs from the function that are "truthy", etc.None of this would be true if Haskell allowed parametrically polymorphic types to have Python-like definitions that check the type of their arguments and then run different code. Such a definition for map
could check if the function is supposed to return integers and if so return [1, 2, 3, 4]
regardless of the input list, etc. So the type checker would be enforcing a lot less (and thus catching fewer mistakes) if it worked this way.
This kind of reasoning is formalised in the concept of free theorems; it's literally possible to derive formal proofs about a piece of code from its type (and thus get theorems for free). You can google this if you're interested in further reading, but Haskellers generally use this concept informally rather doing real proofs.
Sometimes we do need non-parametric polymorphism. The main tool Haskell provides for that is type classes. If a type variable has a class constraint, then there will be an interface of class methods provided by that constraint. For example the Eq a
constraint allows (==) :: a -> a -> Bool
to be used, and your own TypeOf a
constraint allows typeOf @a
to be used. Type class methods do allow you to run different code for different types, so this breaks parametricity. Even just adding Eq a
to the type of map
means I can no longer assume property 3 from above.
map :: forall a b. Eq a => (a -> b) -> [a] -> [b]
Now map
can tell whether some of the items in the original list are equal to each other, so it can use that to decide whether to include them in the result, and in what order. Likewise Monoid a
or Monoid b
would allow map
to break the first two properties by using mempty :: a
to produce new values that weren't in the list originally or didn't come from the function. If I add Typeable
constraints I can't assume anything, because the function could do all of the Python-style checking of types to apply special-case logic, make use of existing values it knows about if a
or b
happen to be those types, etc.
So something like your tweakId
cannot be given the type forall a. a -> a
, for theoretical reasons that are also extremely practically important. But if you need a function that behaves like your tweakId
adding a class constraint was the right thing to do to break out of the constraints of parametricity. However simply being able to get a String
for each type isn't enough; typeOf @a == "Int"
doesn't tell the type checker that a
can be used in operations requiring an Int
. It knows that in that branch the equality check returned True
, but that's just a Bool
; the type checker isn't able to reason backwards to why this particular Bool
is True
and deduce that it could only have happened if a
were the type Int
. But there are alternative constructs using GADTs that do give the type checker additional knowledge within certain code branches, allowing you to check types at runtime and use different code for each type. The class Typeable
is specifically designed for this, but it's a hammer that completely bypasses parametricity; I think most Haskellers would prefer to keep more type-based reasoning intact where possible.
1 Parametric polymorphism is in contrast to class-based polymorphism you may have seen in OO languages (where each class says how a method is implemented for objects of that specific class), or ad-hoc polymophism (as seen in C++) where you simply define multiple definitions with the same name but different types and the types at each application determine which definition is used. I'm not covering those in detail, but the key distinction is both of them allow the definition to have different code for each supported type, rather than guaranteeing the same code will process all supported types.
2 It's not 100% true that there's only one valid function with type forall a. a -> a
unless you hide some caveats in "valid". But if you don't use any unsafe features (like unsafeCoerce
or the foreign language interface), then a function with type forall a. a -> a
will either always throw an exception or it will return its argument unchanged.
The "always throws an exception" isn't terribly useful so we usually assume an unknown function with that type isn't going to do that, and thus ignore this possibility.
There are multiple ways to implement "returns its argument unchanged", like id x = head . head . head $ [[[x]]]
, but they can only differ from the normal id
in being slower by building up some structure around x
and then immediately tearing it down again. A caller that's only worrying about correctness (rather than performance) can treat them all the same.
Thus, ignoring the "always undefined" possibility and treating all of the dumb elaborations of id x = x
the same, we come to the perspective where we can say "there's only one function with forall a. a -> a
".
Upvotes: 4
Reputation: 116139
You need GADTs for that.
{-# LANGUAGE ScopedTypeVariables, TypeApplications, GADTs #-}
import Data.Typeable
import Data.Type.Equality
tweakId :: forall a. Typeable a => a -> a
tweakId x
| Just Refl <- eqT @a @Int = x + 1
-- etc. etc.
| otherwise = x
Here we use eqT @type1 @type2
to check whether the two types are equal. If they are, the result is Just Refl
and pattern matching on that Refl
is enough to convince the type checker that the two types are indeed equal, so we can use x + 1
since x
is now no longer only of type a
but also of type Int
.
This check requires runtime type information, which we usually do not have due to Haskell's type erasure property. The information is provided by the Typeable
type class.
This can also be achieved using a user-defined class like your TypeOf
if we make it provide a custom GADT value. This can work well if we want to encode some constraint like "type a
is either an Int
, a Bool
, or a String
" where we statically know what types to allow (we can even recursively define a set of allowed types in this way). However, to allow any type, including ones that have not yet been defined, we need something like Typeable
. That is also very convenient since any user-defined type is automatically made an instance of Typeable
.
Upvotes: 10
Reputation: 120711
This fail with error: "Couldn't match expected type ‘a’ with actual type ‘Bool’"
But I don't see any problems here
Well, what if I add this instance:
instance TypeOf Float where
typeOf = "Bool"
Do you see the problem now? Nothing prevents somebody from adding such an instance, no matter how silly it is. And so the compiler can't possibly make the assumption that having checked typeOf @a == "Bool"
is sufficient to actually use x
as being of type Bool
.
You can squelch the error if you are confident that nobody will add malicious instances, by using unsafe coercions.
import Unsafe.Coerce
tweakId :: forall a. TypeOf a => a -> a
tweakId x
| typeOf @a == "Bool" = unsafeCoerce (not $ unsafeCoerce x)
| typeOf @a == "Int" = unsafeCoerce (unsafeCoerce x+1 :: Int)
| typeOf @a == "Char" = unsafeCoerce (unsafeCoerce x :: Char)
| otherwise = x
but I would not recommend this. The correct way is to not use strings as a poor man's type representation, but instead the standard Typeable
class which is actually tamper-proof and comes with suitable GADTs so you don't need manual unsafe coercions. See chi's answer.
As an alternative, you could also use type-level strings and a functional dependency to make the unsafe coercions safe:
{-# LANGUAGE DataKinds, FunctionalDependencies
, ScopedTypeVariables, UnicodeSyntax, TypeApplications #-}
import GHC.TypeLits (KnownSymbol, symbolVal)
import Data.Proxy (Proxy(..))
import Unsafe.Coerce
class KnownSymbol t => TypeOf a t | a->t, t->a
instance TypeOf Bool "Bool"
instance TypeOf Int "Int"
tweakId :: ∀ a t . TypeOf a t => a -> a
tweakId x = case symbolVal @t Proxy of
"Bool" -> unsafeCoerce (not $ unsafeCoerce x)
"Int" -> unsafeCoerce (unsafeCoerce x + 1 :: Int)
_ -> x
The trick is that the fundep t->a
makes writing another instance like
instance TypeOf Float "Bool"
a compile error right there.
Of course, really the most sensible approach is probably to not bother with any kind of manual type equality at all, but simply use the class right away for the behaviour changes you need:
class Tweakable a where
tweak :: a -> a
instance Tweakable Bool where
tweak = not
instance Tweakable Int where
tweak = (+1)
instance Tweakable Char where
tweak = id
Upvotes: 8