Reputation: 8427
When I check the types of functions in Haskell with :t, for example those from my previous question, I tend to get results such as:
Eq a => a -> [a] -> Bool
(Ord a, Num a, Ord a1, Num a1) => a -> a1 -> a
(Num t2, Num t1, Num t, Enum t2, Enum t1, Enum t) => [(t, t1, t2)]
It seems that this is not such a trivial question: how does the Haskell interpreter pick names for type variables? When would it choose a rather than t? When would it choose a1 rather than b? Is it important from the programmer's point of view?
Upvotes: 6
Views: 930
Reputation: 101989
The names of the type variables aren't significant. The type:
Eq element => element -> [element] -> Bool
is exactly the same as:
Eq a => a -> [a] -> Bool
Some names are simply easier to read/remember.
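For instance, here is a minimal sketch (the names memberA and memberElement are made up) showing that GHC accepts one spelling where the other is expected, precisely because the two types are alpha-equivalent:
memberA :: Eq a => a -> [a] -> Bool
memberA x xs = x `elem` xs

memberElement :: Eq element => element -> [element] -> Bool
memberElement = memberA  -- accepted: only the variable names differ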
Now, how can an inferencer choose the best names for types?
Disclaimer: I'm absolutely not a GHC developer. However, I'm working on a type inferencer for Haskell for my bachelor thesis.
During inference the names chosen for the variables probably aren't that readable. In fact they are almost surely something along the lines of _N or aN, with N a number.
This is because you often have to "refresh" type variables in order to complete inference, so you need a fast way to create new names, and numbered variables are straightforward to generate.
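As a hedged illustration (not GHC's actual implementation), such a fresh-name supply can be as simple as a counter threaded through the inferencer with a State monad:
import Control.Monad.State (State, get, put)

-- Sketch: mint cheap machine names _0, _1, _2, ...
type Supply = State Int

freshVar :: Supply String
freshVar = do
  n <- get
  put (n + 1)
  return ('_' : show n)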
The names displayed once inference is complete can be "pretty printed": the inferencer can rename the variables to use a, b, c and so on instead of _1, _2, etc.
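A hypothetical sketch of such a renaming pass: collect the fresh variables in order of first occurrence and pair them with an infinite supply of pretty names:
-- Pretty names: a, b, ..., z, a1, b1, ..., z1, a2, ...
prettyNames :: [String]
prettyNames = [ [c] | c <- ['a'..'z'] ]
           ++ [ c : show n | n <- [1 :: Int ..], c <- ['a'..'z'] ]

-- e.g. renaming ["_1","_2"] == [("_1","a"), ("_2","b")]
renaming :: [String] -> [(String, String)]
renaming fresh = zip fresh prettyNames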
The trick is that most operations have explicit type signatures, and some definitions require quantifying type variables (class, data and instance declarations, for example).
All the names that the user explicitly provides can be used to display the type in a better way.
During inference you can keep track of where each fresh type variable came from, so that when displaying the type to the user you can rename it to something more sensible.
Another option is to refresh variables by adding a number to them. For example, a fresh type for return could be Monad m0 => a0 -> m0 a0 (here we know to use m and a simply because the class definition for Monad uses those names). When inference is finished you can drop the numbers and obtain the pretty names.
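For reference, here is the (abridged, pre-Applicative-Monad) Prelude class head that supplies those names:
class Monad m where
  return :: a -> m a
  (>>=)  :: m a -> (a -> m b) -> m b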
In general the inferencer will try to use names that were explicitly provided through signatures. If such a name is already in use, it might add a number instead of picking a different name (e.g. use b1 instead of c if b is already bound).
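A hypothetical sketch of that collision rule: keep a user-provided name when it is free, and suffix a number otherwise:
pickName :: [String] -> String -> String
pickName taken v
  | v `notElem` taken = v
  | otherwise = head [ v' | n <- [1 :: Int ..]
                          , let v' = v ++ show n
                          , v' `notElem` taken ]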
There are probably other ad hoc rules as well. For example, the fact that tuple elements get names like t, t1, t2, t3, etc. is probably a custom rule; after all, t doesn't appear in the signature for (,,).
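Indeed, GHCi reports the tuple constructor with a, b and c, so the t family must come from a different rule:
Prelude> :t (,,)
(,,) :: a -> b -> c -> (a, b, c)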
Upvotes: 8
Reputation: 48611
The question "How does GHCi pick names for type variables?" explains how many of these variable names come about. As Ganesh Sittampalam pointed out in a comment, something strange seems to be happening with arithmetic sequences. Both the Haskell 98 report and the Haskell 2010 report indicate that
[e1..] = enumFrom e1
GHCi, however, gives the following:
Prelude> :t [undefined..]
[undefined..] :: Enum t => [t]
Prelude> :t enumFrom undefined
enumFrom undefined :: Enum a => [a]
This makes it clear that the weird behavior has nothing to do with the Enum class itself, but rather comes in at some stage of translating the syntactic sequence to the enumFrom form. I wondered if maybe GHC wasn't really using that translation, but it really is:
{-# LANGUAGE NoMonomorphismRestriction #-}
module X (aoeu,htns) where
-- the sugared sequence and its hand-desugared equivalent
aoeu = [undefined..]
htns = enumFrom undefined
compiled using ghc -ddump-simpl enumlit.hs gives
X.htns :: forall a_aiD. GHC.Enum.Enum a_aiD => [a_aiD]
[GblId, Arity=1]
X.htns =
  \ (@ a_aiG) ($dEnum_aiH :: GHC.Enum.Enum a_aiG) ->
    GHC.Enum.enumFrom @ a_aiG $dEnum_aiH (GHC.Err.undefined @ a_aiG)

X.aoeu :: forall t_aiS. GHC.Enum.Enum t_aiS => [t_aiS]
[GblId, Arity=1]
X.aoeu =
  \ (@ t_aiV) ($dEnum_aiW :: GHC.Enum.Enum t_aiV) ->
    GHC.Enum.enumFrom @ t_aiV $dEnum_aiW (GHC.Err.undefined @ t_aiV)
so the only difference between these two representations is the assigned type variable name. I don't know enough about how GHC works to know where that t comes from, but at least I've narrowed it down!
Ørjan Johansen has noted in a comment that something similar seems to happen with function definitions and lambda abstractions.
Prelude> :t \x -> x
\x -> x :: t -> t
but
Prelude> :t map (\x->x) $ undefined
map (\x->x) $ undefined :: [b]
In the latter case, the type b comes from the explicit type signature given to map.
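That b is the second type variable in map's Prelude signature, whose result type [b] is what the whole expression reports:
map :: (a -> b) -> [a] -> [b]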
Upvotes: 3
Reputation: 49105
Are you familiar with the concepts of alpha equivalence and alpha conversion? These capture the notion that, for example, both of the following are completely equivalent and interconvertible (in certain circumstances) even though they differ:
\x -> (x, x)
\y -> (y, y)
The same concept can be extended to the level of types and type variables (see "System F" for further reading). Haskell in fact has a notion of "lambdas at the type level" for binding type variables, but it's hard to see because they're implicit by default. However, you can make them explicit with the ExplicitForAll extension and play around with explicitly binding your type variables:
ghci> :set -XExplicitForAll
ghci> let f x = x; f :: forall a. a -> a
In the second line, I use the forall keyword to introduce a new type variable, which is then used in the type signature.
In other words, it doesn't matter whether you choose a or t in your example, as long as the type expressions are alpha-equivalent. Choosing type variable names so as to maximize human convenience is an entirely different topic, and probably far more complicated!
Upvotes: 0