thor

Reputation: 22460

Difference between variable definition in a Haskell source file and in GHCi?

In a Haskell source file, I can write

a = 1

and I had the impression that I have to write the same in GHCi as

let a = 1

since a = 1 in GHCi gives a parse error on =.

Now, if I write

a = 1
a = 2

in a source file, I will get an error about Multiple declarations of a, but in GHCi it is OK to write:

let a = 1
let a = 2

Can someone help clarify the difference between the two styles?

Upvotes: 0

Views: 288

Answers (2)

Ben

Reputation: 71400

There is a key difference in Haskell between having two definitions of the same name in the same scope and having two definitions of the same name in nested scopes. GHCi vs modules in a file isn't really related to the underlying concept here, but those two situations are where you tend to run into it if you're not familiar with it.

A let-expression (and a let-statement in a do block) creates a set of bindings with the same scope, not just a single binding. For example, as an expression:

let a = True
    a = False
in  a

Or with braces and semicolons (more convenient to paste into GHCi without turning on multi-line mode):

let { a = True; a = False} in a

This will fail, whether in a module or in GHCi. There cannot be a single variable a that is both True and False, and there can't be two separate variables named a in the same scope (or it would be impossible to know which one was being referred to by the source text a).

The variables in a single binding set are all defined "at once"; the order they're written in is not relevant at all. You can see this because it's possible to define mutually recursive bindings that all refer to each other, and so couldn't possibly be defined one at a time in any order:

λ let a = True : b
|     b = False : a
| in  take 10 a
[True,False,True,False,True,False,True,False,True,False]
it :: [Bool]

Here I've defined an infinite list of alternating True and False, and used it to come up with a finite result.

A Haskell module is a single scope, containing all the definitions in the file. Exactly as in a let-expression with multiple bindings, all the definitions "happen at once"1; they're only in a particular order because writing them down in a file inevitably introduces an order. So in a module this:

a = True
a = False

gives you an error, as you've seen.
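By contrast, here is a minimal sketch of a module that does compile, reusing the mutually recursive lists from the let example. It is meant to show that written order doesn't matter at the top level either: a refers to b before b appears in the file. The name firstTen is introduced here purely for illustration.

```haskell
-- All top-level bindings share one scope (the whole module),
-- so 'a' may refer to 'b' even though 'b' is written later.
a :: [Bool]
a = True : b

b :: [Bool]
b = False : a

-- 'firstTen' is a hypothetical name, used only for this sketch.
firstTen :: [Bool]
firstTen = take 10 a
```

Loading this file in GHCi and evaluating firstTen should give the same alternating list as the let-expression version.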

In a do-block you have let-statements rather than let-expressions.2 These don't have an in part, since they simply scope over the entire rest of the do-block.3 Entering commands in GHCi is very much like entering statements in an IO do-block, so you have the same option there, and that's what you're using in your example.

However your example has two let-bindings, not one. So there are two separate variables named a defined in two separate scopes.

Haskell doesn't care (almost ever) about the written order of different definitions, but it does care about the "nesting order" of nested scopes; the rule is that when you refer to a variable a, you get the inner-most definition of a whose scope contains the reference.4

As an aside, hiding an outer-scope name by reusing a name in an inner scope is known as shadowing (we say the inner definition shadows the outer one). It's a useful general programming term to know, since the concept comes up in many languages.
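As a rough sketch of shadowing in nested scopes, the following shows two let-statements in a do-block next to approximately what they mean as nested let-expressions. The names shadowedDo and shadowedLet are made up for this illustration, and GHC may warn about the unused and shadowed bindings if warnings are enabled.

```haskell
-- Two let-statements: each one opens a new scope over the rest of the block,
-- so the second 'a' shadows the first rather than clashing with it.
shadowedDo :: IO Int
shadowedDo = do
  let a = 1
  let a = 2
  return a

-- Roughly the same structure written as nested let-expressions.
-- The reference to 'a' resolves to the inner-most definition.
shadowedLet :: Int
shadowedLet =
  let a = 1 in
  let a = 2 in
  a
```

Both versions yield 2, because the inner-most a is the one whose scope contains the reference.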

So it's not that the rules about when you can define a name twice are different in GHCi vs a module; it's just that the different context makes different things easier.

If you want to put a bunch of definitions in a module, the easy thing to do is make them all top-level definitions, which all have the same scope (the whole module) and so you get an error if you use the same name twice. You have to work a bit more to nest the definitions.
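One way to do that extra work, sketched under the assumption that a where clause is an acceptable way to nest: the inner a below lives in a scope nested inside the module scope, so it shadows the top-level a instead of clashing with it. The names inner and outer are hypothetical.

```haskell
-- Top-level 'a': its scope is the whole module.
a :: Int
a = 1

-- The 'where' clause opens a nested scope, so this second 'a'
-- shadows the top-level one, but only inside 'inner'.
inner :: Int
inner = a
  where
    a = 2

-- Outside that nested scope, 'a' still means the top-level definition.
outer :: Int
outer = a
```

Here inner evaluates to 2 and outer to 1, and the module compiles without a multiple-declarations error.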

In GHCi you're entering commands one at a time, and it's more work to use multi-line commands or braces-and-semicolons style, so the easy thing when you want to enter several definitions is to use several let-statements, and so you end up shadowing earlier definitions if you reuse names.5 You have to make a deliberate effort to enter multiple names in the same scope.


1 Or more accurately the bindings "just are" without any notion of "the time at which they happen" at all.

2 Or rather: you have let-statements as well as let-expressions, since statements are mostly made up of expressions and a let-expression is always valid as an expression.

3 You can see this as a general rule that later statements in a do-block are conceptually nested inside all earlier statements, since that's what they mean when you translate them to monadic operations; indeed let-statements are actually translated to let-expressions with the rest of the do-block inside the in part.

4 It's not ambiguous like two variables with the same name in the same scope would be, though it is impossible to refer to any further-out definitions.

5 And note that anything you've previously defined referring to the name before the shadowing will still behave exactly as it did before, referring to the previous name. This includes functions that return the value of the variable. It's easiest to understand shadowing as introducing a different variable that happens to have the same name as an earlier one, rather than trying to understand it as actually changing what the earlier variable name refers to.
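The point of footnote 5 can be sketched with nested let-expressions, which stand in here for successive GHCi let-statements. The names demo and getA are invented for the example.

```haskell
-- 'getA' is defined while only the first 'a' is in scope,
-- so it keeps referring to that 'a' even after a new 'a' shadows it.
demo :: (Int, Int)
demo =
  let a = 1 in
  let getA () = a in   -- refers to the first 'a'
  let a = 2 in         -- a different variable that happens to share the name
  (getA (), a)
```

Evaluating demo gives (1,2): the function still sees the earlier variable, while a bare reference to a sees the newer one.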

Upvotes: 0

chepner

Reputation: 530803

Successive let "statements" in the interactive interpreter are really the equivalent of nested let expressions. They behave as if there is an implied in following the assignment, and the rest of the interpreter session comprises the body of the let. That is

>>> let a = 1
>>> let a = 2
>>> print a

is the same as

let a = 1 in
let a = 2 in
print a

Upvotes: 4
