orome
orome

Reputation: 48616

Why does Maybe include Just?

Thanks to some excellent answers here, I generally understand (clearly in a limited way) the purpose of Haskell's Maybe and that its definition is

data Maybe a = Nothing | Just a

however I'm not entity clear exactly why Just is a part of this definition. As near as I can tell, this is where Just itself is defined, but the the relevant documentation doesn't say much about it.

Am I correct is thinking that the primary benefit of using Just in the definition of Maybe, rather than simply

data Maybe a = Nothing | a

is that it allows for pattern matching to with Just _ and for useful functionality like isJust and fromJust?

Why is Maybe defined in the former way rather than the latter?

Upvotes: 26

Views: 1931

Answers (4)

Luis Casillas
Luis Casillas

Reputation: 30237

Another way to look at it (in addition to Tikhon's answer) is to consider another one of the basic Haskell types, Either, which is defined like this:

-- | A value that contains either an @a@ (the 'Left') constructor) or 
-- a @b@ (the 'Right' constructor).
data Either a b = Left a | Right b

This allows you to have values like these:

example1, example2 :: Either String Int
example1 = Left "Hello!"
example2 = Right 42

...but also like this one:

example3, example4 :: Either String String
example3 = Left "Hello!"
example4 = Right "Hello!"

The type Either String String, the first time you encounter it, sounds like "either a String or a String," and you might therefore think that it's the same as just String. But it isn't, because Haskell unions are tagged unions, and therefore an Either String String records not just a String, but also which of the "tags" (data constructors; in this case Left and Right) was used to construct it. So even though both alternatives carry a String as their payload, you're able to tell how any one value was originally built. This is good because there are lots of cases where the alternatives are the same type but the constructors/tags impart extra meaning:

data ResultMessage = FailureMessage String | SuccessMessage String

Here the data constructors are FailureMessage and SuccessMessage, and you can guess from the names that even though the payload in both cases is a String, they would mean very different things!

So bringing it back to Maybe/Just, what's happening here is that Haskell just uniformly works like that: every alternative of a union type has a distinct data constructor that must always be used to construct and pattern match values of its type. Even if at first you might think it would be possible to guess it from context, it just doesn't do it.

There are other reasons, a bit more technical. First, the rules for lazy evaluation are defined in terms of data constructors. The short version: lazy evaluation means that if Haskell is forced to peek inside of a value of type Maybe a, it will try to do the bare minimum amount of work needed to figure out whether it looks like Nothing or like Just x—preferably it won't peek inside the x when it does this.

Second: the language needs to be able distinguish types like Maybe a, Maybe (Maybe a) and Maybe (Maybe (Maybe a)). If you think about it, if we had a type definition that worked like you wrote:

data Maybe a = Nothing | a  -- NOT VALID HASKELL

...and we wanted to make a value of type Maybe (Maybe a), you wouldn't be able to tell these two values apart:

example5, example6 :: Maybe (Maybe a)
example5 = Nothing
example6 = Just Nothing

This might seem a bit silly at first, but imagine you have a map whose values are "nullable":

-- Map of persons to their favorite number.  If we know that some person
-- doesn't have a favorite number, we store `Nothing` as the value for
-- that person.
favoriteNumber :: Map Person (Maybe Int)

...and want to look up an entry:

Map.lookup :: Ord k => Map k v -> k -> Maybe v

So if we look up mary in the map we have:

Map.lookup favoriteNumber mary :: Maybe (Maybe Int)

And now the result Nothing means Mary's not in the map, while Just Nothing means Mary's in the map but she doesn't have a favorite number.

Upvotes: 19

chi
chi

Reputation: 116174

Maybe a is designed so to have one more value than the type a. In type theory, sometimes it is written as 1 + a (up to iso), which makes that fact even more evident.

As an experiment, consider the type Maybe (Maybe Bool). Here we have 1 + 1 + 2 values, namely:

Nothing
Just Nothing
Just (Just False)
Just (Just True)

If we were allowed to define

data Maybe a = Nothing | a

we would lose the distinction between the cases Just Nothing and Nothing, since there is no longer Just to make them apart. Indeed, Maybe (Maybe a) would collapse into Maybe a. This would be an inconvenient special case.

Upvotes: 6

Tikhon Jelvis
Tikhon Jelvis

Reputation: 68172

Haskell's algebraic data types are tagged unions. By design, when you combine two different types into another type, they have to have constructors to disambiguate them.

Your definition does not fit with how algebraic data types work.

data Maybe a = Nothing | a

There's no "tag" for a here. How would we tell an Maybe a apart from a normal, unwrapped a in your case?

Maybe has a Just constructor because it has to have a constructor by design.

Other languages do have union types which could work like what you imagine, but they would not be a good fit for Haskell. They play out differently in practice and tend to be somewhat error-prone.

There are some strong design reasons for preferring tagged unions to normal union types. They play well with type inference. Unions in real code often have a tag anyhow¹. And, from the point of view of elegance, tagged unions are a natural fit to the language because they are the dual of products (ie tuples and records). If you're curious, I wrote about this in a blog post introducing and motivating algebraic data types.

footnotes

¹ I've played with union types in two places: TypeScript and C. TypeScript compiles to JavaScript which is dynamically typed, meaning it keeps track of the type of a value at runtime—basically a tag.

C doesn't but, in practice, something like 90% of the uses of union types either have a tag or effectively emulate struct subtyping. One of my professors actually did an empirical study on how unions are used in real C code, but I don't remember what paper it was in off-hand.

Upvotes: 36

w.b
w.b

Reputation: 11238

Just is a constructor, a alone would be of type a, when Just a constructs a different type Maybe a.

Upvotes: 8

Related Questions