Cannot parse data constructor in a data/newtype declaration

Question

I have the type Card, which consists of a suit and a rank:

data Suit = A|B deriving (Show, Eq)
data Rank = 1|2 deriving (Show, Eq)
data Card = Card Suit Rank deriving (Show, Eq)

Something seems to be wrong with the data Rank declaration, since a number cannot be used as a data constructor. How do I write a correct declaration if my cards are A1|B1|A2|B2?

Thank you

K. A. Buhr · Accepted Answer

It might look as if the statement:

data Suit = A | B

is only defining one thing, the type Suit, as a collection/set of arbitrary objects. Actually, though, it's defining three things: the type Suit and two constructors A and B for creating values of that type.
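As a small sketch of what that means in practice, the snippet below uses A and B both as ordinary values and as patterns. The names firstSuit and otherSuit are hypothetical helpers made up for illustration; they are not part of the question.

```haskell
-- One declaration introduces three names:
-- the type Suit and the constructors A and B.
data Suit = A | B deriving (Show, Eq)

-- A constructor is an ordinary value of the new type.
firstSuit :: Suit
firstSuit = A

-- Constructors can also appear in patterns.
-- (otherSuit is a hypothetical helper for this sketch.)
otherSuit :: Suit -> Suit
otherSuit A = B
otherSuit B = A
```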

If the definition:

data Rank = 1 | 2

actually worked, it wouldn't be defining Rank as a collection of the numbers 1 and 2, it would be redefining the numbers 1 and 2 as constructors/values of the new type Rank, and you'd no longer be able to use them as regular numbers. (For example, the expression n + 1 would now be a type error, because (+) expects a number, and 1 would have been redefined as a Rank).

Fortunately or unfortunately, Haskell won't accept numbers as constructor names -- they need to be valid identifiers starting with uppercase letters (or operators that start with a colon).
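To illustrate both naming rules, here is a rough sketch with one alphanumeric and one symbolic constructor. The operator name :& is an arbitrary choice for this example, not something from the question.

```haskell
data Suit = A | B deriving (Show, Eq)

-- Alphanumeric constructors must begin with an uppercase letter.
data Rank = R1 | R2 deriving (Show, Eq)

-- Symbolic constructors must begin with a colon and are written infix.
-- (:& is a hypothetical name chosen for this sketch.)
data Card = Suit :& Rank deriving (Show, Eq)

-- The "A1" card, built with the infix constructor.
cardA1 :: Card
cardA1 = A :& R1
```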

So, there are two usual approaches to defining a type like Rank that's meant to represent some subset of numbers. The first, as noted in the comments, is to define it much like you already have, but change your numbers into valid identifiers by prefixing with an uppercase letter:

data Rank = R1 | R2

The advantage of this is that it guarantees that only valid ranks can be represented. Here, only ranks 1 and 2 are allowed. If someone tried to write R3 somewhere, it wouldn't work, because that constructor hasn't been defined. The big disadvantage is that this quickly becomes unruly. If these were playing cards, the definition would be:

data Rank = R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8 | R9 | R10 | R11 | R12 | R13

and a function to, say, assign point values to cards for rummy would look like:

points :: Rank -> Int
points R1 = 10   -- Ace worth 10
points R2 = 2
points R3 = 3
...
points R9 = 9
points R10 = 10  -- 10 and face cards all worth 10
points R11 = 10
points R12 = 10
points R13 = 10

(In real code, you'd use more advanced Haskell features like a derived Enum instance to deal with this.)
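As a minimal sketch of what that could look like: deriving Enum lets fromEnum turn a constructor into its position in the declaration (counting from 0), so points no longer needs one equation per rank. The helper names rankNumber and allRanks are assumptions made up for this example.

```haskell
data Rank = R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8 | R9 | R10 | R11 | R12 | R13
  deriving (Show, Eq, Ord, Enum, Bounded)

-- fromEnum numbers the constructors from 0 in declaration order,
-- so add 1 to recover the rank's number.
-- (rankNumber is a hypothetical helper for this sketch.)
rankNumber :: Rank -> Int
rankNumber r = fromEnum r + 1

points :: Rank -> Int
points R1 = 10                     -- Ace worth 10
points r  = min 10 (rankNumber r)  -- 2-9 worth their rank, 10 and faces worth 10

-- Every rank, from R1 to R13, via the derived Bounded and Enum instances.
allRanks :: [Rank]
allRanks = [minBound .. maxBound]
```

The guarantee from the first approach is kept here: R14 still doesn't exist, so invalid ranks still can't be written.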

The second approach is to define your rank in terms of an existing numeric type:

data Rank = Rank Int   -- actually, `data` would probably be `newtype`

This defines two things, a type named Rank, and a constructor, also named Rank. (This is okay, as types and constructors live in different namespaces.)

Instead of defining Rank as a discrete set of values with one explicit constructor per value, this definition essentially makes the type Rank an Int that's "tagged" with the constructor Rank.

The disadvantage of this approach is that it's now possible to create invalid ranks (since someone can write Rank 14). The advantage is that it's often easier to work with. For example, you can extract the integer from the rank, so points can be defined as:

points :: Rank -> Int
points (Rank 1)             = 10  -- Ace is worth 10
points (Rank r) | r >= 10   = 10  -- 10 and face are worth 10
                | otherwise = r   -- rest are worth their rank
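One common way to soften the invalid-rank problem mentioned above is a checking function, often called a "smart constructor", that only builds a Rank from an in-range number. The sketch below is one possible version; the name mkRank and the range 1 to 13 (standard playing cards) are assumptions for illustration.

```haskell
newtype Rank = Rank Int deriving (Show, Eq)

-- mkRank is a hypothetical helper; the 1..13 range assumes playing cards.
-- It returns Nothing instead of building an out-of-range rank.
mkRank :: Int -> Maybe Rank
mkRank n
  | n >= 1 && n <= 13 = Just (Rank n)
  | otherwise         = Nothing
```

This only helps if callers actually go through mkRank. Typically that's enforced by exporting the Rank type and mkRank from a module while leaving the Rank constructor out of the export list.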

Note that, with this set of definitions:

data Suit = A | B deriving (Show, Eq)
newtype Rank = Rank Int deriving (Show, Eq)
data Card = Card Suit Rank deriving (Show, Eq)

you'd construct a Card value using an expression like Card A (Rank 1) for your "A1" card.
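For example, the four cards from the question (A1, B1, A2, B2) could be built in one go with a list comprehension. The name deck is just a placeholder for this sketch.

```haskell
data Suit = A | B deriving (Show, Eq)
newtype Rank = Rank Int deriving (Show, Eq)
data Card = Card Suit Rank deriving (Show, Eq)

-- The four cards from the question, in the order A1, B1, A2, B2.
-- (deck is a hypothetical name for this sketch.)
deck :: [Card]
deck = [Card s (Rank r) | r <- [1, 2], s <- [A, B]]
```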

There's actually a third approach. Some people might skip defining the Rank type entirely and either write:

data Suit = A | B deriving (Show, Eq)
data Card = Card Suit Int deriving (Show, Eq)

or write the equivalent code using a type alias:

data Suit = A | B deriving (Show, Eq)
type Rank = Int
data Card = Card Suit Rank deriving (Show, Eq)

Note that the type alias here is really just for documentation. Here, Rank and Int are exactly the same type and can be used interchangeably. Using Rank just makes the code easier to understand by making it clear where the programmer intended an Int to stand for a Rank versus an integer used for some other purpose.

The main advantage of this approach is that you can avoid including the word Rank in lots of places (e.g., cards are written Card A 1 instead of Card A (Rank 1), and the definition of points wouldn't need to pattern match its argument on Rank r). The main disadvantage is that it blurs the distinction between a Rank and other integers, which makes it easier to make programming errors like using a rank where you meant to use its points, or vice versa.
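Here's a rough sketch of that trade-off. With the alias, the last definition compiles even though it passes a point total back in as if it were a rank. With the newtype version, the same mistake would be rejected by the type checker. The name confused is made up for this example.

```haskell
data Suit = A | B deriving (Show, Eq)
type Rank = Int
data Card = Card Suit Rank deriving (Show, Eq)

-- No constructor to unwrap: the argument is already a plain Int.
points :: Rank -> Int
points 1 = 10                -- Ace is worth 10
points r | r >= 10   = 10    -- 10 and face are worth 10
         | otherwise = r     -- rest are worth their rank

-- Compiles, although it treats a point total as a rank,
-- because Rank and Int are the same type.
-- (confused is a hypothetical name for this sketch.)
confused :: Int
confused = points (points 1)
```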
