marcosh
marcosh

Reputation: 9008

Impose nesting limits on recursive data structure

Consider a recursive data structure like the following:

data Tree level
    = Leaf String
    | Node level [ Tree level ]

Now, if level is an instance of Ord, I would like to impose at the type level the following limitation on the data structure: a node must contain only Trees with a higher level.

You can safely assume that level is a simple sum type like

Level
= Level1
| Level2
...
| LevelN

but where N is not known a priori. In this case I would be able to have that all the subnodes of a node have a higher level.

For example

tree = Node Level1
    [ Node Level2 []
    , Node Level3 []   
    ]

should compile, while

tree = Node Level2
    [ Node Level1 []
    ]

should not.

Is it possible to model such a thing in Haskell?

Upvotes: 10

Views: 427

Answers (2)

oisdk
oisdk

Reputation: 10091

So there are a number of difficulties with this question. Peano numbers are a good place to start, though:

{-# LANGUAGE DataKinds                 #-}
{-# LANGUAGE GADTs                     #-}
{-# LANGUAGE KindSignatures            #-}
{-# LANGUAGE MultiParamTypeClasses     #-}
{-# LANGUAGE FlexibleInstances         #-}
{-# LANGUAGE TypeOperators             #-}
{-# LANGUAGE FlexibleContexts          #-}
{-# LANGUAGE ConstraintKinds           #-}

data Nat = Z | S Nat

Next, we'll need some way of saying one number is "bigger" than another. We can do so by first defining an inductive class for "n is less than or equal to m"

class (n :: Nat) <= (m :: Nat)
instance Z <= n
instance n <= m => (S n <= S m)

We can then define "less than" in terms of this:

type n < m = S n <= m

Finally, here's the Tree and Levels:

data Tree n where
  Leaf :: String -> Tree n
  Node :: n < z => Level z -> [Tree z] -> Tree n

data Level n where
  Level0 :: Level Z
  Level1 :: Level (S Z)
  Level2 :: Level (S (S Z))
  Level3 :: Level (S (S (S Z)))
  Level4 :: Level (S (S (S (S Z))))

And, as desired, the first example compiles:

tree = Node Level1
    [ Node Level2 []
    , Node Level3 []   
    ]

While the second does not:

tree = Node Level2
    [ Node Level1 []
    ]

Just for extra fun, we can now add a "custom type error" (this will need UndecidableInstances:

import GHC.TypeLits (TypeError, ErrorMessage(Text))

instance TypeError (Text "Nodes must contain trees of a higher level") => S n < Z

So when you write:

tree = Node Level2
    [ Node Level1 []
    ]

You get the following:

• Nodes must contain trees of a higher level

• In the expression: Node Level1 []

In the second argument of ‘Node’, namely ‘[Node Level1 []]’

In the expression: Node Level2 [Node Level1 []]

If you want to make "level" more generic, you'll need a couple more extensions:

{-# LANGUAGE TypeApplications, RankNTypes, AllowAmbiguousTypes, TypeFamilies #-}
import qualified GHC.TypeLits as Lits

data Level n where
  Level0 :: Level Z
  LevelS :: !(Level n) -> Level (S n)
    
class HasLevel n where level :: Level n
instance HasLevel Z where level = Level0
instance HasLevel n => HasLevel (S n) where level = LevelS level
    
type family ToPeano (n :: Lits.Nat) :: Nat where
  ToPeano 0 = Z
  ToPeano n = S (ToPeano (n Lits.- 1))
    
node :: forall q z n m. (ToPeano q ~ z, HasLevel z, n < z) => [Tree z] -> Tree n
node = Node level
    
tree =
    node @1
      [ node @2 []
      , node @3 []   
      ]
    

Upvotes: 9

Silvio Mayolo
Silvio Mayolo

Reputation: 70387

Here's the basic idea. The easiest way to encode recursion limits like this is to use Peano numbers. Let's define such a type.

data Number = Zero | Succ Number

A number is either zero or the successor of another number. This is a nice way to define numbers here, as it will get along nicely with our tree recursion. Now, we want the Level to be a type, not a value. If it's a value, we can't limit its value at the type level. So we use GADTs to restrict the way we can initialize things.

data Tree (lvl :: Number) where
    Leaf :: String -> Tree lvl
    Node :: [Tree lvl] -> Tree ('Succ lvl)

lvl is the depth. A Leaf node can have any depth, but a Node node is restricted in its depth and must be strictly greater than that of its children (here, strictly one greater, which works in most simple cases. Allowing it to be strictly greater in general would require some more complicated type-level tricks, possibly with -XTypeInType). Notice that we use 'Succ at the type level. This is a promoted type, enabled with -XDataKinds. We also need -XKindSignatures to enable the :: Number constraint.

Now let's write a function.

f :: Tree ('Succ 'Zero) -> String
f _ = "It works!"

This function only takes trees that go at most one level deep. We can try to call it.

f (Leaf "A") -- It works!
f (Node [Leaf "A"]) -- It works!
f (Node [Node [Leaf "A"]]) -- Type error

So it will fail at compile-time if the depth is too much.

Complete example (including compiler extensions):

{-# LANGUAGE GADTs, KindSignatures, DataKinds #-}

data Number = Zero | Succ Number

data Tree (lvl :: Number) where
    Leaf :: String -> Tree lvl
    Node :: [Tree lvl] -> Tree ('Succ lvl)

f :: Tree ('Succ 'Zero) -> String
f _ = "It works!"

This isn't everything you can do with this. There's certainly expansions to be made, but it gets the point across and will hopefully point you in the right direction.

Upvotes: 10

Related Questions