Zinedine
Zinedine

Reputation: 111

Why doesn't Haskell's Data.Set support infinite sets?

In the following snippet:

import qualified Data.Set as Set

data Nat = Zero | Succ Nat deriving (Eq, Show, Ord)

instance Enum Nat where
  pred (Succ x)     = x
  succ x            = Succ x
  toEnum 0          = Zero
  toEnum x          = Succ (toEnum (x-1))
  fromEnum Zero     = 0
  fromEnum (Succ x) = 1 + (fromEnum x)

nats :: [Nat]
nats = [Zero ..]

natSet :: Set.Set Nat
natSet = Set.fromList nats

Why does:

but

Upvotes: 11

Views: 1945

Answers (5)

KarlsFriend
KarlsFriend

Reputation: 745

I felt the same way so I added a set that works with infinite Lists. However they need to be sorted, so my algorithm knowns when to stop looking for more.

Prelude> import Data.Set.Lazy
Prelude Data.Set.Lazy> natset = fromList [1..]
Prelude Data.Set.Lazy> 100 `member` natset
True
Prelude Data.Set.Lazy> (-10) `member` natset
False

Its on hackage. http://hackage.haskell.org/package/lazyset-0.1.0.0/docs/Data-Set-Lazy.html

Upvotes: 0

rampion
rampion

Reputation: 89093

Let's see what happens when we adapt GHC's Set code to accommodate infinite sets:

module InfSet where

data InfSet a = Bin a (InfSet a) (InfSet a) 

-- create an infinite set by unfolding a value
ofUnfold :: (x -> (x, a, x)) -> x -> InfSet a
ofUnfold f x = 
  let (lx,a,rx) = f x
      l = ofUnfold f lx
      r = ofUnfold f rx
  in Bin a l r

-- check for membership in the infinite set
member :: Ord a => a -> InfSet a -> Bool
member x (Bin y l r) = case compare x y of
                         LT -> member x l
                         GT -> member x r
                         EQ -> True       

-- construct an infinite set representing a range of numbers
range :: Fractional a => (a, a) -> InfSet a
range = ofUnfold $ \(lo,hi) -> 
          let mid = (hi+lo) / 2
          in ( (lo, mid), mid, (mid, hi) )

Note how, instead of constructing the infinite set from an infinite list, I instead define a function ofUnfold to unfold a single value into an infinite list. It allows us to construct both branches lazily in parallel (we don't need to finish one branch before constructing another).

Let's give it a whirl:

ghci> :l InfSet
[1 of 1] Compiling InfSet           ( InfSet.hs, interpreted )
Ok, modules loaded: InfSet.
ghci> let r = range (0,128)
ghci> member 64 r
True
ghci> member 63 r
True
ghci> member 62 r
True
ghci> member (1/2) r
True
ghci> member (3/4) r
True

Well, that seems to work. What if we try a value outside of the Set?

ghci> member 129 r
^CInterrupted.

That will just run and run and never quit. There's no stopping branches in the inifinite set, so the search never quits. We could check the original range somehow, but that's not practical for infinite sets of discrete elements:

ghci> let ex = ofUnfold (\f -> ( f . (LT:), f [EQ], f . (GT:) )) id
ghci> :t ex
ex :: InfSet [Ordering]
ghci> member [EQ] ex
True
ghci> member [LT,EQ] ex
True
ghci> member [EQ,LT] ex
^CInterrupted.

So infinite sets are possible but I'm not sure they're useful.

Upvotes: 1

Dan Burton
Dan Burton

Reputation: 53705

The existing answers are sufficient, but I want to expound a little bit on the behavior of Sets.

Looks like you are hoping for a lazy set of all Nats; you take the infinite list of all Nats and use Set.toList on it. That would be nice; mathematicians often talk in terms of the "set of all natural numbers". The problem is the implementation of Set is not as accommodating of laziness as lists are.

The implementation of Set is based on size balanced binary trees (or trees of bounded balance)

The docs for Data.Set

Suppose you wish to lazily construct a binary tree from a list. Elements from the list would only be inserted into the tree when a deeper traversal of the tree was necessary. So then you ask if 100 is in the tree. It would go along adding numbers 1-99 to the tree, one at a time. Then it would finally add 100 to the tree, and discover that 100 is indeed an element in the tree. But notice what we did. We just performed an in-order traversal of the lazy list! So the first time, our imaginary LazyTree.contains would have roughly the same complexity as List.find (assuming an ammortized O(1) insert, which is a bad assumption for a simple binary tree, which would have O(log n) complexity). And without balancing, our tree would be very lopsided (we added the numbers 1 through 100 in order, so it would just be a big linked list down the right child of each branch). But with tree balancing during the traversal, it would be hard to know where to begin the traversal again; at least it certainly wouldn't be immediately intuitive.

tl;dr: nobody (afaik) has made a good lazy Set yet. So infinite Sets are more easily represented as infinite lists, for now.

Upvotes: 15

Sjoerd Visscher
Sjoerd Visscher

Reputation: 12000

Set.fromList is not lazy, so it will not end if you pass it an infinite list. But natSet is not constructed until it is needed, so you only notice it when you run Set.member on it.

For example, even Set.null $ Set.fromList [0..] does not terminate.

Upvotes: 8

sepp2k
sepp2k

Reputation: 370292

You can't have infinite sets. This doesn't just affect Set.member, whenever you do anything which will cause natSet to be evaluated even one step (even Set.null), it will go into an infinite loop.

Upvotes: 4

Related Questions