Z-Y.L
Z-Y.L

Reputation: 1769

Are there any efficient data structure to represent Set except binary searching tree

As we know, in Haskell, a set of elements is represented by a binary searching tree in Data.Set module, which is efficient. However, most operations require the elememt to be an instance of the Ord class.

However, a general set doesn't need its elememt to be an instance of the Ord. Since a set is where there is no duplicate elements, so its element being an instance of the Eq class is just enough.

In Haskell, I can only think of the implementation by a single linked list, just like the default [a], but the single linked list are not as efficient as a BST, while a BST needs the Ord class.

By the way, in Python, the set class doesn't needs its element to be orderable. Only __eq__ and __ne__ (which is silimar to Haskell's Eq class) defined is enough, e.g.:

class Fruit:
    def __init__(self, name):
        self.name = name.title()

    def __repr__(self):
        return self.name

    def __eq__(self, other):
        return self.name == other.name     # defines the equality operation

    def __ne__(self, other):
        return self.name != other.name     # defines the unequality operation

    def __hash__(self):
        return hash(self.name)     # so that Fruit instance can be an element of a set

?: x = Fruit('apple')

?: y = Fruit('apple')

?: z = Fruit('Apple')

?: {x,y,z}

{Apple}

?: x <= y

Traceback (most recent call last):

File "", line 1, in module

x <= y

TypeError: '<=' not supported between instances of 'Fruit' and 'Fruit'

So I wonder whether there are some efficient data structure in Haskell that could be used to represent the Set but does not need the Ord class.

Upvotes: 1

Views: 117

Answers (1)

John F. Miller
John F. Miller

Reputation: 27217

Python is cheating. All values in Python are hashable and set() is just a dictionary with no value.

Python can get away with this as a side effect of being dynamically typed language. The same machinery that lets Python keep track of the types of it variables also makes it easy to impose an psudo-ordering on them.

Haskell as a statically typed language espically a statically typed language with first class functions cannot invent an arbitrary ordering for a data type. Therefore Data.Set requires Ord to be efficient. Without ordering adding a new value to a set becomes O(n).

Upvotes: 3

Related Questions