Inconsistent results for Data.Set and custom Ord instance

Question

Here is my data structure

data Ex =
  P String
  | (:←) Ex

It has the property that p == ←←p. My custom Eq and Ord instances attempt to define the same. However, I am seeing inconsistent results for test3 (set created from [p,←←p, ←p]) and test4 (set created from [p, ←p, ←←p]). Results as shown below:

*Test> test3
fromList [←q,←←q]
*Test> test4
fromList [q,←q,←←q]

Note that test3 and test4 only differ in the order of the elements from which the set is created. Yet, the results differ.

I think the order of the set creation using Data.Set.fromList should not really matter. Can someone help me find the mistake with my Eq or Ord instance? Full code below, compiled with GHC 8.4.3.

module Test where    
import Data.Set as S


data Ex =
  P String
  | (:←) Ex

instance Show Ex where
  show (P s) = s
  show ((:←) e) = "←" ++ (show e)

instance Eq Ex where
  (P s1) == (P s2) = s1 == s2
  (:←) e1 == (:←) e2
    | e1 == e2 = True
    | otherwise = False
  e1 == (:←) e2
    | e1 == e2 = False
    | (:←) e1 == e2 = True
    | otherwise = False
  (:←) e1 == e2
    | e1 == e2 = False
    | e1 == (:←) e2 = True
    | otherwise = False

elength :: Ex   -> Int
elength (P s) = length s
elength ((:←) e) = elength e + 1

instance Ord Ex where
  compare e1 e2
    | e1 == e2 = EQ
    | otherwise = if (elength e1) <= (elength e2) then LT
      else GT

-- Check that ←q == ←←q                                                                                               
test2 = S.fromList [(:←) ((:←) (P "q")), P "q"]
-- output should be : {←←q, ←q}                                                                                       
test3 = S.fromList [P "q",  (:←) ((:←) (P "q")), (:←) (P "q")]  
-- output should be same as that of test3 : {←←q, ←q}                                                                 
test4 = S.fromList [P "q", (:←) (P "q"), (:←) ((:←) (P "q"))]

EDIT:

Note that if I modify the elength definition to handle the case, the inconsistency is gone.

elength ((:←) ((:←) e)) = elength e
Perhaps my elength metric and == definitions are at odds in the case of q and ←←q. I would still like to know where exactly they are going wrong

amalloy · Accepted Answer

Your Eq instance certainly looks strange to me. I would unravel the cancelled-out pairings two at a time, rather than piecemeal:

instance Eq Ex where
  (P s1) == (P s2)     = s1 == s2
  ((:←) e1) == (:←) e2 = e1 == e2
  e1 == (:←) ((:←) e2) = e1 == e2
  (:←) ((:←) e1) == e2 = e1 == e2
  _ == _ = False

Maybe this is equivalent to what you have written; it is rather hard to tell because your pattern-matching does not align well with your goals.

Your Ord instance is also a problem, because you do not define a consistent ordering. For any set of items x y z, where x < y && y < z, it should be the case that x < z. However, there are easy counterexamples according to your rules:

x = P "a"
y = (P "b" :←)
z = ((P "a" :←) :←)

Here x == z despite y being in between them.

One way to fix both problems is to write a simplify function that removes all pairs of cancelling-out constructors, and use that in both Eq and Ord instances. Simplify each argument, so that you know they each have either 0 or 1 levels of negation. From there, Eq is easy, and all you need to do before you can define Ord is to decide whether a negated value should be less than or greater than a non-negated value. You can't choose for them to be equal, because that again breaks transitivity.

If you do write simplify, it will be a lot of wasted work to call simplify every time you touch one of these objects, since you immediately throw away the simplification. I'd choose not to export the constructor for this type, and instead offer a smart constructor that simplifies before returning a value. Then consumers will know everything is negated once or not at all.

Inconsistent results for Data.Set and custom Ord instance

Answers (1)

Related Questions