WestCoastProjects

Reputation: 63042

sortWith using basic character comparison fails for some strings

I want to apply a custom sorting comparator as follows:

 myString.sortWith{ case (c1,c2) => c1.compareTo(c2) <= 0}

This should sort the characters of the string by their codepoint values.

However, it does not always work. Consider a simple string, cab312:

val str = "cab312"

str.sortWith{ case (c1,c2) => c1.compareTo(c2) <= 0}
res0: String = 123abc

That works fine. Consider a more complex string:

scala> val str = "TOADS POOLS hoppin good service & repair"
str: String = TOADS POOLS hoppin good service & repair

scala> str.sortWith{ case (c1,c2) => c1.compareTo(c2) <= 0}
java.lang.IllegalArgumentException: Comparison method violates its general contract!
  at java.util.TimSort.mergeHi(TimSort.java:899)
  at java.util.TimSort.mergeAt(TimSort.java:516)
  at java.util.TimSort.mergeCollapse(TimSort.java:441)
  at java.util.TimSort.sort(TimSort.java:245)
  at java.util.Arrays.sort(Arrays.java:1438)
  at scala.collection.SeqLike$class.sorted(SeqLike.scala:648)
  at scala.collection.immutable.StringOps.sorted(StringOps.scala:29)
  at scala.collection.SeqLike$class.sortWith(SeqLike.scala:601)
  at scala.collection.immutable.StringOps.sortWith(StringOps.scala:29)
  ... 32 elided

So we get a java.lang.IllegalArgumentException: Comparison method violates its general contract!

What is going on here? How can the same comparator both succeed and fail? Is this a bug in TimSort, and in any case is there a workaround?

Upvotes: 2

Views: 371

Answers (1)

Andrey Tyukin

Reputation: 44918

The documentation states unambiguously that sortWith expects a single function lt that returns true if and only if the first operand precedes (is strictly less than) the second operand:

lt the comparison function which tests whether its first argument precedes its second argument in the desired ordering.

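Your c1.compareTo(c2) <= 0 returns true for elements that are equal, and therefore violates the contract of lt. The short string cab312 contains no repeated characters, so <= and < behave identically on it and the violation goes unnoticed; the longer string contains many duplicates. Changing <= to < eliminates the issue: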

str.sortWith{ case (c1,c2) => c1.compareTo(c2) < 0}
//"      &ADLOOOPSSTacdeeeghiiinoooppprrrsv"
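As a rough sketch of the point above, the snippet below checks both comparators on a pair of equal characters and then sorts with the strict one. The object name SortCharsSketch and the helper names lte and lt are made up for illustration; str.sorted is shown only as a possible alternative that relies on the implicit Ordering[Char] rather than a hand-written comparator.

```scala
object SortCharsSketch {
  val str = "TOADS POOLS hoppin good service & repair"

  // Non-strict comparator from the question: true even when both chars are equal.
  val lte: (Char, Char) => Boolean = (c1, c2) => c1.compareTo(c2) <= 0

  // Strict comparator: false when both chars are equal, as lt requires.
  val lt: (Char, Char) => Boolean = (c1, c2) => c1 < c2

  // Sorting with the strict comparator.
  val viaSortWith: String = str.sortWith(lt)

  // Alternative: let the implicit Ordering[Char] do the comparison.
  val viaSorted: String = str.sorted

  def main(args: Array[String]): Unit = {
    println(lte('o', 'o'))            // true: 'o' claims to precede itself
    println(lt('o', 'o'))             // false
    println(viaSortWith == viaSorted) // true
  }
}
```

If no custom ordering is needed, sorted is probably the simpler choice, since there is no comparator to get wrong.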

Upvotes: 3
