Reputation: 61
In Java8 I have an algorithm I cannot really change that orders a particular structure as follows:
It was implemented like this:
list.sort((o1, o2) -> {
boolean isXXX1 = "XXX".equals(o1.getCategory());
boolean isXXX2 = "XXX".equals(o2.getCategory());
if(isXXX1) {
return isXXX2 ? 0 : 1;
} else if(isXXX2) {
return -1;
}
return o1.hashCode() - o2.hashCode();
});
Hashcode is created by ReflectionashCode on the PoJo objects, so I do not expect weird collisions.
Occasionally, apparently at random, this piece of code throws IllegalArgumentException: Comparison method violates its general contract!
I tried to debug it, but even with the same data in debug mode it does not break. What could be the cause? I dont see any case where the sorting does not respect the contract
Upvotes: 0
Views: 114
Reputation: 298489
Never use minus as a comparator function.
Your comparator ends with return o1.hashCode() - o2.hashCode();
Unfortunately, developers come up with this idea all the time and for some tests (i.e. with small values), it might seem to work.
The problem is that comparing arbitrary integer values (and hash codes may use the entire int
value range) can lead to overflow, as the distance between two int
values may be larger than the int
value range.
The correct idiom would be return Integer.compare(o1.hashCode(), o2.hashCode());
There is no problem if a hash collision occurs, as that simply means that the two elements would be treated as equal for this particular sort operation, which is fine if you don’t care about their relative order anyway. In fact, you can treat all elements whose relative order is irrelevant as equal, replacing the last line with return 0;
.
Or, simplify the entire operation with
list.sort(Comparator.comparing(o -> "XXX".equals(o.getCategory())));
Since Boolean
values are comparable and its compareTo
method will treat true
as larger than false
, it will move the matching elements to the end of the list.
Upvotes: 2
Reputation: 40047
TL;DR
According to the Arrays.LegacyMergeSort class
, to cater to "broken"comparators, you can opt to use the legacy merge sort by setting a System property. For my tests described below, this worked.
System.setProperty("java.util.Arrays.useLegacyMergeSort", "TRUE");
Details which may allow improving your comparator.
The following program does not answer the question but it may help find it. It generates random data and then sorts using your comparator. Then it loops and shuffles the data until an exception is thrown and prints the original list and the list at the time of exception.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
public class SortingIssue {
record Info(String getCategory) {
@Override
public String toString() {
return getCategory;
}
}
public static void main(String[] args) {
Random r = new Random();
List<Info> list = new ArrayList<>();
for (int i = 0; i < 15; i++) {
int len = r.nextInt(4,8);
StringBuilder sb = new StringBuilder();
for (int k = 0; k <len; k++) {
sb.append((char)r.nextInt('A', 'Z'+1));
}
list.add(new Info(sb.toString()));
}
for (int i = 0; i < 17; i++) {
list.add(new Info("XXX"));
}
for (int i = 0; i < 1000000; i++) {
Collections.shuffle(list);
List<Info> orig = new ArrayList<>(list);
try {
list.sort((o1, o2) -> {
boolean isXXX1 = "XXX".equals(o1.getCategory());
boolean isXXX2 = "XXX".equals(o2.getCategory());
if(isXXX1) {
return isXXX2 ? 0 : 1;
} else if(isXXX2) {
return -1;
}
return o1.hashCode() - o2.hashCode();
});
} catch (Exception e) {
e.printStackTrace();
System.out.println("Original List: " + orig);
System.out.println();
System.out.println("List at exception: " + list);
break;
}
}
}
}
I did not do extensive tests but I noticed that I could not get an list < 32 elements to throw an exception which just happens to be the minimum merge limit for TimSort.
private static final int MIN_MERGE = 32;
Output
java.lang.IllegalArgumentException: Comparison method violates its general contr
act!
at java.base/java.util.TimSort.mergeHi(TimSort.java:903)
at java.base/java.util.TimSort.mergeAt(TimSort.java:520)
at java.base/java.util.TimSort.mergeCollapse(TimSort.java:448)
at java.base/java.util.TimSort.sort(TimSort.java:245)
at java.base/java.util.Arrays.sort(Arrays.java:1308)
at java.base/java.util.ArrayList.sort(ArrayList.java:1804)
at stackOverflow.sandbox.SortingIssue.main(SortingIssue.java:39)
Original List: [PLKRPMX, KAKAZPD, XXX, YVJVMH, FCKDWW, XXX, GQXRTW, XXX, XXX, NZ
PAD, XXX, XXX, XXX, IPPVZL, TLKKCJE, XXX, GHXG, GHPO, XXX, XXX, XXX, XXX, XXX, X
XX, QTRAWN, XXX, XXX, HXKD, UGNV, XXX, RRQP, JVZI]
List at exception: [YVJVMH, TLKKCJE, KAKAZPD, NZPAD, PLKRPMX, FCKDWW, GQXRTW, IP
PVZL, QTRAWN, GHPO, GHXG, HXKD, JVZI, RRQP, UGNV, XXX, XXX, XXX, XXX, XXX, XXX,
XXX, XXX, XXX, XXX, XXX, XXX, XXX, XXX, XXX, XXX, XXX]
Updated
I found this in the Arrays.class
.
/**
* Old merge sort implementation can be selected (for
* compatibility with broken comparators) using a system property.
* Cannot be a static boolean in the enclosing class due to
* circular dependencies. To be removed in a future release.
*/
static final class LegacyMergeSort {
@SuppressWarnings("removal")
private static final boolean userRequested =
java.security.AccessController.doPrivileged(
new sun.security.action.GetBooleanAction(
"java.util.Arrays.useLegacyMergeSort")).booleanValue();
}
}
Once I set the property, I no longer got the problem. The decision on which sort to use is made in Arrays.class
and depends on the property value.
System.setProperty("java.util.Arrays.useLegacyMergeSort", "TRUE");
Instead of sorting, just separate into lists with XXX
and without "XXX"
and add the "XXX"
list to the end of the other.
List<Info> result = list.stream()
.collect(Collectors.collectingAndThen(Collectors.groupingBy(
info -> info.getCategory().equals("XXX") ? "withXXX"
: "withoutXXX"),
map -> {
map.get("withoutXXX").addAll(map.get("withXXX"));
return new ArrayList<>(map.get("withoutXXX"));
}));
Upvotes: 2