You Kuper
You Kuper

Reputation: 1113

Unique entries in HashSet<List<T>> where list might have null entries

There is a List<MyElement> = new ArrayList<MyElement>();

class MyElement {
  private Object[] values;
  //...
}

I need to find all unique entries in this List. I would use HashSet, BUT the problem is that values may contain null AND it should be assumed that null is equal to any other value. For instance, Object[] o1 = new Object[]{1,null,"s2"} and Object[] o2 = new Object[]{1,2,"s2"} should be considered as the same entries (i.e. non-unique), and only one of them should be kept in the HashSet. Is there any way to override proper functions in HashSet?

Upvotes: 4

Views: 246

Answers (2)

Mister Smith
Mister Smith

Reputation: 28188

Your problem is that null references should not be equals to anything, as the equals contract states:

For any non-null reference value x, x.equals(null) should return false.

So if your values field is meaningful for your equals implementation, then you cannot implement what you say without breaking the contract.

I'd replace the Object[] field for a List one, and implement equals in the MyElement class. This will in turn provide a meaningful equals for the list, as its contract states. Of course if you override equals, you should override hashcode as well to keep the thing consistent.

I'd leave the good old HashSet untouched, keep in mind that writting correct collections is not a trivial task, no matter how easy it can seem at first glance. So override your MyElement hashcode and equals methods to fit your needs without breaking both contracts.

Upvotes: 1

ARRG
ARRG

Reputation: 2496

Do you really need the O(1) time for add() and contains() ? I can not see a good way to write a hashCode() function for your MyElement class that would satisfy your requirements.

A Comparator (or making MyElement Comparable) however could do the trick, and you could then use a TreeSet to find out the unique elements of your list.

Here is a first attempt (you shouldn't use it as-is, it probably won't work).

class MyElementComparator implements Comparator<MyElement> {
    @Override
    public int compare(MyElement e, MyElement f) {
        int sizeCmp = e.values.length - f.values.length;

        if(sizeCmp != 0) // Lists are of different sizes, elements aren't equal
            return sizeCmp; 

        // Start comparing element by element
        for(int i=0; i<e.values.length; i++) {
            Object eo = e.values[i];
            Object fo = f.values[i];

            // Null is a wildcard
            if(eo == null || fo == null)
                continue;

            // If objects are the same, then continue too.
            if(eo == fo || eo.equals(fo))
                continue;

            // Otherwise, decide on one object or the other based on hashcode (or any other valid mean).
                return eo.hashCode() - fo.hashCode();
        }

        // All elements were equal or skipped, then the objects are equal.
        return 0;
    }
}

A quick tests seems to indicate it works:

    MyElement a = new MyElement(1, null, "s2");
    MyElement b = new MyElement(1, 2, "s2");
    MyElement c = new MyElement(null, "s", 3);

    TreeSet<MyElement> set = new TreeSet<MyElement>(new MyElementComparator());
    set.add(a);
    set.add(b);
    set.add(c);
    System.out.println(set.size()); // 2

But the thing will fail if you add to the set an element that is equal to two other elements that are different. For example {1} and {2} are different, but if you add {null}, then the set should be reduced to {null} and this won't happen.

No Comparator will achieve that, you will need another data structure, maybe a Disjoint set (Union Find) ? http://en.wikipedia.org/wiki/Disjoint-set_data_structure

Upvotes: 1

Related Questions