Ihor Rybak

Reputation: 3279

Distinguish objects by different fields for different contexts

Let's say there is an immutable class like this:

import java.time.LocalDate;
import java.util.Objects;

public class Foo {
    private final Long id;
    private final String name;
    private final LocalDate date;

    public Foo(Long id, String name, LocalDate date) {
        this.id = id;
        this.name = name;
        this.date = date;
    }

    public Long getId() {
        return id;
    }

    public String getName() {
        return name;
    }

    public LocalDate getDate() {
        return date;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Foo foo = (Foo) o;
        return Objects.equals(getId(), foo.getId()) &&
                Objects.equals(getName(), foo.getName()) &&
                Objects.equals(getDate(), foo.getDate());
    }

    @Override
    public int hashCode() {
        return Objects.hash(getId(), getName(), getDate());
    }
}

There is a collection of objects of this class. In some cases the objects need to be distinguished only by name, and in other cases by name and date.

So passing the collection to a java.util.Set<Foo>, or creating a Java 8 Stream<Foo> and calling .distinct(), does not work for this case, because both rely on equals and hashCode, which compare all three fields. I know it is possible to do this using a TreeSet and a Comparator. It looks like this:

private Set<Foo> distinct(List<Foo> foos, Comparator<Foo> comparator) {
    TreeSet<Foo> treeSet = new TreeSet<>(comparator);
    treeSet.addAll(foos);
    return treeSet;
}

usage:

distinct(foos, Comparator.comparing(Foo::getName)); // distinct by name
distinct(foos, Comparator.comparing(Foo::getName).thenComparing(Foo::getDate)); // distinct by name and date

But I don't think that is a good way to do it. What is the most elegant way to solve this problem?

Upvotes: 2

Views: 125

Answers (1)

fps

Reputation: 34460

First, let's consider your current approach, then I'll show a better alternative.

Your current approach is succinct and already uses a TreeSet, which is all it needs. If you are OK with the O(n log n) complexity imposed by its red-black tree structure, I would only change your current code to:

public static <T> Set<T> distinct(
        Collection<? extends T> list, 
        Comparator<? super T> comparator) {

    Set<T> set = new TreeSet<>(comparator);
    set.addAll(list);
    return set;
}

Note that I've made your method generic and static, so that it can be used in a generic way for any collection, no matter the type of its elements. I've also changed the first argument to Collection, so that it can be used with more data structures.

Keep in mind that TreeSet has O(n log n) time complexity for adding n elements, because it uses a TreeMap (a red-black tree) as its backing structure.


Using a TreeSet has three disadvantages: first, it sorts your elements according to the passed Comparator (maybe you don't need this); second, the time complexity is O(n log n) (which might be way too much if all you need is distinct elements); and third, it returns a Set (which might not be the type of collection the caller needs).
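To illustrate the first point, here's a minimal sketch. The sample strings and the class name TreeSetOrderDemo are made up for the example, and I'm de-duplicating by string length just to keep it short. The distinct method is the one shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class TreeSetOrderDemo {

    // Same helper as above: elements the comparator reports as equal count as duplicates.
    static <T> Set<T> distinct(
            Collection<? extends T> list,
            Comparator<? super T> comparator) {

        Set<T> set = new TreeSet<>(comparator);
        set.addAll(list);
        return set;
    }

    // Hypothetical sample data: plain strings, de-duplicated by length.
    static List<String> distinctByLength() {
        List<String> words = Arrays.asList("pear", "fig", "plum", "kiwi", "apple");
        Set<String> set = distinct(words, Comparator.comparingInt(String::length));
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // Input order was pear, fig, ...; the result comes back sorted by length.
        System.out.println(distinctByLength()); // [fig, pear, apple]
    }
}
```

Note that "fig" now comes before "pear", even though "pear" was first in the input. If the caller cares about the original order, that is a problem.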

So, here's another approach that returns a Stream, which you can then collect to the data-structure you want:

@SafeVarargs // the varargs array is only read, never written to
public static <T> Stream<T> distinctBy(
        Collection<? extends T> list, 
        Function<? super T, ?>... extractors) {

    Map<List<Object>, T> map = new LinkedHashMap<>();  // preserves insertion order
    list.forEach(e -> {
        List<Object> key = new ArrayList<>();
        Arrays.asList(extractors)
                .forEach(f -> key.add(f.apply(e)));    // builds key
        map.merge(key, e, (oldVal, newVal) -> oldVal); // keeps old value
    });
    return map.values().stream();
}

This converts every element of the passed collection to a list of objects, according to the extractor functions passed as the varargs argument.

Then, each element is put into a LinkedHashMap under this key. If the key is already present, the merge function keeps the value that was put first (change this as per your needs).

Finally, a stream is returned from the values of the map, so that the caller can do whatever she wants with it.
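Regarding "change this as per your needs": one possible variation, just as a sketch, is to let the caller pass the merge rule. The method distinctByKey, the class MergePolicyDemo and the sample strings are hypothetical names of mine, and I've used a single key extractor to keep the signature simple.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MergePolicyDemo {

    // Hypothetical variant: one key extractor, and the caller decides which duplicate survives.
    static <T, K> Stream<T> distinctByKey(
            Collection<? extends T> list,
            Function<? super T, ? extends K> keyExtractor,
            BinaryOperator<T> merge) {

        Map<K, T> map = new LinkedHashMap<>(); // preserves insertion order of keys
        for (T e : list) {
            map.merge(keyExtractor.apply(e), e, merge);
        }
        return map.values().stream();
    }

    static List<String> keepFirst(List<String> words) {
        return distinctByKey(words, String::length, (oldVal, newVal) -> oldVal)
                .collect(Collectors.toList());
    }

    static List<String> keepLast(List<String> words) {
        return distinctByKey(words, String::length, (oldVal, newVal) -> newVal)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "plum", "apple");
        System.out.println(keepFirst(words)); // [pear, fig, apple]
        System.out.println(keepLast(words));  // [plum, fig, apple]
    }
}
```

With keepLast, "plum" replaces "pear" but stays in the position where the key was first seen, because a LinkedHashMap does not reorder a key when its value is updated.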

Note: this approach requires that all the objects returned by the extractor functions implement the equals and hashCode methods consistently, so that the list formed by them can be safely used as the key of the map.
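Here's a small sketch of what that note means in practice. The helper key and the class ListKeyDemo are hypothetical; key just builds the same kind of list that distinctBy uses internally.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;

class ListKeyDemo {

    // Builds the kind of key distinctBy uses: a list of the extracted values.
    static List<Object> key(Object... parts) {
        return Arrays.asList(parts);
    }

    public static void main(String[] args) {
        // String and LocalDate have value-based equals/hashCode,
        // so equal parts produce equal keys with equal hash codes.
        List<Object> k1 = key("a", LocalDate.of(2020, 1, 1));
        List<Object> k2 = key("a", LocalDate.of(2020, 1, 1));
        System.out.println(k1.equals(k2) && k1.hashCode() == k2.hashCode()); // true

        // Arrays only have identity-based equals/hashCode,
        // so two keys with the same array contents are still different keys.
        List<Object> k3 = key((Object) new int[] {1, 2});
        List<Object> k4 = key((Object) new int[] {1, 2});
        System.out.println(k3.equals(k4)); // false
    }
}
```

So an extractor that returns an array (or any object without a proper equals and hashCode) would make every element look distinct.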

Usage:

List<Foo> result1 = distinctBy(foos, Foo::getName)
    .collect(Collectors.toList());

Set<Foo> result2 = distinctBy(foos, Foo::getName, Foo::getDate)
    .collect(Collectors.toSet());
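To check both calls end to end, here's a self-contained sketch. The Foo in it is a trimmed-down stand-in for yours (no equals/hashCode), the sample data is made up, and DistinctByDemo plus the two idsDistinctBy... helpers are hypothetical names. I collect the ids so the result is easy to read.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DistinctByDemo {

    // Trimmed-down stand-in for the question's Foo (equals/hashCode left out for brevity).
    static final class Foo {
        private final Long id;
        private final String name;
        private final LocalDate date;

        Foo(Long id, String name, LocalDate date) {
            this.id = id;
            this.name = name;
            this.date = date;
        }

        Long getId() { return id; }
        String getName() { return name; }
        LocalDate getDate() { return date; }
    }

    // Same method as above, copied here so the sketch compiles on its own.
    @SafeVarargs
    static <T> Stream<T> distinctBy(
            Collection<? extends T> list,
            Function<? super T, ?>... extractors) {

        Map<List<Object>, T> map = new LinkedHashMap<>();
        list.forEach(e -> {
            List<Object> key = new ArrayList<>();
            Arrays.asList(extractors).forEach(f -> key.add(f.apply(e)));
            map.merge(key, e, (oldVal, newVal) -> oldVal);
        });
        return map.values().stream();
    }

    // Made-up sample data: ids 1 and 4 share both name and date.
    static List<Foo> sample() {
        LocalDate d1 = LocalDate.of(2020, 1, 1);
        LocalDate d2 = LocalDate.of(2020, 1, 2);
        return Arrays.asList(
                new Foo(1L, "a", d1),
                new Foo(2L, "a", d2),
                new Foo(3L, "b", d1),
                new Foo(4L, "a", d1));
    }

    static List<Long> idsDistinctByName() {
        return distinctBy(sample(), Foo::getName)
                .map(Foo::getId)
                .collect(Collectors.toList());
    }

    static List<Long> idsDistinctByNameAndDate() {
        return distinctBy(sample(), Foo::getName, Foo::getDate)
                .map(Foo::getId)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(idsDistinctByName());        // [1, 3]
        System.out.println(idsDistinctByNameAndDate()); // [1, 2, 3]
    }
}
```

In both cases the first element seen for each key is the one that is kept, and the original order is preserved.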

Upvotes: 3
