Hans.Olo

Reputation: 67

Filter generic collection with lambda

Rephrased question at the end
--------------------------------------------

I have to write two generic methods to filter collections of generic types.

The first removes duplicates and, optionally, nulls from a Collection<? extends Object>.

The second removes duplicates and, optionally, empty/blank strings from a Collection<String>.

private <T extends Collection<C>, C extends Object> T  removeDuplicatesFromCollection(T collection, boolean skipNull)
{
    if(collection == null)
    {
        return collection;
    }

    final HashSet<C> tmp = new HashSet<C>();

    for(C element : collection)
    {
        if(element == null && skipNull)
        {
            continue;
        }
        else
        {
            tmp.add(element);
        }
    }

    collection.clear();
    collection.addAll(tmp);

    return collection;
}

I did not know how to solve this otherwise, so I use a HashSet to filter for unique elements, clear the original collection, and add all entries of the HashSet back.

First I tried to solve this with streams and lambda expressions, but I ran into the problem of how to handle collect().

The version above works, but I'm sure there must be a better way.

The second method should remove duplicates and empty entries from a Collection<String>. I would like to filter the collection again with lambda expressions and Apache StringUtils, but I do not know how to deal with collect() and the generic type T.

private <T extends Collection<String>> T removeDuplicationsFromStringCollection(T collection, boolean skipNull, boolean skipEmpty)
{
    if(collection == null)
    {
        return collection;
    }

    T removedDuplicates = removeDuplicatesFromCollection(collection, skipNull);

    if(removedDuplicates == null || !skipEmpty)
    {
        return removedDuplicates;
    }
    else
    {
        removedDuplicates.stream().filter(s -> StringUtils.isNotBlank(s)).collect(T::new); <== [X Cannot instantiate the type T ]
    }
}

Can anyone help me get this to work? If someone has a better solution for the first method, I would also be grateful, because I am not yet completely satisfied with it myself.

Many thanks

------------[ EDIT ]----------------

I do not know the kind of collection the method receives, but the output type has to be the same as the input type. HashSet => method => HashSet => return; ArrayList => method => ArrayList => return.

I did not want to modify the input collection, but this was the only way I could keep the collection type. The problem isn't filtering the collection while streaming, but the collect() at the end of the stream and creating a generic type that depends on the input type. The problem is keeping the type of the input collection for the output collection.

--------------[ Rephrased ]-----------------
Let me rephrase and simplify the question; I probably shouldn't have written so much around what I want to do. There were some useful hints and tips, but the core problem isn't solved.

private <T extends Collection> T  doFilter(T pCollection, ....
{
    /** code removed for simplicity **/
    return pCollection.stream()
    /*.filter(....)*/
    .collect(  Collectors.toCollection( ?>>  T  <<?? ) );
}

Can anyone tell me how to collect(Collectors.toCollection(T)) in a lambda with a generic type?

So List<Car> => doFilter() => List<Car>
ArrayList<Person> => doFilter() => ArrayList<Person>
Set<Contact> => doFilter() => Set<Contact>
LinkedHashSet<Cat> => doFilter() => LinkedHashSet<Cat>

[T => doFilter() => T]
....

Upvotes: 0

Views: 609

Answers (3)

fps

Reputation: 34460

I think the main problem is to somehow know which specific type of collection you are dealing with, so you can create a new collection of this specific type and add the filtered values to it. It is bad practice to mutate your arguments, i.e. you shouldn't modify the received collection.

So I would use a Supplier<T> as a factory to create your specific collection:

private <T extends Collection<C>, C> T removeDuplicatesFromCollection(
        T collection, 
        boolean skipNull,
        Supplier<? extends T> factory) {

    if (collection == null) return null;

    // Remove duplicates, but keep insertion order (that's what LinkedHashSet is for)
    Set<C> noDuplicates = new LinkedHashSet<>(collection);

    // Remove nulls from the set if the flag is on
    if (skipNull) noDuplicates.removeIf(Objects::isNull);

    // Use the factory to create an empty instance of the collection 
    T newCollection = factory.get();

    // Add the no duplicates set, possibly without nulls
    newCollection.addAll(noDuplicates);

    return newCollection;
}

Now you can use the above method as follows:

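List<Integer> list = Arrays.asList(1, 2, 3, null, 5, 6, 1);

// T is inferred from the argument as List<Integer>, so the result is declared
// as List<Integer>; the factory decides that the concrete instance is an ArrayList
List<Integer> noDupsNoNulls = removeDuplicatesFromCollection(
        list,
        true,
        ArrayList::new); // [1, 2, 3, 5, 6]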

EDIT: If you already have methods that do the same for every concrete collection type, e.g. one for List<C>, another one for Set<C>, etc., you wouldn't need to change all the invocations of those methods. All you'd need to do is change their implementation so that they delegate to the new method above.

For example, assuming you already have the following method:

private <C> HashSet<C> removeDuplicatesFromCollection(
        HashSet<C> collection, 
        boolean skipNull) {

    // .....

}

You could change it as follows:

private <C> HashSet<C> removeDuplicatesFromCollection(
        HashSet<C> collection, 
        boolean skipNull) {

    return removeDuplicatesFromCollection(collection, skipNull, HashSet::new);
}

And if you have, for example, another method for ArrayList:

private <C> ArrayList<C> removeDuplicatesFromCollection(
        ArrayList<C> collection, 
        boolean skipNull) {

    // .....

}

You could change it as follows:

private <C> ArrayList<C> removeDuplicatesFromCollection(
        ArrayList<C> collection, 
        boolean skipNull) {

    return removeDuplicatesFromCollection(collection, skipNull, ArrayList::new);
}

Upvotes: 0

Mykhailo Moskura

Reputation: 2211

You can use the distinct() method on Stream:

Collection<String> list = Arrays.asList("A", "B", "C", "D", "A", "B", "C");
// Get the collection without duplicates, i.e. distinct elements only
List<String> distinctElements =
        list.stream().distinct().collect(Collectors.toList());

This will return all distinct elements which are not null. Note that Collectors.toList() produces a List, so the method has to return List<E> rather than the input type T:

public <E> List<E> removeDuplicatesFromCollection(Collection<E> collection) {
    return collection.stream()
            .filter(Objects::nonNull)
            .distinct()
            .collect(Collectors.toList());
}

You can also use this example with a Supplier:

public static <T, U extends Collection<T>> U removeDuplicatesFromCollection(
        Iterable<T> iterable,
        Supplier<U> collectionType) {
    return StreamSupport.stream(iterable.spliterator(), false)
            .filter(Objects::nonNull)
            .distinct()
            .collect(Collectors.toCollection(collectionType));
}

List<String> strings = Arrays.asList("A", "B", null, "B");
removeDuplicatesFromCollection(strings, ArrayList::new).forEach(System.out::println);
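
To tie this back to the rephrased question ([T => doFilter() => T]), here is a minimal sketch of how a doFilter with a Supplier could look. The class name DoFilterSketch, the Predicate parameter and the parameter names are my assumptions, not something from the question; the Supplier stands in for the "new T()" that Java cannot perform.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.stream.Collectors;

public class DoFilterSketch {

    // Hypothetical signature: the caller passes the factory for the result type,
    // because the method itself has no way to instantiate T.
    public static <E, T extends Collection<E>> T doFilter(
            Collection<E> source,
            Predicate<? super E> keep,
            Supplier<T> factory) {

        return source.stream()
                .filter(keep)                               // keep only matching elements
                .collect(Collectors.toCollection(factory)); // collect into the caller's type
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ann", null, "Bob", "Ann");

        // Same input, two different result types chosen by the caller
        ArrayList<String> list = doFilter(names, Objects::nonNull, ArrayList::new);
        LinkedHashSet<String> set = doFilter(names, Objects::nonNull, LinkedHashSet::new);

        System.out.println(list); // [Ann, Bob, Ann]
        System.out.println(set);  // [Ann, Bob]
    }
}
```

The input collection is never modified here. The price is that every call site has to name the concrete type once, via the constructor reference.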

Upvotes: 0

Blokje5

Reputation: 4993

Using the Java 8 Stream API indeed leads to a cleaner implementation:

public <E> Set<E> removeDuplicatesFromCollectionAndFilterNull(Collection<E> collection) {
  return collection.stream()
    .filter(Objects::nonNull) // Method reference to Objects.nonNull(object): keeps only non-null elements
    .collect(Collectors.toSet()); // Collect to a Set to remove duplicates
}

A similar approach can be used to solve the second problem. Be aware that the one-argument collect method does not take a constructor reference to create the collection, but a Collector. T::new wouldn't work in any case, because T is only a type parameter and Java does not know whether the actual type has a default constructor. This guide might help
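
For the second problem (the String collection), a sketch along the same lines could look like this. It assumes the caller is willing to pass a Supplier for the target collection, and it uses trim().isEmpty() as a rough stand-in for Apache's StringUtils.isNotBlank so that the example has no external dependency; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;

public class StringCollectionFilter {

    // Hypothetical helper: removes duplicates and, depending on the flags,
    // nulls and blank strings. The factory creates the result collection.
    public static <T extends Collection<String>> T removeDuplicatesFromStrings(
            Collection<String> source,
            boolean skipNull,
            boolean skipBlank,
            Supplier<T> factory) {

        return source.stream()
                // drop nulls only when skipNull is set
                .filter(s -> !(skipNull && s == null))
                // drop empty/whitespace-only strings only when skipBlank is set
                .filter(s -> !(skipBlank && s != null && s.trim().isEmpty()))
                .distinct()
                .collect(Collectors.toCollection(factory));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", " ", null, "a", "", "b");

        ArrayList<String> asList =
                removeDuplicatesFromStrings(input, true, true, ArrayList::new);
        LinkedHashSet<String> asSet =
                removeDuplicatesFromStrings(input, true, true, LinkedHashSet::new);

        System.out.println(asList); // [a, b]
        System.out.println(asSet);  // [a, b]
    }
}
```

If nulls are kept (skipNull = false), the supplied collection type has to accept null elements; ArrayList and LinkedHashSet do.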

Upvotes: 2
