James Kleeh

Reputation: 12228

Java 8 Stream API - Select the lowest key after group by

I have a stream of Foo objects.

class Foo {
    private int variableCount;
    public Foo(int vars) {
        this.variableCount = vars; 
    }
    public Integer getVariableCount() { 
      return variableCount; 
    }
}

I want a list of the Foos that share the lowest variableCount.

For example

new Foo(3), new Foo(3), new Foo(2), new Foo(1), new Foo(1)

I only want the stream to return the last 2 Foos, since they have the lowest value.

I've tried doing a collect with grouping by

.collect(Collectors.groupingBy((Foo foo) -> {
    return foo.getVariableCount();
}))

That returns a Map<Integer, List<Foo>>, and I'm not sure how to transform it into what I want.

Thanks in advance

Upvotes: 17

Views: 4272

Answers (7)

rgettman

Reputation: 178253

Here is a solution that:

  1. Only streams the list once.
  2. Doesn't build a map or other structure that contains all of the input items (unless the variable counts are all the same), only keeping those that are currently the minimum.
  3. Is O(n) time and, in the worst case, O(n) space. If all Foos have the same variable count, this solution stores every item, just like the other solutions. In practice, with varied values, the number of items held in the list at any time is likely to be much lower.

Edited

I've improved my solution according to the suggestions in the comments.

I implemented an accumulator object, which supplies functions to the Collector for this.

/**
 * Accumulator object to hold the current min
 * and the list of Foos that are the min.
 */
class Accumulator {
    Integer min;
    List<Foo> foos;

    Accumulator() {
        min = Integer.MAX_VALUE;
        foos = new ArrayList<>();
    }

    void accumulate(Foo f) {
        Integer count = f.getVariableCount();
        if (count != null) {
            if (count < min) {
                // New minimum: drop everything collected so far.
                min = count;
                foos.clear();
                foos.add(f);
            } else if (count.equals(min)) {
                // equals, not ==: both sides are Integer objects.
                foos.add(f);
            }
        }
    }

    Accumulator combine(Accumulator other) {
        if (min < other.min) {
            return this;
        }
        else if (min > other.min) {
            return other;
        }
        else {
            foos.addAll(other.foos);
            return this;
        }
    }

    List<Foo> getFoos() { return foos; }
}

Then all we have to do is collect, referencing the accumulator's methods for its functions.

List<Foo> mins = foos.stream().collect(Collector.of(
    Accumulator::new,
    Accumulator::accumulate,
    Accumulator::combine,
    Accumulator::getFoos
    )
);

Testing with

List<Foo> foos = Arrays.asList(new Foo(3), new Foo(3), new Foo(2), new Foo(1), new Foo(1), new Foo(4));

The output is (with a suitable toString defined on Foo):

[Foo{1}, Foo{1}]

Upvotes: 11

davidxxx

Reputation: 131326

To avoid creating the map, you could use two streams:

  • the first finds the minimum value.
  • the second filters the elements that have this value.

That could look like this:

List<Foo> foos = ...;
int min = foos.stream()
              .mapToInt(Foo::getVariableCount)
              .min()
              .orElseThrow(RuntimeException::new); // technical error

List<Foo> minFoos = foos.stream()
    .filter(f -> f.getVariableCount() == min)
    .collect(Collectors.toList());
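
For reference, here is a minimal self-contained sketch of the same two-pass idea. It assumes a simplified stand-in for Foo (the class `Item` and the helper `mins` are hypothetical names, not from the question) and returns an empty list for empty input instead of throwing:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.OptionalInt;
import java.util.stream.Collectors;

class TwoPassMin {
    // Hypothetical stand-in for the question's Foo class.
    static class Item {
        private final int variableCount;
        Item(int variableCount) { this.variableCount = variableCount; }
        int getVariableCount() { return variableCount; }
        @Override public String toString() { return "Item{" + variableCount + "}"; }
    }

    static List<Item> mins(List<Item> items) {
        // First pass: find the smallest count, if there is any element at all.
        OptionalInt min = items.stream().mapToInt(Item::getVariableCount).min();
        if (!min.isPresent()) {
            return Collections.emptyList();
        }
        int minValue = min.getAsInt();
        // Second pass: keep only the items whose count equals the minimum.
        return items.stream()
                .filter(i -> i.getVariableCount() == minValue)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item(3), new Item(3), new Item(2), new Item(1), new Item(1));
        System.out.println(mins(items)); // prints [Item{1}, Item{1}]
    }
}
```

Both passes are O(n), and nothing is stored apart from the result list, at the cost of needing a source that can be streamed twice.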

Upvotes: 1

tsolakp

Reputation: 5948

Here is an alternative with one stream and a custom reducer. The idea is to sort first and then collect only the elements equal to the first (minimum) value:

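    // Note: reduce() is intended for immutable values; mutating the identity
    // list like this only works on a sequential stream.
    List<Foo> newlist = list.stream()
    .sorted( Comparator.comparing(Foo::getVariableCount) )
    .reduce( new ArrayList<>(), 
         (l, f) -> { 
             // equals, not ==: getVariableCount() returns Integer objects
             if ( l.isEmpty() || l.get(0).getVariableCount().equals(f.getVariableCount()) ) l.add(f); 
             return l;
         }, 
         (l1, l2) -> {
             l1.addAll(l2); 
             return l1;
         } 
    );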

Or, more compactly, using collect:

    List<Foo> newlist = list.stream()
    .sorted( Comparator.comparing(Foo::getVariableCount) )
    .collect( ArrayList::new, 
         (l, f) -> { if ( l.isEmpty() || l.get(0).getVariableCount().equals(f.getVariableCount()) ) l.add(f); },
         List::addAll
    );

Upvotes: 1

Vinay Prajapati

Reputation: 7504

You can call collect on the sorted stream and put the selection logic in the accumulator: add the first element to the empty list, then add any other Foo whose variable count equals that of the first element in the list.

A complete working example:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class Foo {
    private int variableCount;

    public Foo(int vars) {
        this.variableCount = vars;
    }

    public Integer getVariableCount() {
        return variableCount;
    }

    public static void main(String[] args) {
        List<Foo> list = Arrays.asList(
                new Foo(2),
                new Foo(2),
                new Foo(3),
                new Foo(3),
                new Foo(1),
                new Foo(1)
        );

        System.out.println(list.stream()
                .sorted(Comparator.comparing(Foo::getVariableCount))
                .collect(() -> new ArrayList<Foo>(),
                        (ArrayList<Foo> arrayList, Foo e) -> {
                            if (arrayList.isEmpty()
                                    || arrayList.get(0).getVariableCount().equals(e.getVariableCount())) {
                                arrayList.add(e);
                            }
                        },
                        (ArrayList<Foo> foos, ArrayList<Foo> foo) -> foos.addAll(foo)
                )

        );
    }

    @Override
    public String toString() {
        return "Foo{" +
                "variableCount=" + variableCount +
                '}';
    }
}

Alternatively, you could sort the list to find the minimum variableCount and then use that value in the filter of a stream (this needs an import of java.util.stream.Collectors).

    list.sort(Comparator.comparing(Foo::getVariableCount)); // sorts the list in place
    int min = list.get(0).getVariableCount();                // assumes the list is not empty
    List<Foo> mins = list.stream()
            .filter(foo -> foo.getVariableCount() == min)
            .collect(Collectors.toList());

In any case, I think you need either a sort or some other way to find the minimum, which can then be used in the predicate, even if you group the values with a map.

Cheers!

Upvotes: 1

Eugene

Reputation: 120848

If you are OK with streaming (iterating) twice:

private static List<Foo> mins(List<Foo> foos) {
    return foos.stream()
            .map(Foo::getVariableCount)
            .min(Comparator.naturalOrder())
            .map(x -> foos.stream()
                          .filter(y -> y.getVariableCount().equals(x)) // equals, not ==: both are Integer objects
                          .collect(Collectors.toList()))
            .orElse(Collections.emptyList());
}
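
One caveat applies to any of these approaches: getVariableCount() returns an Integer, so two such values should be compared with equals (or unboxed to int first) rather than with ==, which compares object references when both operands are Integer. A small sketch of the difference, using a hypothetical helper named `sameCount`:

```java
class IntegerEqualityDemo {
    // Hypothetical helper: compares two boxed counts by value.
    static boolean sameCount(Integer a, Integer b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Integer a = 1000;
        Integer b = 1000;
        // Reference comparison: usually false for values outside the small-integer cache.
        System.out.println(a == b);
        // Value comparison.
        System.out.println(sameCount(a, b)); // prints true
        // Comparing against a primitive int unboxes the Integer, so == is safe here.
        int unboxed = b;
        System.out.println(a == unboxed);    // prints true
    }
}
```

With small counts such as 1, 2 and 3 the == version tends to work by accident, because those boxed values are cached, which makes the bug easy to miss in tests.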

Upvotes: 6

James Kleeh

Reputation: 12228

To avoid both creating the entire map and streaming twice, I copied a custom collector from https://stackoverflow.com/a/30497254/1264846 and modified it to work with min instead of max. I didn't even know custom collectors were possible, so I thank @lexicore for pointing me in that direction.

This is the resulting function, minAll:

public static <T, A, D> Collector<T, ?, D> minAll(Comparator<? super T> comparator,
                                                  Collector<? super T, A, D> downstream) {
    Supplier<A> downstreamSupplier = downstream.supplier();
    BiConsumer<A, ? super T> downstreamAccumulator = downstream.accumulator();
    BinaryOperator<A> downstreamCombiner = downstream.combiner();
    class Container {
        A acc;
        T obj;
        boolean hasAny;

        Container(A acc) {
            this.acc = acc;
        }
    }
    Supplier<Container> supplier = () -> new Container(downstreamSupplier.get());
    BiConsumer<Container, T> accumulator = (acc, t) -> {
        if(!acc.hasAny) {
            downstreamAccumulator.accept(acc.acc, t);
            acc.obj = t;
            acc.hasAny = true;
        } else {
            int cmp = comparator.compare(t, acc.obj);
            if (cmp < 0) {
                acc.acc = downstreamSupplier.get();
                acc.obj = t;
            }
            if (cmp <= 0)
                downstreamAccumulator.accept(acc.acc, t);
        }
    };
    BinaryOperator<Container> combiner = (acc1, acc2) -> {
        if (!acc2.hasAny) {
            return acc1;
        }
        if (!acc1.hasAny) {
            return acc2;
        }
        int cmp = comparator.compare(acc1.obj, acc2.obj);
        if (cmp < 0) {
            return acc1;
        }
        if (cmp > 0) {
            return acc2;
        }
        acc1.acc = downstreamCombiner.apply(acc1.acc, acc2.acc);
        return acc1;
    };
    Function<Container, D> finisher = acc -> downstream.finisher().apply(acc.acc);
    return Collector.of(supplier, accumulator, combiner, finisher);
}
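
A usage sketch follows. So that it compiles on its own, it restates the collector in compact form inside a hypothetical class `MinAllDemo`; the wrapper `minsOf` is also a hypothetical convenience, not part of the original collector.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class MinAllDemo {
    // Compact restatement of minAll: keeps only the elements that tie for the minimum.
    static <T, A, D> Collector<T, ?, D> minAll(Comparator<? super T> comparator,
                                               Collector<? super T, A, D> downstream) {
        Supplier<A> newAcc = downstream.supplier();
        BiConsumer<A, ? super T> add = downstream.accumulator();
        BinaryOperator<A> merge = downstream.combiner();
        Function<A, D> finish = downstream.finisher();
        class Container {
            A acc = newAcc.get();
            T obj;          // one representative of the current minimum
            boolean hasAny; // false until the first element arrives
        }
        Supplier<Container> supplier = () -> new Container();
        BiConsumer<Container, T> accumulator = (c, t) -> {
            int cmp = c.hasAny ? comparator.compare(t, c.obj) : -1;
            if (cmp < 0) {
                if (c.hasAny) {
                    c.acc = newAcc.get(); // strictly smaller: discard the previous ties
                }
                c.obj = t;
                c.hasAny = true;
            }
            if (cmp <= 0) {
                add.accept(c.acc, t);
            }
        };
        BinaryOperator<Container> combiner = (c1, c2) -> {
            if (!c2.hasAny) return c1;
            if (!c1.hasAny) return c2;
            int cmp = comparator.compare(c1.obj, c2.obj);
            if (cmp < 0) return c1;
            if (cmp > 0) return c2;
            c1.acc = merge.apply(c1.acc, c2.acc); // same minimum on both sides: merge the ties
            return c1;
        };
        Function<Container, D> finisher = c -> finish.apply(c.acc);
        return Collector.of(supplier, accumulator, combiner, finisher);
    }

    // Hypothetical convenience wrapper: all minimal elements of a list, in encounter order.
    static <T> List<T> minsOf(List<T> items, Comparator<? super T> comparator) {
        return items.stream().collect(minAll(comparator, Collectors.<T>toList()));
    }

    public static void main(String[] args) {
        List<Integer> counts = Arrays.asList(3, 3, 2, 1, 1);
        System.out.println(minsOf(counts, Comparator.<Integer>naturalOrder())); // prints [1, 1]
    }
}
```

For the question's class, the equivalent call would presumably be foos.stream().collect(minAll(Comparator.comparing(Foo::getVariableCount), Collectors.toList())).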

Upvotes: 1

lexicore

Reputation: 43651

You can use a sorted map for grouping and then just get the first entry. Something along these lines:

foos.stream()
    .collect(Collectors.groupingBy(
        Foo::getVariableCount,
        TreeMap::new,
        Collectors.toList()))
    .firstEntry()   // null if the stream was empty
    .getValue()
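
A self-contained sketch of this approach, assuming a hypothetical generic helper `minsBy` and using strings keyed by length in place of Foo, with a guard for the empty case:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class TreeMapMinDemo {
    // Hypothetical helper: groups items by an Integer key and returns the group with the smallest key.
    static <T> List<T> minsBy(List<T> items, Function<T, Integer> key) {
        TreeMap<Integer, List<T>> grouped = items.stream()
                .collect(Collectors.groupingBy(key, TreeMap::new, Collectors.toList()));
        // firstEntry() returns null when the map is empty, so guard before calling getValue().
        Map.Entry<Integer, List<T>> first = grouped.firstEntry();
        if (first == null) {
            return Collections.emptyList();
        }
        return first.getValue();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("ccc", "a", "bb", "d");
        System.out.println(minsBy(words, String::length)); // prints [a, d]
    }
}
```

This is short and readable, but it does build the full map of all groups before discarding everything except the first entry.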

Upvotes: 15
