Andre Couture

Reputation: 107

Using stream, filter and average on a list with JDK 8

I have a list of data that looks like this:

{id, datastring}

{1,"a:1|b:2|d:3"}
{2,"a:2|c:2|c:4"}
{3,"a:2|bb:2|a:3"}
{4,"a:3|e:2|ff:3"}

What I need to do is run operations on it, such as computing the average for each element, or finding all ids for which an element in the string is less than a certain value.

Here are some examples:

Averages

{a,2}{b,2}{bb,2}{c,3}{d,3}{e,2}{ff,3}

Find all ids where c<4

{2}

Find all ids where a<3

{1,2,3}

Would this be a good use of stream() and filter()?

Upvotes: 1

Views: 2376

Answers (2)

Eugene

Reputation: 120968

I know that you already have your answer, but here are my versions too:

// getElements() is assumed to return a Multimap<String, Integer> (presumably Guava's).
Map<String, Double> result = list.stream()
        .map(Data::getElements)
        .flatMap((Multimap<String, Integer> map) -> map.entries().stream())
        .collect(Collectors.groupingBy(Map.Entry::getKey,
                Collectors.averagingInt((Map.Entry<String, Integer> token) -> token.getValue())));
System.out.println(result);

List<Integer> result2 = list.stream()
        .filter((Data data) -> data.getElements().get("c").stream().anyMatch(i -> i < 4))
        .map(Data::getId)
        .collect(Collectors.toList());
System.out.println(result2);
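
The snippets above rely on a Multimap for the elements of each row. As a rough sketch, the same averages can be computed with JDK types only, assuming each row is held as a Map<String, List<Integer>>. The class name JdkOnlyAverages and the method averages are made up for illustration, and the sample rows in main are a trimmed version of the question's data:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the original answer.
final class JdkOnlyAverages {
    // Averages every value seen for a key, across all rows.
    static Map<String, Double> averages(List<Map<String, List<Integer>>> rows) {
        return rows.stream()
                .flatMap(row -> row.entrySet().stream())
                // Expand each (key, [v1, v2, ...]) into one (key, v) pair per value.
                .flatMap(e -> e.getValue().stream()
                        .map(v -> new SimpleEntry<>(e.getKey(), v)))
                // TreeMap keeps the keys sorted so the printed result is stable.
                .collect(Collectors.groupingBy(Map.Entry::getKey,
                        TreeMap::new,
                        Collectors.averagingInt(Map.Entry::getValue)));
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> row1 = new HashMap<>();
        row1.put("a", Arrays.asList(1));
        Map<String, List<Integer>> row2 = new HashMap<>();
        row2.put("a", Arrays.asList(2));
        row2.put("c", Arrays.asList(2, 4));
        System.out.println(averages(Arrays.asList(row1, row2))); // prints {a=1.5, c=3.0}
    }
}
```

Flattening to one pair per value before grouping means a key that occurs twice in a row (such as c in row 2) contributes both values to its average.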

Upvotes: 0

Alexis C.

Reputation: 93872

Yes, you can use stream operations to achieve that, but I would suggest creating a class for this data, so that each row corresponds to one specific instance. That will make your life easier IMO.

class Data {
    private int id;
    private Map<String, List<Integer>> map;
    ....
}
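
The class above does not show how a row's string ends up in the map. Here is a minimal sketch of that step, assuming the format from the question (key:value pairs separated by |) and well-formed input. RowParser and parse are hypothetical names, not part of this answer:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns "a:1|b:2|d:3" into {a=[1], b=[2], d=[3]}.
final class RowParser {
    // Assumes every token looks like "key:int"; malformed input throws.
    static Map<String, List<Integer>> parse(String dataString) {
        Map<String, List<Integer>> map = new LinkedHashMap<>();
        // "|" is a regex metacharacter, so it must be escaped for split().
        for (String token : dataString.split("\\|")) {
            String[] pair = token.split(":", 2);
            // A repeated key (e.g. "c:2|c:4") appends to the same list.
            map.computeIfAbsent(pair[0], k -> new ArrayList<>())
               .add(Integer.parseInt(pair[1].trim()));
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(parse("a:2|c:2|c:4")); // prints {a=[2], c=[2, 4]}
    }
}
```

Storing a List<Integer> per key is what lets a row such as {2,"a:2|c:2|c:4"} keep both values of c.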

That said, let's take a look at how you could implement this. First, the find-all implementation:

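// Requires: import static java.util.stream.Collectors.toSet;
public static Set<Integer> ids(List<Data> list, String value, Predicate<Integer> boundPredicate) {
    return list.stream()
               // keep rows that have the key at all...
               .filter(d -> d.getMap().containsKey(value))
               // ...and at least one value for it that satisfies the predicate
               .filter(d -> d.getMap().get(value).stream().anyMatch(boundPredicate))
               .map(Data::getId)
               .collect(toSet());
}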

This one is simple to read. You get a Stream<Data> from the list, then filter it so that you keep only the instances whose map contains the given key and has at least one value for that key satisfying the predicate. Then you map each instance to its id and collect the resulting stream into a Set.

Example of call:

Set<Integer> set = ids(list, "a", value -> value < 3);

which outputs:

[1, 2, 3]
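
For anyone who wants to run the method end to end, here is a self-contained sketch. The Data constructor, the IdsDemo wrapper and the trimmed sample rows are assumptions made for the demo. It also uses getOrDefault and a TreeSet, which are small variations on the code above and not the original implementation:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class IdsDemo {
    // Minimal stand-in for the Data class; its constructor is an assumption.
    static final class Data {
        private final int id;
        private final Map<String, List<Integer>> map;
        Data(int id, Map<String, List<Integer>> map) { this.id = id; this.map = map; }
        int getId() { return id; }
        Map<String, List<Integer>> getMap() { return map; }
    }

    static Set<Integer> ids(List<Data> list, String key, Predicate<Integer> bound) {
        return list.stream()
                // getOrDefault folds the containsKey check into a single lookup.
                .filter(d -> d.getMap().getOrDefault(key, Collections.emptyList())
                              .stream().anyMatch(bound))
                .map(Data::getId)
                // TreeSet keeps ids sorted so the printed result is stable.
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> m1 = new HashMap<>();
        m1.put("a", Arrays.asList(1));
        Map<String, List<Integer>> m2 = new HashMap<>();
        m2.put("a", Arrays.asList(2));
        m2.put("c", Arrays.asList(2, 4));   // row 2 from the question: c:2|c:4
        Map<String, List<Integer>> m4 = new HashMap<>();
        m4.put("a", Arrays.asList(3));
        List<Data> list = Arrays.asList(new Data(1, m1), new Data(2, m2), new Data(4, m4));

        System.out.println(ids(list, "c", v -> v < 4)); // prints [2]
        System.out.println(ids(list, "a", v -> v < 3)); // prints [1, 2]
    }
}
```

Because the filter uses anyMatch, row 2 matches c<4 even though one of its c values is 4. Switching to allMatch would require every value for the key to satisfy the bound.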

The average request was a bit trickier. I ended up with a different implementation: you get a Map<String, IntSummaryStatistics> at the end, which contains the average but also other information (count, sum, min and max).

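// Requires: import static java.util.stream.Collectors.toMap;
Map<String, IntSummaryStatistics> stats = list.stream()
                .flatMap(d -> d.getMap().entrySet().stream())
                .collect(toMap(Map.Entry::getKey,
                               // one statistics object per (key, values) entry
                               e -> e.getValue().stream().mapToInt(i -> i).summaryStatistics(),
                               // same key seen in another row: merge the statistics
                               (i1, i2) -> {i1.combine(i2); return i1;}));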

You first get a Stream<Data>, then you flatMap the entry set of each map to get a Stream<Entry<String, List<Integer>>>. You then collect this stream into a map whose keys are the entries' keys and whose values are the IntSummaryStatistics computed from each List<Integer>. If two entries have the same key, you combine their respective IntSummaryStatistics values.

Given your data set, you get the following Map<String, IntSummaryStatistics>:

ff => IntSummaryStatistics{count=1, sum=3, min=3, average=3.000000, max=3}
bb => IntSummaryStatistics{count=1, sum=2, min=2, average=2.000000, max=2}
a => IntSummaryStatistics{count=5, sum=11, min=1, average=2.200000, max=3}
b => IntSummaryStatistics{count=1, sum=2, min=2, average=2.000000, max=2}
c => IntSummaryStatistics{count=2, sum=6, min=2, average=3.000000, max=4}
d => IntSummaryStatistics{count=1, sum=3, min=3, average=3.000000, max=3}
e => IntSummaryStatistics{count=1, sum=2, min=2, average=2.000000, max=2}

from which you can easily grab the average.
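
If only the averages are needed, one possible way to reduce the statistics map is sketched below. AverageExtractor and averagesOf are hypothetical names, and the two sample entries in main are rebuilt by hand from the values of a and c in the question:

```java
import java.util.IntSummaryStatistics;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper, not part of the original answer.
final class AverageExtractor {
    // Keeps only the average from each IntSummaryStatistics, sorted by key.
    static Map<String, Double> averagesOf(Map<String, IntSummaryStatistics> stats) {
        return stats.entrySet().stream()
                .collect(Collectors.toMap(Map.Entry::getKey,
                        e -> e.getValue().getAverage(),
                        (a, b) -> a,      // map keys are unique, so this is never called
                        TreeMap::new));
    }

    public static void main(String[] args) {
        Map<String, IntSummaryStatistics> stats = new TreeMap<>();
        stats.put("a", IntStream.of(1, 2, 2, 3, 3).summaryStatistics());
        stats.put("c", IntStream.of(2, 4).summaryStatistics());
        System.out.println(averagesOf(stats)); // prints {a=2.2, c=3.0}
    }
}
```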


Here's a full working example; the implementation can certainly be improved, though.

Upvotes: 1
