Daniel

Reputation: 971

Java 8, Stream of Integer, Grouping indexes of a stream by the Integers?

I have a stream of Integers, and I would like to group the indexes of the elements by each element's value.
For example, {1, 1, 1, 2, 3, 3, 4} is grouped into a mapping from each Integer to the list of its indexes:

1 -> 0, 1, 2
2 -> 3
3 -> 4, 5
4 -> 6

I have tried using a stream, but it requires an additional class:

@Test
public void testGrouping() throws Exception {
    // actually it is being read from a disk file
    Stream<Integer> nums = Stream.of(1, 1, 1, 2, 3, 3, 4);  
    // list to map by index
    int[] ind = {0};  // capture array, effectively final
    class Pair {
        int left;
        int right;

        public Pair(int left, int right) {
            this.left = left;
            this.right = right;
        }
    }

    Map<Integer, List<Integer>> map = nums.map(e -> new Pair(ind[0]++, e))
            .collect(Collectors.groupingBy(e -> e.right))
            .entrySet().parallelStream()
            .collect(Collectors.toConcurrentMap(
                    Map.Entry::getKey,
                    e -> e.getValue().parallelStream().map(ee -> ee.left).collect(Collectors.toList())
            ));
}

I have to work with a Stream because the Integers are read from a disk file in my application.
I feel that my approach above is pretty sub-optimal. Is there a better or more elegant way to do it?
Thanks for your help.

Upvotes: 10

Views: 10305

Answers (4)

user2336315

Reputation: 16067

Why not:

Stream<Integer> nums = Stream.of(1, 1, 1, 2, 3, 3, 4);  

// Infinite counter 0, 1, 2, ... that supplies the next index on demand
PrimitiveIterator.OfInt indexes = IntStream.iterate(0, x -> x + 1).iterator();
Map<Integer, List<Integer>> result = new HashMap<>();

nums.iterator().forEachRemaining(i -> result.merge(i, 
                                                   new ArrayList<>(Arrays.asList(indexes.next())), 
                                                   (l1, l2) -> {l1.addAll(l2); return l1;})
                                 );

Result:

{1=[0, 1, 2], 2=[3], 3=[4, 5], 4=[6]}

Upvotes: 2

Peter Lawrey

Reputation: 533680

What you can do is

Map<Integer, List<Integer>> map = nums.map(e -> new Pair(ind[0]++, e))
        .collect(groupingBy(p -> p.right, HashMap::new, 
                            mapping(p -> p.left, toList())));

This allows you to apply a mapping to the elements before they are added to the List.
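For reference, here is a self-contained sketch of the same idea. The class name GroupIndexesSketch and the holder Indexed are hypothetical names chosen for illustration (Indexed plays the role of Pair from the question), and a TreeMap is used only to make the printed order deterministic. It assumes the stream is processed sequentially, because the captured counter is not thread-safe.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GroupIndexesSketch {
    // Hypothetical holder pairing an element with its position in the stream.
    static final class Indexed {
        final int index;
        final int value;

        Indexed(int index, int value) {
            this.index = index;
            this.value = value;
        }
    }

    // Assumption: sequential, ordered processing. The counter array is plain
    // mutable state and would give wrong indexes in a parallel pipeline.
    static Map<Integer, List<Integer>> indexesByValue(Stream<Integer> nums) {
        int[] counter = {0};
        return nums.sequential()
                .map(v -> new Indexed(counter[0]++, v))
                // Group by value; the downstream collector maps each pair
                // to its index before adding it to the list.
                .collect(Collectors.groupingBy(p -> p.value, TreeMap::new,
                        Collectors.mapping(p -> p.index, Collectors.toList())));
    }

    public static void main(String[] args) {
        // prints {1=[0, 1, 2], 2=[3], 3=[4, 5], 4=[6]}
        System.out.println(indexesByValue(Stream.of(1, 1, 1, 2, 3, 3, 4)));
    }
}
```

The downstream mapping collector removes the need for the second pass over entrySet() in the question's code.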

Upvotes: 0

Konstantin Yovkov

Reputation: 62864

  1. You can use the IntStream#range(int startInclusive, int endExclusive) method to get the index of each element.
  2. Then use the IntStream#boxed() method to convert the IntStream to a Stream of boxed Integers.
  3. Group the indexes by mapping each index to the corresponding element of the array (i -> array[i]), collecting the indexes of equal elements into a list.

For example:

int[] array = {1, 1, 1, 2, 3, 3, 4};
Map<Integer, List<Integer>> result = 
        IntStream.range(0, array.length)
                 .boxed()
                 .collect(Collectors.groupingBy(i -> array[i], Collectors.toList()));

Update: If you don't have the array (and therefore the element count), but a Stream<Integer>, you can collect the elements of the initial Stream into a List<Integer>. This way you know the number of elements in the Stream, and then you can do:

Stream<Integer> stream = .... // The input stream goes here
// Collecting the input stream to a list, so that we get its size.
List<Integer> list = stream.collect(Collectors.toList());
//Grouping process
Map<Integer, List<Integer>> result = 
    IntStream.range(0, list.size())
             .boxed()
             .collect(Collectors.groupingBy(i -> list.get(i), Collectors.toList()));

Upvotes: 5

Holger

Reputation: 298409

With a little helper class for collecting:

class MapAndIndex {
    Map<Integer,List<Integer>> map=new HashMap<>();
    int index;

    void add(int value) {
        map.computeIfAbsent(value, x->new ArrayList<>()).add(index++);
    }
    void merge(MapAndIndex other) {
        other.map.forEach((value,list) -> {
            List<Integer> l=map.computeIfAbsent(value, x->new ArrayList<>());
            for(int i: list) l.add(i+index);
        } );
        index+=other.index;
    }
}

the entire operation becomes:

Map<Integer,List<Integer>> map = IntStream.of(1, 1, 1, 2, 3, 3, 4)
    .parallel()
    .collect(MapAndIndex::new, MapAndIndex::add, MapAndIndex::merge).map;

When you need to track the indices which are unknown beforehand, you need mutable state and hence the operation called “mutable reduction”.

Note that you don’t need a ConcurrentMap here. The Stream implementation will already handle the concurrency. It will create one MapAndIndex container for each involved thread and invoke the merge operation on two containers once both associated threads are done with their work. This is also done in a way that retains the order, if the Stream has an order, like in this example (otherwise your task of recording indices makes no sense…).
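Since the question starts from a Stream<Integer> rather than an IntStream, the same mutable reduction can be packaged as a Collector. The following is only a sketch under that assumption: the names IndexCollectorSketch and Acc are hypothetical, Acc mirrors MapAndIndex above, and a TreeMap is used so that the printed order is deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collector;
import java.util.stream.Stream;

class IndexCollectorSketch {
    // Mutable container: the map built so far plus the number of elements seen.
    static final class Acc {
        final Map<Integer, List<Integer>> map = new TreeMap<>();
        int count;

        void add(Integer value) {
            map.computeIfAbsent(value, k -> new ArrayList<>()).add(count++);
        }

        // 'other' holds the elements that come after this container's elements,
        // so its local indexes are shifted by this container's count.
        Acc merge(Acc other) {
            other.map.forEach((value, list) -> {
                List<Integer> target = map.computeIfAbsent(value, k -> new ArrayList<>());
                for (int i : list) {
                    target.add(i + count);
                }
            });
            count += other.count;
            return this;
        }
    }

    // No CONCURRENT or UNORDERED characteristic is declared, so an ordered
    // stream is combined left to right and the indexes stay correct.
    static Collector<Integer, ?, Map<Integer, List<Integer>>> indexesByValue() {
        return Collector.<Integer, Acc, Map<Integer, List<Integer>>>of(
                Acc::new, Acc::add, Acc::merge, acc -> acc.map);
    }

    public static void main(String[] args) {
        // prints {1=[0, 1, 2], 2=[3], 3=[4, 5], 4=[6]}
        System.out.println(Stream.of(1, 1, 1, 2, 3, 3, 4)
                .parallel()
                .collect(indexesByValue()));
    }
}
```

Unlike the counter captured in a lambda, this keeps all mutable state inside the container, so the result should be the same for sequential and parallel execution of an ordered stream.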

Upvotes: 5
