Jan X Marek

Reputation: 2514

Finite generated Stream in Java - how to create one?

In Java, one can easily generate an infinite stream with Stream.generate(supplier). However, I need to generate a stream that will eventually finish.

Imagine, for example, that I want a stream of all files in a directory. The number of files can be huge, so I cannot gather all the data upfront and create a stream from it (via collection.stream()). I need to generate the sequence piece by piece. But the stream will obviously finish at some point, and terminal operations like collect() or findAny() need to work on it, so Stream.generate(supplier) is not suitable here.

Is there any reasonably easy way to do this in Java, without implementing the entire Stream interface on my own?

I can think of a simple hack: using the infinite Stream.generate(supplier) and returning null or throwing an exception once all the actual values have been taken. But that would break the standard stream operations; I could use it only with my own operations that are aware of this behaviour.

CLARIFICATION

People in the comments are suggesting the takeWhile() operator. This is not what I meant. How to phrase the question better... I am not asking how to filter (or limit) an existing stream, I am asking how to create (generate) the stream - dynamically, without loading all the elements upfront, but with a finite size (unknown in advance).

SOLUTION

The code I was looking for is

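    // T is whatever element type the custom iterator produces
    Iterator<T> it = myCustomIteratorThatGeneratesTheSequence();
    // DISTINCT is what Files.list passes (directory entries are unique);
    // use 0 or Spliterator.ORDERED if your elements may repeat
    Stream<T> stream = StreamSupport.stream(Spliterators.spliteratorUnknownSize(it, Spliterator.DISTINCT), false);

I found it by looking at how the list(path) method in java.nio.file.Files is implemented.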
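A minimal sketch of that pattern, assuming a hypothetical hand-written iterator (CountdownIterator is an illustration, not something from the post) that produces its elements lazily and ends on its own. ORDERED and NONNULL are used here because they hold for this particular source:

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class FiniteIteratorStream {

    // Hypothetical lazy source: yields start, start-1, ..., 1 and then stops.
    static final class CountdownIterator implements Iterator<Integer> {
        private int next;

        CountdownIterator(int start) {
            this.next = start;
        }

        @Override
        public boolean hasNext() {
            return next > 0;
        }

        @Override
        public Integer next() {
            if (next <= 0) {
                throw new NoSuchElementException();
            }
            return next--;
        }
    }

    // Wraps the iterator without reading any element up front.
    static Stream<Integer> countdown(int start) {
        Iterator<Integer> it = new CountdownIterator(start);
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED | Spliterator.NONNULL),
                false); // false = sequential stream
    }

    public static void main(String[] args) {
        // Terminal operations finish because hasNext() eventually returns false.
        List<Integer> all = countdown(5).collect(Collectors.toList());
        System.out.println(all); // [5, 4, 3, 2, 1]
    }
}
```

The stream ends when hasNext() returns false, so collect(), count() and the other standard operations work unchanged.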

Upvotes: 55

Views: 13726

Answers (4)

Adrien H

Reputation: 803

While the author has discarded the takeWhile option, I find it adequate for certain use cases and worth an explanation.

The method takeWhile can be used on any stream and will terminate the stream when the predicate provided to the method returns false. The object which results in a false is not appended to the stream; only the objects which resulted in true are passed downstream.
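A small sketch of that behaviour, assuming Java 9 or later (where takeWhile was added). The class and method names are made up for the illustration:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TakeWhileDemo {

    // Keeps the leading elements that are below the bound and stops at the first one that is not.
    static List<Integer> prefixBelow(int bound, Stream<Integer> source) {
        return source.takeWhile(x -> x < bound).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // 10 fails the predicate: it is dropped, and the 4 after it is never passed downstream.
        System.out.println(prefixBelow(5, Stream.of(1, 2, 3, 10, 4))); // [1, 2, 3]

        // The same operator turns an infinite stream into a finite one.
        System.out.println(prefixBelow(5, Stream.iterate(1, x -> x + 1))); // [1, 2, 3, 4]
    }
}
```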

So one method for generating a finite stream could be to use the Stream.generate method and return a value which signals the end of the stream by being evaluated to false by the predicate provided to takeWhile.

Here's an example, generating all the permutations of an array:

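// Requires Java 10+ (var, takeWhile) and these imports:
// java.util.Objects, java.util.concurrent.atomic.AtomicInteger, java.util.stream.Stream
public static Stream<int[]> permutations(int[] original) {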
    int dim = original.length;

    var permutation = original.clone();
    int[] controller = new int[dim];
    var low = new AtomicInteger(0);
    var up = new AtomicInteger(1);

    var permutationsStream = Stream.generate(() -> {
        while (up.get() < dim) {
            if (controller[up.get()] < up.get()) {
                low.set(up.get() % 2 * controller[up.get()]);

                var tmp = permutation[low.get()];
                permutation[low.get()] = permutation[up.get()];
                permutation[up.get()] = tmp;

                controller[up.get()]++;
                up.set(1);

                return permutation.clone();
            } else {
                controller[up.get()] = 0;
                up.incrementAndGet();
            }
        }

        return null;
    }).takeWhile(Objects::nonNull);

    return Stream.concat(
            Stream.ofNullable(original.clone()),
            permutationsStream
    );
}

In this example, I used the null value to signal the end of the stream. The caller of the method never receives the null value, because takeWhile drops it.

OP could use a similar strategy, and combine it with a visitor pattern.

If it's a flat directory, OP would be better off using Stream.iterate with the seed being the index of the file to yield and Stream.limit on the number of files (which can be known without browsing the directory).

Upvotes: 2

tomk

Reputation: 31

Here is a stream which is custom and finite:

package org.tom.stream;
import java.util.*;
import java.util.function.*;
import java.util.stream.*;

public class GoldenStreams {
private static final String IDENTITY = "";

public static void main(String[] args) {
    Stream<String> stream = java.util.stream.StreamSupport.stream(new Spliterator<String>() {
        private static final int LIMIT = 25;
        private int integer = Integer.MAX_VALUE;
        {
            integer = 0;
        }
        @Override
        public int characteristics() {
            return Spliterator.DISTINCT;
        }
        @Override
        public long estimateSize() {
            return LIMIT-integer;
        }
        @Override
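        public boolean tryAdvance(Consumer<? super String> arg0) {
            // Contract: once the source is exhausted, return false WITHOUT calling the consumer.
            if (integer >= LIMIT) {
                return false;
            }
            arg0.accept(IDENTITY+integer++);
            return true;
        }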
        @Override
        public Spliterator<String> trySplit() {
            System.out.println("trySplit");
            return null;
        }}, false);
    List<String> peeks = new ArrayList<String>();
    List<String> reds = new ArrayList<String>();
    stream.peek(data->{
        peeks.add(data);
    }).filter(data-> {
        return Integer.parseInt(data)%2>0;
    }).peek(data ->{
        System.out.println("peekDeux:"+data);
    }).reduce(IDENTITY,(accumulation,input)->{
        reds.add(input);
        String concat = accumulation + ( accumulation.isEmpty() ? IDENTITY : ":") + input;
        System.out.println("reduce:"+concat);
        return concat;
    });
    System.out.println("Peeks:"+peeks.toString());
    System.out.println("Reduction:"+reds.toString());
}
}

Upvotes: 0

Alex

Reputation: 37

Kotlin code to create a Stream of JsonNode from an InputStream:


    private fun InputStream.toJsonNodeStream(): Stream<JsonNode> {
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(this.toJsonNodeIterator(), Spliterator.ORDERED),
                false
        )
    }

    private fun InputStream.toJsonNodeIterator(): Iterator<JsonNode> {
        val jsonParser = objectMapper.factory.createParser(this)

        return object: Iterator<JsonNode> {

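            // Note: this advances the parser, so it is not idempotent. It must be
            // called exactly once before each next(), which is how the stream uses it.
            override fun hasNext(): Boolean {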
                var token = jsonParser.nextToken()
                while (token != null) {
                    if (token == JsonToken.START_OBJECT) {
                        return true
                    }
                    token = jsonParser.nextToken()
                }
                return false
            }

            override fun next(): JsonNode {
                return jsonParser.readValueAsTree()
            }
        }
    }

Upvotes: 0

the8472

Reputation: 43150

> Is there any reasonable easy way to do this in Java, without implementing the entire Stream interface on my own?

A simple .limit() guarantees that it will terminate. But that's not always powerful enough.

After the Stream factory methods, the simplest approach for creating custom stream sources without reimplementing the stream processing pipeline is to subclass java.util.Spliterators.AbstractSpliterator<T> and pass it to java.util.stream.StreamSupport.stream(Supplier<? extends Spliterator<T>>, int, boolean).

If you intend to use parallel streams, note that AbstractSpliterator only yields suboptimal splitting. If you have more control over your source, fully implementing the Spliterator interface can be better.

For example, the following snippet would create a Stream providing an infinite sequence 1,2,3...
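A minimal sketch of such a source, under a few assumptions: the names CountingSource and Counter are made up here, and the simpler StreamSupport.stream(Spliterator, boolean) overload is used instead of the Supplier-based one. With max set to Long.MAX_VALUE the sequence is effectively infinite; with a smaller max the same class gives a finite stream:

```java
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators.AbstractSpliterator;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class CountingSource {

    // Emits 1, 2, 3, ... up to and including max.
    static final class Counter extends AbstractSpliterator<Long> {
        private final long max;
        private long current = 0;

        Counter(long max) {
            // Long.MAX_VALUE as the size estimate means "unknown".
            super(Long.MAX_VALUE, Spliterator.ORDERED | Spliterator.NONNULL);
            this.max = max;
        }

        @Override
        public boolean tryAdvance(Consumer<? super Long> action) {
            if (current >= max) {
                return false; // returning false is what ends the stream
            }
            action.accept(++current);
            return true;
        }
    }

    static Stream<Long> counting(long max) {
        return StreamSupport.stream(new Counter(max), false);
    }

    public static void main(String[] args) {
        // Effectively infinite source, cut off by limit().
        System.out.println(counting(Long.MAX_VALUE).limit(3).collect(Collectors.toList())); // [1, 2, 3]
        // Finite source: the stream ends by itself.
        System.out.println(counting(4).collect(Collectors.toList())); // [1, 2, 3, 4]
    }
}
```

Only tryAdvance has to be written; AbstractSpliterator supplies a basic trySplit, and the stream pipeline itself comes from StreamSupport.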

In that particular example you could simply use IntStream.range() (bounded) or IntStream.iterate() (unbounded) instead.

> But the stream will obviously finish at some point, and terminal operators like (collect() or findAny()) need to work on it.

Short-circuiting operations like findAny() can actually finish on an infinite stream, as long as at least one element reaches them (for example, one that passes a preceding filter).
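A small sketch of that point, with hypothetical names. The source is an infinite Stream.iterate, yet findAny() returns as soon as one element gets through the filter:

```java
import java.util.Optional;
import java.util.stream.Stream;

class ShortCircuitDemo {

    // Infinite source 1, 2, 3, ...; terminates because findAny() short-circuits.
    static Optional<Integer> anyMultipleOf(int n) {
        return Stream.iterate(1, i -> i + 1)
                .filter(i -> i % n == 0)
                .findAny();
    }

    public static void main(String[] args) {
        System.out.println(anyMultipleOf(7).isPresent()); // true
    }
}
```

If no element ever matched, the same pipeline would never return, which is why this does not replace a properly finite source.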

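Java 9 introduces a three-argument overload, Stream.iterate(seed, hasNext, next), which generates finite streams for some simple cases (the two-argument Stream.iterate from Java 8 is always infinite).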
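A minimal sketch, assuming Java 9 or later, of Stream.iterate(seed, hasNext, next): the stream ends as soon as hasNext rejects a value, much like a for loop. The class and method names are made up for the illustration:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class IterateDemo {

    // Equivalent to: for (int x = 1; x < limit; x *= 2) { emit x }
    static List<Integer> powersOfTwoBelow(int limit) {
        return Stream.iterate(1, x -> x < limit, x -> x * 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(powersOfTwoBelow(100)); // [1, 2, 4, 8, 16, 32, 64]
    }
}
```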

Upvotes: 27
