Reputation: 401
I am experimenting with Java's Streams and trying to figure out what is possible as well as their strengths and weaknesses. Currently I am trying to implement the Sieve of Eratosthenes using a stream, but cannot seem to find a good way to loop through previously filtered values without storing them in a separate collection.
I want to accomplish something like this:
IntStream myStream = IntStream.range(0,4);
myStream.filter(s -> {
    System.out.print("[filtering "+s+"] ");
    myStream.forEach(q -> System.out.print(q+", "));
    System.out.println();
    return true; //eventually respond to values observed on the line above
});
With a desired output of:
[filtering 0]
[filtering 1] 0,
[filtering 2] 0, 1,
[filtering 3] 0, 1, 2,
Note that while filtering each new value all previously filtered values are observed. This would allow an easy implementation of the Sieve of Eratosthenes because I could filter out all non-prime values and for each new value check for divisibility against all numbers that have previously passed the prime filter.
However, the above example gives me an error in NetBeans:
local variables referenced from a lambda expression must be final or effectively final
This appears to be because I am referencing myStream within a filter that is already acting on myStream. Is there a good way to work around this error (i.e., making a final copy of the stream containing only the values that have been filtered so far), or is there a better approach to this sort of problem that does not use a separate collection to store values?
Upvotes: 7
Views: 2561
Reputation: 26828
It's debatable if a stream is the right tool here, but .filter() definitely isn't. Filters are supposed to be stateless, so the idea shouldn't come up in the first place. Based on the example in your answer, a collector might be a feasible solution.
List<Integer> primes = IntStream.range(2, UPPER_BOUND)
        .collect(ArrayList::new,
                (list, number) -> {
                    for (int j = 0; j < list.size(); j++) {
                        int prime = list.get(j);
                        if (prime > Math.sqrt(number)) {
                            break;
                        }
                        if (number % prime == 0) {
                            return;
                        }
                    }
                    list.add(number);
                },
                List::addAll);
ArrayList::new creates a new list which is then referenced by the consumer as list. The consumer is called for every element in the stream, with number being the element. List::addAll would only be relevant for parallel streams, which can't be used for this algorithm anyway.
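To make the three arguments of IntStream.collect easier to tell apart, here is a minimal sketch that names each one explicitly. It deliberately uses a trivial task (keeping even numbers) rather than primes; the class name CollectRoles and the method evensBelow are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.ObjIntConsumer;
import java.util.function.Supplier;
import java.util.stream.IntStream;

// Hypothetical helper class, only meant to label the three collect arguments.
class CollectRoles {
    static List<Integer> evensBelow(int upperBound) {
        // Supplier: creates the result container.
        Supplier<List<Integer>> supplier = ArrayList::new;
        // Accumulator: called once per stream element with the container and the element.
        ObjIntConsumer<List<Integer>> accumulator = (list, number) -> {
            if (number % 2 == 0) {
                list.add(number);
            }
        };
        // Combiner: merges partial containers; only exercised by parallel streams.
        BiConsumer<List<Integer>, List<Integer>> combiner = List::addAll;
        return IntStream.range(0, upperBound).collect(supplier, accumulator, combiner);
    }

    public static void main(String[] args) {
        System.out.println(evensBelow(7));
    }
}
```

On a sequential stream like this one the combiner is never needed, which matches the remark about List::addAll above.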
Upvotes: 1
Reputation: 5756
I managed to create an infinite Stream of prime numbers using the Sieve of Eratosthenes, but it actually does not use past values. Instead, it removes the multiples of a prime in the tail (in a lazy way, because the tail is infinite), like the original Sieve of Eratosthenes algorithm. For that, I used an Iterator as auxiliary (because the Stream can only be used once) and implemented a lazyConcat for streams.
import java.util.PrimitiveIterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.IntConsumer;
import java.util.function.Supplier;
import java.util.stream.IntStream;
import java.util.stream.StreamSupport;

class StreamUtils {
    // Wraps an iterator in a sequential, ordered IntStream.
    public static IntStream fromIterator(PrimitiveIterator.OfInt it) {
        return StreamSupport.intStream(
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
    }

    // Concatenates two streams, asking for the second only once the first is exhausted.
    public static IntStream lazyConcat(Supplier<IntStream> a, Supplier<IntStream> b) {
        return StreamSupport.intStream(new Spliterator.OfInt() {
            boolean beforeSplit = true;
            Spliterator.OfInt spliterator;

            @Override
            public OfInt trySplit() {
                return null;
            }

            @Override
            public long estimateSize() {
                return Long.MAX_VALUE;
            }

            @Override
            public int characteristics() {
                return Spliterator.ORDERED;
            }

            @Override
            public boolean tryAdvance(IntConsumer action) {
                boolean hasNext;
                if (spliterator == null) {
                    spliterator = a.get().spliterator();
                }
                hasNext = spliterator.tryAdvance(action);
                if (!hasNext && beforeSplit) {
                    // First stream is exhausted: switch to the second one.
                    beforeSplit = false;
                    spliterator = b.get().spliterator();
                    hasNext = spliterator.tryAdvance(action);
                }
                return hasNext;
            }
        }, false);
    }
}
My Sieve of Eratosthenes stream looks like this:
class Primes {
    public static IntStream stream() {
        return sieve(IntStream.iterate(2, n -> n + 1));
    }

    private static IntStream sieve(IntStream s) {
        PrimitiveIterator.OfInt it = s.iterator();
        int head = it.nextInt();
        IntStream tail = StreamUtils.fromIterator(it);
        return StreamUtils.lazyConcat(
                () -> IntStream.of(head),
                () -> sieve(tail.filter(n -> n % head != 0)));
    }
}
Then we can use it this way:
System.out.println(Primes.stream().limit(20).boxed().collect(Collectors.toList()));
Output:
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]
I think it was a good exercise, but it seems it is quite inefficient and not stack-friendly at all.
Upvotes: 3
Reputation: 401
Other answers have suggested that the approach I had been trying is not possible, and that a separate collection must be used.
To give a more complete answer, here is a working approach to this problem using streams, compared against a more traditional approach.
Listing primes using streams (trial division against the primes found so far, in the spirit of the Sieve of Eratosthenes):
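List<Integer> primes = new ArrayList<Integer>();
// range(2, UPPER_BOUND) covers the same values as the loop "for (int i = 2; i < UPPER_BOUND; i++)"
IntStream.range(2, UPPER_BOUND)
        .filter(i -> {
            // Stateful filter: relies on the stream being sequential and ordered.
            for (int j = 0; j < primes.size(); j++) {
                int prime = primes.get(j);
                if (prime > Math.sqrt(i)) {
                    break;
                }
                if (i % prime == 0) {
                    return false;
                }
            }
            return true;
        })
        .forEach(primes::add);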
Traditional, equivalent approach without using streams:
List<Integer> primes = new ArrayList<Integer>();
for (int i = 2; i < UPPER_BOUND; i++) {
    boolean isPrime = true;
    for (int j = 0; j < primes.size(); j++) {
        int prime = primes.get(j);
        if (prime > Math.sqrt(i)) {
            break;
        }
        if (i % prime == 0) {
            isPrime = false;
            break;
        }
    }
    if (isPrime) {
        primes.add(i);
    }
}
Performance Comparison:
Some experimentation with each function consistently showed that the traditional approach is actually faster than using streams in this case. The streams approach took about 1.5 times as long to find all prime numbers under one million as the traditional approach (averages of 106ms and 70ms respectively on my machine).
This difference in performance could probably be made up if the stream's .parallel() method allowed easy parallelization of the problem. However, parallelization is not easy in this case: ArrayList is not thread-safe, and the filter depends on every smaller prime already being in the list, so a parallel run will quickly result in errors and/or inaccurate results.
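One way around the shared list, sketched here under the assumption that each candidate can be tested on its own, is to make the filter stateless so that .parallel() becomes safe. Each number is checked by plain trial division instead of against the primes found so far, so this trades the shared state for repeated work; I have not benchmarked it. The names ParallelPrimes, isPrime and primesBelow are made up for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical example class: a stateless prime test that is safe to run in parallel.
class ParallelPrimes {
    // Trial division by every d with d * d <= n; touches no shared state.
    static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    static List<Integer> primesBelow(int upperBound) {
        return IntStream.range(2, upperBound)
                .parallel()
                .filter(ParallelPrimes::isPrime)
                .boxed()
                // collect keeps encounter order for an ordered source, even in parallel
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(primesBelow(30));
    }
}
```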
Conclusion:
Assuming the other answers are correct, filtering already-filtered data within a filter on that same stream is not possible in Java.
Listing primes can be tackled using streams. However, pending a better solution than my own, it is currently better to stick with a traditional stream-less approach.
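For comparison, here is a minimal sketch of a classic bounded sieve that crosses off multiples and uses a stream only to read out the survivors. It does rely on separate storage (a BitSet), so it does not meet the original goal of avoiding a collection; the names BitSetSieve and primesBelow are made up for illustration.

```java
import java.util.BitSet;
import java.util.stream.IntStream;

// Hypothetical example class: bounded Sieve of Eratosthenes backed by a BitSet.
class BitSetSieve {
    static int[] primesBelow(int upperBound) {
        if (upperBound <= 2) {
            return new int[0];
        }
        BitSet composite = new BitSet(upperBound);
        for (int p = 2; (long) p * p < upperBound; p++) {
            if (!composite.get(p)) {
                // Cross off multiples of p, starting at p * p (long avoids overflow).
                for (long m = (long) p * p; m < upperBound; m += p) {
                    composite.set((int) m);
                }
            }
        }
        // The stream is only used to collect the numbers that were never crossed off.
        return IntStream.range(2, upperBound)
                .filter(n -> !composite.get(n))
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(primesBelow(30)));
    }
}
```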
Upvotes: 0
Reputation: 393831
You can't process a Stream more than once, therefore calling myStream.forEach inside the filter method is not possible.
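A minimal sketch of what happens when a stream is consumed twice: the second terminal operation throws an IllegalStateException. The class name StreamReuseDemo and the method reuse are made up for illustration.

```java
import java.util.stream.IntStream;

// Hypothetical example class: shows that a stream cannot be traversed twice.
class StreamReuseDemo {
    static String reuse() {
        IntStream stream = IntStream.range(0, 3);
        stream.forEach(i -> { });       // first terminal operation consumes the stream
        try {
            stream.forEach(i -> { });   // second traversal is rejected
            return "no exception";
        } catch (IllegalStateException e) {
            return "IllegalStateException";
        }
    }

    public static void main(String[] args) {
        System.out.println(reuse());
    }
}
```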
You could create a new IntStream inside the filter. Note that you will have to add some terminal operation to the outer Stream pipeline in order for it to be processed:
IntStream myStream = IntStream.range(0,4);
myStream.filter(s -> {
    System.out.print("[filtering "+s+"] ");
    IntStream.range(0,s).forEach(q -> System.out.print(q+", "));
    System.out.println();
    return true; //eventually respond to values observed on the line above
}).forEach(i->{});
This produces:
[filtering 0]
[filtering 1] 0,
[filtering 2] 0, 1,
[filtering 3] 0, 1, 2,
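The same idea of building a fresh inner IntStream per element can be pushed one step further to actually filter primes. The following is only a sketch: the inner stream tests every candidate divisor up to the square root rather than only previously accepted values, and the names NestedStreamPrimes and primesBelow are made up for illustration.

```java
import java.util.stream.IntStream;

// Hypothetical example class: prime filter using a new inner stream per element.
class NestedStreamPrimes {
    static int[] primesBelow(int upperBound) {
        return IntStream.range(2, upperBound)
                // For each n, a new inner stream checks divisors 2..floor(sqrt(n)).
                .filter(n -> IntStream.rangeClosed(2, (int) Math.sqrt(n))
                        .noneMatch(d -> n % d == 0))
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(primesBelow(20)));
    }
}
```

Because the filter no longer reads any shared state, it stays stateless, at the cost of testing non-prime divisors as well.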
Upvotes: 2