Reputation: 1011
Recently at work I had to process a bunch of XML files sequentially. I wrote a sort of tree walk, so each XML file became an Iterator<SomeXmlElement>.
Later on, the program didn't care which SomeXmlElement object came from which file, so I wanted to concatenate all the Iterators into one.
This is roughly how the concatenation was done (using String instead of SomeXmlElement):
Stream<String> s = Stream.empty();
for (int i = 0; i < 100000; i++) {
s = Stream.concat(s, Arrays.asList("1", "2", "3").stream());
}
s.findFirst().ifPresent(System.out::println);
It turns out that this never prints anything. It just hangs for a while, and eventually you get a heap error or a StackOverflowError. So I tried again, this time using Guava:
Iterable<String> s = Collections.emptyList();
for (int i = 0; i < 100000; i++) {
s = Iterables.concat(s, Arrays.asList("1", "2", "3"));
}
System.out.println(Iterables.getFirst(s, null));
Somewhat surprisingly, this also throws a StackOverflowError. In the end I had to do the concatenation manually by implementing Iterator, and that finally worked as expected.
Why do the concatenate methods of these standard libraries fail when there's enough data? Streams and Iterables are designed to handle even infinite input, after all. Is there an easy alternative, apart from the "hard way" of implementing Iterator?
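For reference, the hand-written approach mentioned above can be sketched roughly as follows. This is only a minimal illustration, not my actual code: the names ConcatIteratorDemo, ConcatIterator and repeated are made up for the example, and String again stands in for SomeXmlElement. The point is that it keeps one flat iterator of sources and advances with a loop, so the depth does not grow with the number of inputs.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ConcatIteratorDemo {

    /** Flat concatenation: a single outer iterator of sources, no nesting. */
    static final class ConcatIterator<T> implements Iterator<T> {
        private final Iterator<? extends Iterator<? extends T>> sources;
        private Iterator<? extends T> current = Collections.emptyIterator();

        ConcatIterator(Iterator<? extends Iterator<? extends T>> sources) {
            this.sources = sources;
        }

        @Override
        public boolean hasNext() {
            // Skip exhausted or empty sources with a loop rather than recursion.
            while (!current.hasNext() && sources.hasNext()) {
                current = sources.next();
            }
            return current.hasNext();
        }

        @Override
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return current.next();
        }
    }

    /** Hypothetical helper: concatenates n iterators over the same small list. */
    static Iterator<String> repeated(int n) {
        List<String> chunk = Arrays.asList("1", "2", "3");
        // The inner iterators are created lazily, one per source, as they are needed.
        Iterator<Iterator<String>> sources =
                Collections.nCopies(n, chunk).stream().map(List::iterator).iterator();
        return new ConcatIterator<>(sources);
    }

    public static void main(String[] args) {
        Iterator<String> it = repeated(100_000);
        if (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```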
Upvotes: 3
Views: 1679
Reputation: 4507
Maybe you can extract a method as below:
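@SafeVarargs
private static <T> Stream<T> flatten(final Collection<T>... collections) {
    // Fall back to an empty stream when no collections are passed, instead of throwing from Optional.get().
    return Stream.of(collections).map(Collection::stream).reduce(Stream::concat).orElseGet(Stream::empty);
}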
Returning the concatenated stream is a good idea if you need further pipelining. Otherwise, you could map and collect the result directly.
Upvotes: 0
Reputation: 28183
To concatenate a large number of streams, use flatMap. In your example, you would use it like this:
Stream<String> s = IntStream.range(0, 100000).boxed()
.flatMap(i -> Stream.of("1", "2", "3"));
For your actual problem, let's say that you have a method with the signature Stream<SomeXmlElement> parseFile(Path p) and a Stream<Path> files that comes from walking the tree. Then you can obtain a Stream<SomeXmlElement>:
Stream<SomeXmlElement> elements = files.flatMap(p -> parseFile(p));
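If your parsing code returns an Iterator rather than a Stream, as in the question, one option is to wrap each Iterator in a Stream before flat-mapping. Below is a minimal sketch under that assumption. The class name FlatMapIterators, the helpers toStream and allElements, and the int-based parseFile are all hypothetical stand-ins for illustration, and String takes the place of SomeXmlElement.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class FlatMapIterators {

    /** Wraps an Iterator in a lazy, sequential Stream. */
    static <T> Stream<T> toStream(Iterator<T> it) {
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
    }

    /** Hypothetical stand-in for parsing one XML file into an Iterator. */
    static Iterator<String> parseFile(int fileIndex) {
        return Arrays.asList("1", "2", "3").iterator();
    }

    /** One flat stream over all files; no nested concat chain is built. */
    static Stream<String> allElements(int fileCount) {
        return IntStream.range(0, fileCount)
                .mapToObj(i -> parseFile(i))
                .flatMap(it -> toStream(it));
    }

    public static void main(String[] args) {
        allElements(100_000).findFirst().ifPresent(System.out::println);
        System.out.println(allElements(100_000).count());
    }
}
```

Because flatMap handles one inner stream at a time, the pipeline depth stays constant regardless of how many files there are, which is what avoids the deep call chain that repeated Stream.concat builds up.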
Upvotes: 1