Reputation: 105
I have a discussion with colleague that we should not be using setters inside stream.map()
like the solution suggested here - https://stackoverflow.com/a/35377863/1552771
There is a comment to this answer that discourages using map
this way, but there hasn’t been a reason given as to why this is a bad idea. Can someone provide a possible scenario why this can break?
I have seen some discussions where people talk about concurrent modification of the collection itself, by adding or removing items from it, but are there any negatives to using map
to just set some values to a data object?
Upvotes: 7
Views: 6126
Reputation: 298183
Using side effects in map
like invoking a setter, has a lot of similarities to using peek
for non-debugging purposes, which have been discussed in In Java streams is peek really only for debugging?
This answer has a very good general advice:
Don't use the API in an unintended way, even if it accomplishes your immediate goal. That approach may break in the future, and it is also unclear to future maintainers.
Whereas the other answer names associated practical problems; I have to cite myself:
The important thing you have to understand, is that streams are driven by the terminal operation. The terminal operation determines whether all elements have to be processed or any at all.
When you place an operation with a side effect into a map
function, you have a specific expectation about on which elements it will be performed and perhaps even how it will be performed, e.g. in which order. Whether the expectation will be fulfilled, depends on other subsequent Stream operations and perhaps even on subtle implementation details.
To show some examples:
IntStream.range(0, 10) // outcome changes with Java 9
.mapToObj(i -> System.out.append("side effect on "+i+"\n"))
.count();
IntStream.range(0, 2) // outcome changes with Java 10 (or 8u222)
.flatMap(i -> IntStream.range(i * 5, (i+1) * 5 ))
.map(i -> { System.out.println("side effect on "+i); return i; })
.anyMatch(i -> i > 3);
IntStream.range(0, 10) // outcome may change with every run
.parallel()
.map(i -> { System.out.println("side effect on "+i); return i; })
.anyMatch(i -> i > 6);
Further, as already mentioned in the linked answer, even if you have a terminal operation that processes all elements and is ordered, there is no guaranty about the processing order (or concurrency for parallel streams) of intermediate operations.
The code may happen to do the desired thing when you have a stream with no duplicates and a terminal operation processing all elements and a map
function which is calling only a trivial setter, but the code has so many dependencies on subtle surrounding conditions that it will become a maintenance nightmare. Which brings us back to the first quote about using an API in an unintended way.
Upvotes: 6
Reputation: 1675
Streams are not just some new set of APIs which makes things easier for you. It also brings functional programming paradigm with it.
And, functional programming paradigm's most important aspect is to use pure functions for computations. A pure function is one where the output depends only and only on its input. So, basically Streams API should use stateless, side-effect-free and pure functions.
Quoting things from Joshua Bloch's Effective Java (3rd Edition)
If you’re new to streams, it can be difficult to get the hang of them. Merely expressing your computation as a stream pipeline can be hard. When you succeed, your program will run, but you may realize little if any benefit. Streams isn’t just an API, it’s a paradigm based on functional programming. In order to obtain the expressiveness, speed, and in some cases parallelizability that streams have to offer, you have to adopt the paradigm as well as the API. The most important part of the streams paradigm is to structure your compu- tation as a sequence of transformations where the result of each stage is as close as possible to a pure function of the result of the previous stage. A pure function is one whose result depends only on its input: it does not depend on any mutable state, nor does it update any state. In order to achieve this, any function objects that you pass into stream operations, both intermediate and terminal, should be free of side-effects. Occasionally, you may see streams code that looks like this snippet, which builds a frequency table of the words in a text file:
// Uses the streams API but not the paradigm--Don't do this!
Map<String, Long> freq = new HashMap<>();
try (Stream<String> words = new Scanner(file).tokens()) {
words.forEach(word -> { freq.merge(word.toLowerCase(), 1L, Long::sum);
});
}
What’s wrong with this code? After all, it uses streams, lambdas, and method references, and gets the right answer. Simply put, it’s not streams code at all; it’s iterative code masquerading as streams code. It derives no benefits from the streams API, and it’s (a bit) longer, harder to read, and less maintainable than the corresponding iterative code. The problem stems from the fact that this code is doing all its work in a terminal forEach operation, using a lambda that mutates external state (the frequency table). A forEach operation that does anything more than present the result of the computation performed by a stream is a “bad smell in code,” as is a lambda that mutates state. So how should this code look?
// Proper use of streams to initialize a frequency table
Map<String, Long> freq;
try (Stream<String> words = new Scanner(file).tokens()) {
freq = words
.collect(groupingBy(String::toLowerCase, counting()));
}
Upvotes: 3
Reputation: 7279
Just to name a few:
map()
with setter is interfering (it modifies the initial data), while specs require a non-interfering function. For more details read this post.map()
with setter is stateful (your logic may depend on initial value of field you're updating), while specs require a stateless functionmap
may mislead the future code maintainersUpvotes: 2
Reputation: 40034
I think the real issue here is that it is just bad practice and violates the intended use of the capability. For example, one can also accomplish the same thing with filter
. This perverts its use and also makes the code confusing or at best, unnecessarily verbose.
public static void main(String[] args) {
List<MyNumb> foo =
IntStream.range(1, 11).mapToObj(MyNumb::new).collect(
Collectors.toList());
System.out.println(foo);
foo = foo.stream().filter(i ->
{
i.value *= 10;
return true;
}).collect(Collectors.toList());
System.out.println(foo);
}
class MyNumb {
int value;
public MyNumb(int v) {
value = v;
}
public String toString() {
return Integer.toString(value);
}
}
So going back to the original example. One does not need to use map at all, resulting in the following rather ugly mess.
foos = foos.stream()
.filter(foo -> { boolean b = foo.isBlue();
if (b) {
foo.setTitle("Some value");
}
return b;})
.collect(Collectors.toList());
Upvotes: 2