Reputation: 2699
I'm reading up on Java streams and discovering new things as I go along. One of the new things I found was the peek() function. Almost everything I've read on peek() says it should be used to debug your streams.
What if I had a Stream<Account>, where each Account has a username and password field and a login() and loggedIn() method?
I also have
Consumer<Account> login = account -> account.login();
and
Predicate<Account> loggedIn = account -> account.loggedIn();
Why would this be so bad?
List<Account> accounts; //assume it's been setup
List<Account> loggedInAccount =
accounts.stream()
.peek(login)
.filter(loggedIn)
.collect(Collectors.toList());
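Now, as far as I can tell, this does exactly what it's intended to do: it tries to log in to each account, filters out any account that isn't logged in, and collects the logged-in accounts into a new list.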
What is the downside of doing something like this? Any reason I shouldn't proceed? Lastly, if not this solution then what?
The original version of this used the .filter() method as follows:
.filter(account -> {
account.login();
return account.loggedIn();
})
Upvotes: 232
Views: 117655
Reputation: 10816
I would say that peek() provides the ability to decentralize code that can mutate stream objects or modify global state (based on them), instead of stuffing everything into a simple or composed function passed to a terminal method.
Now the question might be: should we mutate stream objects or change global state from within functions in functional-style Java programming?
If the answer to either part of that question is yes (or: in some cases yes), then peek() is definitely not only for debugging purposes, for the same reason that forEach() isn't only for debugging purposes.
For me, choosing between forEach() and peek() comes down to this: do I want the code that mutates stream objects (or changes global state) to be attached to a composable, or attached directly to the stream?
I think peek() will pair better with Java 9 methods. For example, takeWhile() may need to decide when to stop iteration based on an already mutated object, so pairing it with forEach() would not have the same effect.
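A minimal sketch of that pairing, assuming a hypothetical mutable Job class invented for this illustration (it is not from the question). peek() runs each job, and takeWhile() reads the state that peek() has just set. It relies on a sequential, ordered stream and needs Java 9 or later.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PeekTakeWhileDemo {

    // Hypothetical mutable element type, invented for this illustration.
    static final class Job {
        final String name;
        final boolean willSucceed;
        boolean ran;
        boolean succeeded;

        Job(String name, boolean willSucceed) {
            this.name = name;
            this.willSucceed = willSucceed;
        }

        void run() {
            ran = true;
            succeeded = willSucceed;
        }

        boolean succeeded() {
            return succeeded;
        }
    }

    // Runs jobs in encounter order and keeps the names of those that
    // succeeded before the first failure. Requires Java 9+ for takeWhile().
    static List<String> runUntilFirstFailure(List<Job> jobs) {
        return jobs.stream()
                .peek(Job::run)             // side effect: mutates the job
                .takeWhile(Job::succeeded)  // reads the state peek() just set
                .map(job -> job.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Job> jobs = Arrays.asList(
                new Job("a", true), new Job("b", true),
                new Job("c", false), new Job("d", true));
        System.out.println(runUntilFirstFailure(jobs));
    }
}
```

A terminal forEach() could not feed takeWhile() this way, because nothing runs after a terminal operation.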
P.S. I have not mentioned map() anywhere because, when we want to mutate objects (or global state) rather than generate new objects, it works exactly like peek().
Upvotes: 8
Reputation: 1118
To get rid of warnings, I use a functor tee, named after Unix's tee:
// Wraps a Consumer in a Function that runs it and returns its argument unchanged.
public static <T> Function<T,T> tee(Consumer<T> after) {
    return arg -> {
        after.accept(arg);
        return arg;
    };
}
You can replace:
.peek(f)
with
.map(tee(f))
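A self-contained sketch of how such a tee helper might be used. The class name TeeDemo and the method upperCaseAndRecord are made up for the example, and the parameter is widened to Consumer<? super T>, which is an assumption rather than the signature given above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.stream.Collectors;

class TeeDemo {

    // Runs the side effect, then passes the element through unchanged.
    static <T> Function<T, T> tee(Consumer<? super T> action) {
        return arg -> {
            action.accept(arg);
            return arg;
        };
    }

    // Records every input element in 'seen' and returns the upper-cased values.
    static List<String> upperCaseAndRecord(List<String> input, List<String> seen) {
        Function<String, String> remember = tee(seen::add);
        return input.stream()
                .map(remember)              // stands in for .peek(seen::add)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        System.out.println(upperCaseAndRecord(Arrays.asList("a", "b"), seen));
        System.out.println(seen);
    }
}
```

Like peek(), the wrapped action only runs for elements that the terminal operation actually pulls through the pipeline.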
Upvotes: 2
Reputation: 1529
It seems like a helper class is needed:
public static class OneBranchOnly<T> {
    public Function<T, T> apply(Predicate<? super T> test,
                                Consumer<? super T> t) {
        return o -> {
            if (test.test(o)) t.accept(o);
            return o;
        };
    }
}
Then swap peek for map:
.map(new OneBranchOnly<Account>().apply(
        account -> account.isTestAccount(),
        account -> account.setName("Test Account")))
Result: a collection of accounts in which only the test accounts have been renamed (no reference is retained).
Upvotes: 0
Reputation: 23516
Despite the documentation note for .peek saying the "method exists mainly to support debugging", I think it has general relevance. For one thing, the documentation says "mainly", which leaves room for other use cases. It has gone years without being deprecated, and speculation about its removal is futile IMO.
I would say that in a world where we still have to handle side-effectful methods, it has a valid place and utility. There are many valid operations in streams that use side effects. Many have been mentioned in other answers; I'll just add setting a flag on a collection of objects, or registering them with a registry, where the objects are then processed further in the stream. Not to mention creating log messages during stream processing.
I support the idea of having separate actions in separate stream operations, so I avoid pushing everything into a final .forEach. I favor .peek over an equivalent .map with a lambda whose only purpose, besides calling the side-effect method, is to return the passed-in argument. As soon as I encounter .peek, it tells me that what goes in also comes out, and I don't need to read a lambda to find out. In that sense it is succinct, expressive, and improves the readability of the code.
Having said that, I agree with all the considerations that apply when using .peek, e.g. being aware of the effect of the terminal operation of the stream it is used in.
Upvotes: 8
Reputation: 942
A lot of answers made good points, and especially the (accepted) answer by Makoto describes the possible problems in quite some detail. But no one actually showed how it can go wrong:
[1]-> IntStream.range(1, 10).peek(System.out::println).count();
| $6 ==> 9
No output.
[2]-> IntStream.range(1, 10).filter(i -> i%2==0).peek(System.out::println).count();
| $9 ==> 4
Outputs numbers 2, 4, 6, 8.
[3]-> IntStream.range(1, 10).filter(i -> i > 0).peek(System.out::println).count();
| $12 ==> 9
Outputs numbers 1 to 9.
[4]-> IntStream.range(1, 10).map(i -> i * 2).peek(System.out::println).count();
| $16 ==> 9
No output.
[5]-> Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9).peek(System.out::println).count();
| $23 ==> 9
No output.
[6]-> Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9).stream().peek(System.out::println).count();
| $25 ==> 9
No output.
[7]-> IntStream.range(1, 10).filter(i -> true).peek(System.out::println).count();
| $30 ==> 9
Outputs numbers 1 to 9.
[1]-> List<Integer> list = new ArrayList<>();
| list ==> []
[2]-> Stream.of(1, 5, 2, 7, 3, 9, 8, 4, 6).sorted().peek(list::add).count();
| $7 ==> 9
[3]-> list
| list ==> []
(You get the idea.)
The examples were run in jshell (Java 15.0.2) and mimic the use case of converting data (replace System.out::println by list::add, for example, as some answers also do) and returning how much data was added. The current observation is that any operation that could filter elements (such as filter or skip) seems to force handling of all remaining elements, but it need not stay that way.
Upvotes: 12
Reputation: 298153
The important thing you have to understand is that streams are driven by the terminal operation. The terminal operation determines whether all elements have to be processed, or any at all. So collect is an operation that processes each item, whereas findAny may stop processing items once it has encountered a matching element.
And count() may not process any elements at all when it can determine the size of the stream without processing the items. Since this is an optimization not made in Java 8 but made in Java 9, there might be surprises when you switch to Java 9 and have code relying on count() processing all items. This is also connected to other implementation-dependent details: e.g. even in Java 9, the reference implementation will not be able to predict the size of an infinite stream source combined with limit, although no fundamental limitation prevents such a prediction.
Since peek allows “performing the provided action on each element as elements are consumed from the resulting stream”, it does not mandate processing of elements but will perform the action depending on what the terminal operation needs. This implies that you have to use it with great care if you need particular processing, e.g. if you want to apply an action to all elements. It works if the terminal operation is guaranteed to process all items, but even then, you must be sure that the next developer does not change the terminal operation (or that you do not forget that subtle aspect yourself).
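To make the dependency on the terminal operation concrete, here is a small sketch (the class and method names are invented for the example). The same peek() stage is placed in front of collect() and in front of findFirst(), and each method returns the elements that peek() actually saw.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PeekTerminalOpDemo {

    // Elements seen by peek() when the pipeline ends in collect(),
    // which has to process every element.
    static List<Integer> seenWithCollect() {
        List<Integer> seen = new ArrayList<>();
        Stream.of(1, 2, 3, 4, 5)
                .peek(seen::add)
                .filter(i -> i > 2)
                .collect(Collectors.toList());
        return seen;
    }

    // Same pipeline, but findFirst() may stop as soon as it has a match,
    // so peek() is not guaranteed to see the later elements.
    static List<Integer> seenWithFindFirst() {
        List<Integer> seen = new ArrayList<>();
        Stream.of(1, 2, 3, 4, 5)
                .peek(seen::add)
                .filter(i -> i > 2)
                .findFirst();
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("collect:   " + seenWithCollect());
        System.out.println("findFirst: " + seenWithFindFirst());
    }
}
```

On current JDKs the findFirst() variant typically stops after the first matching element, but exactly how many elements are consumed is an implementation detail and should not be relied upon.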
Further, while streams guarantee to maintain the encounter order for a certain combination of operations even for parallel streams, these guarantees do not apply to peek. When collecting into a list, the resulting list will have the right order for ordered parallel streams, but the peek action may get invoked in an arbitrary order and concurrently.
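The ordering point can be observed with a sketch like the following (all names are illustrative). The peek() action writes into a thread-safe queue because it may be called from several threads at once. The collected list keeps the encounter order, while the order recorded by peek() can differ from run to run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelPeekOrderDemo {

    // Collects 1..1000 from an ordered parallel stream, recording in
    // 'peekOrder' the order in which peek() was actually invoked.
    static List<Integer> collectInParallel(Queue<Integer> peekOrder) {
        return IntStream.rangeClosed(1, 1000)
                .boxed()
                .parallel()
                .peek(peekOrder::add)   // may run concurrently, in any order
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Queue<Integer> peekOrder = new ConcurrentLinkedQueue<>();
        List<Integer> result = collectInParallel(peekOrder);
        System.out.println("result in encounter order: "
                + result.equals(IntStream.rangeClosed(1, 1000).boxed().collect(Collectors.toList())));
        System.out.println("peek saw the same order:   "
                + new ArrayList<>(peekOrder).equals(result));
    }
}
```

Using a plain ArrayList inside peek() here would be a data race, which is one more reason to treat side effects in peek() with care.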
So the most useful thing you can do with peek is to find out whether a stream element has been processed, which is exactly what the API documentation says:
This method exists mainly to support debugging, where you want to see the elements as they flow past a certain point in a pipeline
Upvotes: 172
Reputation: 3753
The functional solution is to make the Account object immutable, so that account.login() returns a new Account object. The map operation can then be used for the login instead of peek.
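A rough sketch of what that could look like. The Account class below is hypothetical, not the asker's class, and the rule that login succeeds for non-empty usernames is only a stand-in for a real login.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ImmutableAccountDemo {

    // Hypothetical immutable Account; login() never mutates this instance.
    static final class Account {
        private final String username;
        private final boolean loggedIn;

        Account(String username, boolean loggedIn) {
            this.username = username;
            this.loggedIn = loggedIn;
        }

        // Returns a new Account carrying the login result.
        // Stand-in rule: login succeeds when the username is non-empty.
        Account login() {
            return new Account(username, !username.isEmpty());
        }

        boolean loggedIn() {
            return loggedIn;
        }

        String username() {
            return username;
        }
    }

    // map() replaces peek(): each element is transformed, not mutated.
    static List<Account> loginAll(List<Account> accounts) {
        return accounts.stream()
                .map(Account::login)
                .filter(Account::loggedIn)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Account> accounts = Arrays.asList(
                new Account("alice", false), new Account("", false));
        loginAll(accounts).forEach(a -> System.out.println(a.username()));
    }
}
```

Because the login is now part of the data flow, an element that the terminal operation never pulls is simply never logged in, and the original list is left untouched.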
Upvotes: 3
Reputation: 790
Although I agree with most answers above, I have one case in which using peek actually seems like the cleanest way to go.
Similar to your use case, suppose you want to filter only on active accounts and then perform a login on these accounts.
accounts.stream()
.filter(Account::isActive)
.peek(login)
.collect(Collectors.toList());
peek is helpful here because it avoids the redundant return in the equivalent map call below, without having to iterate the collection twice:
accounts.stream()
    .filter(Account::isActive)
    .map(account -> {
        account.login();
        return account;
    })
    .collect(Collectors.toList());
Upvotes: 7
Reputation: 355
Perhaps a rule of thumb should be that if you do use peek outside the "debug" scenario, you should only do so if you're sure of what the terminating and intermediate filtering conditions are. For example:
return list.stream().map(foo->foo.getBar())
.peek(bar->bar.publish("HELLO"))
.collect(Collectors.toList());
seems to be a valid case where you want, in one operation, to transform all Foos to Bars and tell them all hello.
Seems more efficient and elegant than something like:
List<Bar> bars = list.stream().map(foo->foo.getBar()).collect(Collectors.toList());
bars.forEach(bar->bar.publish("HELLO"));
return bars;
and you don't end up iterating a collection twice.
Upvotes: 33
Reputation: 106430
The key takeaway from this:
Don't use the API in an unintended way, even if it accomplishes your immediate goal. That approach may break in the future, and it is also unclear to future maintainers.
There is no harm in breaking this out to multiple operations, as they are distinct operations. There is harm in using the API in an unclear and unintended way, which may have ramifications if this particular behavior is modified in future versions of Java.
Using forEach on this operation would make it clear to the maintainer that there is an intended side effect on each element of accounts, and that you are performing some operation that can mutate it.
It's also more conventional in the sense that peek is an intermediate operation, which doesn't operate on the entire collection until the terminal operation runs, whereas forEach is indeed a terminal operation. This way, you can make strong arguments about the behavior and flow of your code, as opposed to asking whether peek would behave the same as forEach does in this context.
accounts.forEach(a -> a.login());
List<Account> loggedInAccounts = accounts.stream()
.filter(Account::loggedIn)
.collect(Collectors.toList());
Upvotes: 118