Reputation: 821
Is it OK to process records and count the processed ones this way?
long count = userDao.findApprovedWithoutData().parallelStream().filter(u -> {
    Data d = dataDao.findInfoByEmail(u.getEmail());
    boolean ret = false;
    if (d != null) {
        String result = "";
        result += getFieldValue(d::getName, ". \n");
        result += getFieldValue(d::getOrganization, ". \n");
        result += getFieldValue(d::getAddress, ". \n");
        if (!result.isEmpty()) {
            u.setData(d.getInfo());
            userDao.update(u);
            ret = true;
        }
    }
    return ret;
}).count();
So, in short: iterate over the incomplete records, update each one if data is present, and count the records that were updated?
Upvotes: 3
Views: 1225
Reputation: 7649
It depends on your definition of "process". I cannot give you a clear yes or no, because I think it is hard to conclude without understanding your code and how it is implemented.
You are using a parallel stream. What happens there is that the Java runtime splits the stream into sub-streams, based on the number of threads available in ForkJoinPool's common pool.
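To see which threads a parallel stream actually uses, here is a minimal sketch. The class and method names are made up for illustration, and the thread names you get depend on your JVM and machine.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the question's code.
class CommonPoolSketch {

    // Runs a parallel pipeline over n integers and records the name of
    // every thread that processed at least one element.
    static Set<String> workerThreadNames(int n) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        IntStream.range(0, n)
                 .parallel()
                 .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Common pool parallelism: "
                + ForkJoinPool.getCommonPoolParallelism());
        System.out.println("Threads used: " + workerThreadNames(10_000));
    }
}
```

Typically the output lists the calling thread alongside some common-pool workers, so any DAO call inside the lambda may run on several threads at once.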
When using parallelism you need to be careful about possible side effects:
Lambda expressions in stream operations should not interfere. Interference occurs when the source of a stream is modified while a pipeline processes the stream.
Avoid using stateful lambda expressions as parameters in stream operations. A stateful lambda expression is one whose result depends on any state that might change during the execution of a pipeline.
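The second point can be sketched as follows. This is only an illustration with made-up names: one mapping function is stateless, the other depends on a shared counter that changes while the pipeline runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the question's code.
class StatefulLambdaSketch {

    // Stateless: the result for each element depends only on that element.
    static List<Integer> lengths(List<String> words) {
        return words.parallelStream()
                    .map(String::length)
                    .collect(Collectors.toList());
    }

    // Stateful: the result depends on a shared counter that other threads
    // advance concurrently, so which element gets which number can vary
    // from run to run.
    static List<Integer> numbered(List<String> words) {
        AtomicInteger counter = new AtomicInteger();
        return words.parallelStream()
                    .map(w -> counter.getAndIncrement())
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            words.add("w" + i);
        }
        // The stateless pipeline gives the same list every time.
        System.out.println(lengths(words).equals(lengths(words)));
        // The stateful pipeline may give differently ordered numbers per run.
        System.out.println(numbered(words).equals(numbered(words)));
    }
}
```

The counter is thread-safe, so no number is lost or duplicated; the problem is only that the mapping from element to number is no longer predictable.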
Applying the above points to your question:
Non-interference > The documentation strongly states that lambda expressions should not interfere with the source of the stream (unless the stream source is concurrent) during the pipeline operation, because it can cause exceptions, incorrect answers, or nonconformant behavior. The exception is well-behaved streams where the modification takes place during an intermediate operation (e.g. filter); the Java documentation on non-interference has more detail.
Your lambda expression does interfere with the source of the stream, which is not advised, but the interference happens within an intermediate operation, so everything comes down to whether the stream is well-behaved or not. You might therefore consider rethinking your lambda expression when it comes to interference. It might also come down to how you update the source of the stream via userDao.update, which is not clear from your question.
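For reference, here is a small sketch of what interference looks like with a plain ArrayList as the source. It is not your DAO code; the names are invented, and it only shows the failure mode for a non-concurrent collection that is structurally modified while the pipeline runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the question's code.
class InterferenceSketch {

    // Returns true if adding to the source list inside filter() makes the
    // pipeline fail with ConcurrentModificationException.
    static boolean addingToSourceFails() {
        List<String> source = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            source.stream()
                  .filter(s -> {
                      source.add(s + "!"); // structural change to the source
                      return true;
                  })
                  .collect(Collectors.toList());
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("Pipeline failed: " + addingToSourceFails());
    }
}
```

Whether userDao.update touches the list returned by findApprovedWithoutData in a comparable way depends on the DAO implementation, which the question does not show.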
Stateful lambda expression > Your lambda expression does not seem to be stateful, because its result depends on values that do not change during the execution of the pipeline. So this does not apply to your case.
I advise you to go through the Java 8 Stream documentation, as well as this blog, which explains Java 8 streams really well with examples.
Upvotes: 1
Reputation: 425003
IMHO this is bad code, because:
Predicates should not have side effects (just like getters shouldn't). It's unexpected, and that makes it bad.
Each execution of the predicate causes a large chain of queries to fire, which makes this code not scalable.
Good code makes it obvious what is going on (unlike this code).
You should change the code to use a (fairly simple) single update query (that employs a join) and get the count from the "number of rows updated" info in the result from the persistence API.
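A rough sketch of that idea with plain JDBC follows. Everything about the schema is an assumption, since the question does not show it: the table names (users, user_data), the column names, and the conditions standing in for findApprovedWithoutData and the getFieldValue checks are all hypothetical. The UPDATE ... JOIN syntax is MySQL-style; other databases spell an update with a join differently.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical demo class; schema and conditions are assumptions.
class BulkUpdateSketch {

    // One statement replaces the per-user lookup and update.
    // Table and column names are invented for illustration.
    static final String UPDATE_SQL =
            "UPDATE users u "
          + "JOIN user_data d ON d.email = u.email "
          + "SET u.data = d.info "
          + "WHERE u.approved = TRUE "
          + "AND u.data IS NULL "
          + "AND (d.name <> '' OR d.organization <> '' OR d.address <> '')";

    // Runs the update and returns the row count reported by the driver,
    // which takes the place of the stream's count().
    static int updateAndCount(Connection connection) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(UPDATE_SQL)) {
            return statement.executeUpdate();
        }
    }
}
```

If you use JPA or another persistence API instead of raw JDBC, its bulk update call should likewise return the number of affected rows.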
Upvotes: 5