Reputation: 95
In my Java 8 Spring Boot application, I have a list of 40,000 records. For each record, I have to call an external API and save the result to the DB. How can I do this as fast as possible? Each API call takes about 20 seconds to complete. I used a parallel stream to reduce the time, but it made no considerable difference.
    if (!mainList.isEmpty()) {
        AtomicInteger counter = new AtomicInteger();
        List<List<PolicyAddressDto>> secondList =
                new ArrayList<List<PolicyAddressDto>>(
                        mainList.stream()
                                .collect(Collectors.groupingBy(it -> counter.getAndIncrement() / subArraySize))
                                .values());
        for (List<PolicyAddressDto> listOfList : secondList) {
            listOfList.parallelStream()
                    .forEach(t -> {
                        // listDomain1 and listDomain2 are declared globally
                        callAtheniumData(t, listDomain1, listDomain2);
                    });
            if (!listDomain1.isEmpty()) {
                listDomain1Repository.saveAll(listDomain1);
            }
            if (!listDomain2.isEmpty()) {
                listDomain2Repository.saveAll(listDomain2);
            }
        }
    }
Upvotes: 0
Views: 1036
Reputation: 37506
"Each of the API calls will take about 20 secs to complete."
Your external API is the bottleneck. There's really nothing your code can do to speed it up on the client side except parallelize the process. You've already done that, so if the external API is within your organization, you need to look into performance improvements there. If not, you can offload the processing, for example via Kafka to Apache NiFi or StreamSets, so that your Spring Boot API doesn't have to wait for hours to process the data.
Upvotes: 0
Reputation: 5198
That's because a parallel stream splits the work across the common fork-join pool, which by default has one worker thread per core minus one (the calling thread also takes part). If every call to the external API takes 20 seconds and you have 4 cores, that means only 3 or 4 concurrent requests, each waiting 20 seconds.
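To put rough numbers on that, here is a small sketch of the arithmetic. It assumes every call takes exactly 20 seconds and that a fixed number of calls is always in flight; the class and method names are made up for illustration.

```java
/** Back-of-the-envelope estimate for work that is dominated by call latency. */
class ThroughputEstimate {

    /** Total wall-clock seconds if `concurrency` calls are always in flight (assumption). */
    static long estimateSeconds(long records, long secondsPerCall, int concurrency) {
        long rounds = (records + concurrency - 1) / concurrency; // ceiling division
        return rounds * secondsPerCall;
    }

    public static void main(String[] args) {
        // Number of cores the JVM sees; the common pool size is derived from this by default.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores on this machine: " + cores);

        for (int concurrency : new int[] {1, 4, 100, 1000}) {
            long seconds = estimateSeconds(40_000, 20, concurrency);
            System.out.printf("%4d concurrent calls -> about %.1f hours%n",
                    concurrency, seconds / 3600.0);
        }
    }
}
```

With 4 calls in flight, 40,000 records come to roughly 55 hours; with 100 in flight it is a bit over 2 hours, provided the remote service can actually keep up with that load.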
You can increase the concurrency of your calls as described here: https://stackoverflow.com/a/21172732/574147, but I think you would just be moving the problem.
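A minimal sketch of one way to raise the concurrency of a parallel stream: run it from inside your own `ForkJoinPool`. This relies on observed behavior rather than a documented guarantee, and on some JDK versions the stream may still split the work based on the common pool's size, so the effective concurrency can be lower than the pool size. `CustomPoolStream`, `mapInPool` and `fakeApiCall` are hypothetical names; the sleep stands in for the real API call.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CustomPoolStream {

    /** Applies `call` to every item, running the parallel stream inside a dedicated pool. */
    static <T, R> List<R> mapInPool(List<T> items, Function<T, R> call, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            // A parallel stream started from a task in this pool runs on this pool's
            // threads instead of the common pool (observed behavior, not a spec guarantee).
            Callable<List<R>> work =
                    () -> items.parallelStream().map(call).collect(Collectors.toList());
            return pool.submit(work).join(); // join() waits for the whole stream to finish
        } finally {
            pool.shutdown();
        }
    }

    /** Stand-in for the slow external API (assumption: blocks, then returns a result). */
    static String fakeApiCall(int id) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result-" + id;
    }

    public static void main(String[] args) {
        List<Integer> ids = IntStream.range(0, 50).boxed().collect(Collectors.toList());
        long start = System.nanoTime();
        List<String> results = mapInPool(ids, CustomPoolStream::fakeApiCall, 25);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(results.size() + " results in " + millis + " ms");
    }
}
```

Because the results are collected by the stream itself, there is no shared list being mutated from several threads.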
20 seconds is a really slow "typical" response time for an API. If the processing is genuinely complex and CPU-bound, how could that service respond to 10 concurrent requests while keeping the same performance? It probably couldn't.
Otherwise, if the processing is I/O-bound and takes 20 seconds, you probably need a service that can accept (and actually work on!) a list of elements in a single request.
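If the service did offer such a batch operation, the client side could look roughly like this sketch. The batch call is entirely hypothetical here and is passed in as a plain function; `BatchCalls`, `partition` and `processInBatches` are illustrative names, not part of any real API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class BatchCalls {

    /** Splits `items` into consecutive chunks of at most `batchSize` elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(i, end))); // copy, not a view
        }
        return batches;
    }

    /** Sends each chunk to a (hypothetical) batch endpoint and gathers the results in order. */
    static <T, R> List<R> processInBatches(List<T> items, int batchSize,
                                           Function<List<T>, List<R>> batchCall) {
        List<R> results = new ArrayList<>();
        for (List<T> batch : partition(items, batchSize)) {
            results.addAll(batchCall.apply(batch)); // one round trip per batch
        }
        return results;
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3, 4, 5);
        // Stand-in for the batch endpoint: turns every id in the batch into a string.
        List<String> results = processInBatches(ids, 2, batch -> {
            List<String> out = new ArrayList<>();
            for (Integer id : batch) {
                out.add("result-" + id);
            }
            return out;
        });
        System.out.println(results);
    }
}
```

The `partition` helper also avoids the `AtomicInteger` plus `groupingBy` trick from the question, which depends on the order in which the stream visits elements.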
Upvotes: 0
Reputation: 371
Solving a problem in parallel always involves performing more actual work than doing it sequentially. Overhead is involved in splitting the work among several threads and joining or merging the results. Problems like converting short strings to lower-case are small enough that they are in danger of being swamped by the parallel splitting overhead.
As far as I can see, the API call's response is not being returned and saved directly. Also, the API calls are all independent of one another.
You could try creating a new thread for each API call:
    for (List<PolicyAddressDto> listOfList : secondList) {
        listOfList.parallelStream()
                .forEach(t -> {
                    // start() returns immediately; the API call itself runs on the new thread
                    new Thread(() -> callAtheniumData(t, listDomain1, listDomain2)).start();
                });
    }
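One thread per call means up to 40,000 threads, and the loop above never waits for them to finish before anything is saved. A bounded thread pool is an alternative worth trying; the following is only a sketch, with `BoundedCalls`, `callAll` and `slowCall` as made-up names and a sleep standing in for the external API. It returns the results instead of appending to shared lists, so the caller can pass them to `saveAll` once every call has completed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

class BoundedCalls {

    /** Runs `apiCall` for every item with at most `maxConcurrent` calls in flight. */
    static <T, R> List<R> callAll(List<T> items, Function<T, R> apiCall, int maxConcurrent) {
        ExecutorService executor = Executors.newFixedThreadPool(maxConcurrent);
        try {
            // Submit everything first; the pool queues whatever it cannot run yet.
            List<CompletableFuture<R>> futures = items.stream()
                    .map(item -> CompletableFuture.supplyAsync(() -> apiCall.apply(item), executor))
                    .collect(Collectors.toList());
            // join() blocks until each call is done, so the returned list is complete
            // and in the same order as the input.
            return futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
        } finally {
            executor.shutdown();
        }
    }

    /** Stand-in for the slow external API (assumption: blocks, then returns a result). */
    static String slowCall(int id) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result-" + id;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 200; i++) {
            ids.add(i);
        }
        long start = System.nanoTime();
        List<String> results = callAll(ids, id -> slowCall(id), 100);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(results.size() + " results in " + millis + " ms");
    }
}
```

The pool size is the knob to tune: too small and you wait for hours, too large and you may overload the remote service or hit its rate limits.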
Upvotes: 1