Reputation: 488
I have a piece of code that looks like this:
public List<Restaurant> getAllRestaurants() {
    List<Restaurant> restaurants = getRestaurants().subList(0, 7); // This takes 234 ms to execute on average.
    // There are 7 items in the restaurants list
    for (Restaurant restaurant : restaurants) {
        PlacesAPIResponse response = callGooglePlacesAPI(restaurant); // A call to the Google API should take 520ms for a given restaurant
        restaurant.setRating(response.getRating());
    }
    return restaurants;
}
If I run the statements above in a for-each loop as shown, I expect the total time of the method to be 234ms + (7*520)ms = 3874ms, since the statements run sequentially. This is far too slow, so I'd like to parallelize the statements in the for-each loop so that I call the Google Places API concurrently for each restaurant in the list. In theory, the response time should be 234ms + max(API call for Restaurant 1, ..., API call for Restaurant 7) = 234ms + 520ms = 754ms, since the calls to the Google API are made in parallel.
According to this link (Java 8: Parallel FOR loop), I should be able to use a parallelStream() to run the statements concurrently like this:
long startTime = System.currentTimeMillis();
restaurants.parallelStream().forEach(restaurant -> {
    PlacesAPIResponse response = callGooglePlacesAPI(restaurant);
    restaurant.setRating(response.getRating());
});
long endTime = System.currentTimeMillis();
System.out.println("Calling Google Places API took " + (endTime - startTime) + " milliseconds");
This seems to call the Google Places API for each restaurant in parallel, but now each call to the Google Places API seems to take an increasing amount of time. Here is the output of my timestamps:
getRestaurants() took 234 milliseconds
Took 335 milliseconds to call Google Places API for Restaurant 1
Took 337 milliseconds to call Google Places API for Restaurant 2
Took 671 milliseconds to call Google Places API for Restaurant 3
Took 742 milliseconds to call Google Places API for Restaurant 4
Took 1086 milliseconds to call Google Places API for Restaurant 5
Took 1116 milliseconds to call Google Places API for Restaurant 6
Took 1470 milliseconds to call Google Places API for Restaurant 7
Calling Google Places API took 1473 milliseconds
The total, 234ms + 1473ms = 1707ms, is much larger than the 754ms I expected. I have tried parallel streams as well as ExecutorService to call the Google Places API concurrently, but I can't seem to get the desired response time. Can anyone point me in the right direction? Thanks.
EDIT: Here is what I tried with ExecutorService, in accordance with this post (Is there a easy way to parallelize a foreach loop in java?):
startTime = System.currentTimeMillis();
ExecutorService exe = Executors.newFixedThreadPool(2); // 2 can be changed of course
for (Restaurant restaurant : restaurants) {
    exe.submit(() -> {
        PlacesAPIResponse response = callGooglePlacesAPI(restaurant); // A call to the Google API should take 520ms for a given restaurant
        restaurant.setRating(response.getRating());
    });
}
exe.shutdown();
try {
    exe.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
} catch (InterruptedException e) {
    e.printStackTrace();
}
endTime = System.currentTimeMillis();
System.out.println("Calling Google Places API took " + (endTime - startTime) + " milliseconds");
return restaurants;
Here is the output of my timestamps:
getRestaurants() took 234 milliseconds
Took 464 milliseconds to call Google Places API for Restaurant 1
Took 575 milliseconds to call Google Places API for Restaurant 2
Took 452 milliseconds to call Google Places API for Restaurant 3
Took 420 milliseconds to call Google Places API for Restaurant 4
Took 414 milliseconds to call Google Places API for Restaurant 5
Took 444 milliseconds to call Google Places API for Restaurant 6
Took 422 milliseconds to call Google Places API for Restaurant 7
Calling Google Places API took 1757 milliseconds
The response time of this method is still 234ms + 1757ms instead of 234ms + 575ms, and I don't understand why.
Upvotes: 0
Views: 1102
Reputation: 255
This question is from quite a while back, but I guess the reason lies in your choice of thread pool size. A thread pool size of two means only two jobs can execute in parallel. The remaining jobs are queued until a thread is freed. So the time spent calling the Google Places API will be something like max(464+452+414+422, 575+420+444) = max(1752, 1439) = 1752ms, which is close to the measured value of 1757ms. This is explained well here.
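The calculation above can be reproduced with a small simulation. This is only a sketch: it assumes tasks are started in submission order, that each one goes to whichever worker becomes free first, and that scheduling overhead is zero. The class name PoolMakespan and the method makespan are made up for illustration; the durations are the ones reported in the question.

```java
import java.util.PriorityQueue;

class PoolMakespan {
    // Simulates a fixed-size pool with a FIFO queue: each task, in submission
    // order, is handed to the worker that becomes free earliest.
    // Assumption: no scheduling overhead, durations are known up front.
    static long makespan(long[] durationsMs, int poolSize) {
        PriorityQueue<Long> freeAt = new PriorityQueue<>();
        for (int i = 0; i < poolSize; i++) {
            freeAt.add(0L); // every worker is idle at time 0
        }
        long finish = 0;
        for (long duration : durationsMs) {
            long start = freeAt.poll();   // earliest time any worker is free
            long end = start + duration;
            freeAt.add(end);              // that worker is busy until 'end'
            finish = Math.max(finish, end);
        }
        return finish;
    }

    public static void main(String[] args) {
        // Per-call timings from the question's ExecutorService run.
        long[] measured = {464, 575, 452, 420, 414, 444, 422};
        System.out.println(makespan(measured, 2)); // prints 1752
        System.out.println(makespan(measured, 7)); // prints 575
    }
}
```

With seven workers the simulated total drops to the longest single call, which is the 575ms the question was hoping for.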
Upvotes: 1
Reputation: 154
I guess your bottleneck is the connection to the internet or the Google Places server, not your loop. The server recognizes the same IP address and may therefore queue your requests to protect itself against denial-of-service attacks. That would mean your loop runs in parallel but the requests are stacked up at the server, which is why each request takes increasingly longer to be answered and returned. To circumvent this you would need something like a botnet (sending each request from a different computer), or maybe Google Places will sell you a special connection for parallel requests.
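One way to test this guess is to record when each task actually starts on the client. If the later tasks start late, the waiting happens in the local thread pool rather than at the server. The following is a minimal sketch under stated assumptions: Thread.sleep stands in for the real API call, and the names StartOffsets and startOffsetsMs are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class StartOffsets {
    // Returns, for each task in submission order, how many milliseconds
    // after submission the task actually began running.
    static List<Long> startOffsetsMs(int tasks, int poolSize, long callMs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            final long t0 = System.nanoTime();
            List<Callable<Long>> jobs = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                jobs.add(() -> {
                    long offset = (System.nanoTime() - t0) / 1_000_000;
                    Thread.sleep(callMs); // assumption: stand-in for the remote call
                    return offset;
                });
            }
            List<Long> offsets = new ArrayList<>();
            // invokeAll blocks until every job is done and keeps submission order.
            for (Future<Long> f : pool.invokeAll(jobs)) {
                offsets.add(f.get());
            }
            return offsets;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("pool of 2: " + startOffsetsMs(7, 2, 300));
        System.out.println("pool of 7: " + startOffsetsMs(7, 7, 300));
    }
}
```

If the offsets grow in steps with a small pool but stay near zero with a pool as large as the task count, the queueing is on the client side.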
Upvotes: 0
Reputation: 870
The best approach here is to use an ExecutorService and submit each call to it as a separate Runnable.
Alternatively, you can submit the calls as Callable tasks and collect the results through Future objects.
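A minimal sketch of the Future variant, under stated assumptions: the nested Restaurant class and the fetchRating method are hypothetical stand-ins for the question's Restaurant and callGooglePlacesAPI, with Thread.sleep mimicking network latency. The pool is sized to the number of restaurants so that no call waits in the queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelRatings {
    // Hypothetical stand-in for the question's Restaurant class.
    static class Restaurant {
        final String name;
        double rating;
        Restaurant(String name) { this.name = name; }
    }

    // Hypothetical stand-in for callGooglePlacesAPI(...).getRating():
    // sleeps to mimic latency and returns a fixed rating.
    static double fetchRating(Restaurant r) throws InterruptedException {
        Thread.sleep(200);
        return 4.0;
    }

    static void rateAll(List<Restaurant> restaurants) throws Exception {
        if (restaurants.isEmpty()) {
            return; // newFixedThreadPool(0) would throw IllegalArgumentException
        }
        // One thread per restaurant, so every call starts immediately.
        ExecutorService pool = Executors.newFixedThreadPool(restaurants.size());
        try {
            List<Future<Double>> pending = new ArrayList<>();
            for (Restaurant r : restaurants) {
                Future<Double> f = pool.submit(() -> fetchRating(r));
                pending.add(f);
            }
            // get() blocks until the matching call has finished; ratings are
            // written on the calling thread, so no extra synchronization is needed.
            for (int i = 0; i < pending.size(); i++) {
                restaurants.get(i).rating = pending.get(i).get();
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Restaurant> restaurants = new ArrayList<>();
        for (int i = 1; i <= 7; i++) {
            restaurants.add(new Restaurant("Restaurant " + i));
        }
        long start = System.currentTimeMillis();
        rateAll(restaurants);
        System.out.println("Took " + (System.currentTimeMillis() - start) + " milliseconds");
    }
}
```

For a large list, one thread per item is wasteful, and a real API may enforce rate limits, so a bounded pool size is usually the safer choice.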
Upvotes: 1