Deslyxia

Reputation: 629

Parallel streams in Java

I'm trying to use parallel streams to call an API endpoint and get some data back. I have an ArrayList<String> and pass each String to a method that uses it to call my API and marshal the data that comes back. The problem is that when I watch this in htop, ALL the cores on the db server light up the second I hit this method ... then, as the first group finishes, only 1 or 2 cores light up. So I think I am getting the result I want for the first set of calls only; from monitoring, it looks like the rest of the calls are made one at a time.

I think it may have something to do with the recursion but I'm not 100% sure.

private void generateObjectMap(Integer count){
    ArrayList<String> myList = getMyList();
    myList.parallelStream().forEach(f -> performApiRequest(f,count));
}

private void performApiRequest(String myString,Integer count){
    if(count < 10) {
        TreeMap<Integer,TreeMap<Date,MyObj>> tempMap = new TreeMap<>();
        try {
            tempMap = myJson.getTempMap(myRestClient.executeGet(myString));
        } catch(SocketTimeoutException e) {
            // Retry the same request with an incremented attempt counter
            count += 1;
            performApiRequest(myString,count);
        }
        ...
    } else {
        System.exit(1);
    }
}

Upvotes: 0

Views: 155

Answers (1)

sprinter

Reputation: 27996

This seems an unusual use for parallel streams. In general, the idea is that you are informing the JVM that the operations on the stream are truly independent and can run in any order, in one thread or in several. The results are then reduced or collected as part of the stream. The important point to remember here is that the behaviour of side effects is not guaranteed (which is also why local variables used inside stream lambdas need to be final or effectively final), and you shouldn't rely on how the JVM organises execution of the operations.

I can imagine the following being a reasonable usage:

list.parallelStream().map(item -> getDataUsingApi(item))
    .collect(Collectors.toList());

Here the API returns data, which is then handed to downstream operations with no side effects.
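To make that shape concrete, here is a minimal runnable sketch of the map-then-collect pattern. The class name and the body of getDataUsingApi are placeholders I made up for illustration; a real version would make the HTTP call there and return the parsed result.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ParallelMapSketch {

    // Hypothetical stand-in for the real API call: it only derives a value
    // from its argument and touches no shared state.
    static String getDataUsingApi(String item) {
        return "data-for-" + item;
    }

    // Each element is mapped independently; collect() gathers the results,
    // so nothing is mutated from inside the stream.
    static List<String> fetchAll(List<String> items) {
        return items.parallelStream()
                .map(ParallelMapSketch::getDataUsingApi)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(List.of("a", "b", "c")));
    }
}
```

Because the source is a List, the collected results come back in the same order as the input, even though the individual calls may run in any order.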

So, in conclusion, if you want tight control over how the API calls are executed, I would recommend not using parallel streams for this. Traditional Thread instances, possibly managed by a ThreadPoolExecutor, will serve you much better.
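As a rough sketch of the executor approach, assuming a hypothetical ApiClient interface in place of myRestClient (the class name, interface, pool size and helper names below are all mine, not from the question): the pool size is chosen explicitly, the retry is a bounded loop rather than recursion, and a request that still fails after the last attempt surfaces as an exception instead of calling System.exit.

```java
import java.net.SocketTimeoutException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PooledApiCalls {

    // Mirrors the "count < 10" limit in the question.
    static final int MAX_ATTEMPTS = 10;

    // Hypothetical stand-in for the question's REST client.
    interface ApiClient {
        String executeGet(String request) throws SocketTimeoutException;
    }

    // Retries on timeout up to MAX_ATTEMPTS times, then rethrows the last timeout.
    static String callWithRetry(ApiClient client, String request)
            throws SocketTimeoutException {
        SocketTimeoutException last = null;
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            try {
                return client.executeGet(request);
            } catch (SocketTimeoutException e) {
                last = e;
            }
        }
        throw last;
    }

    // Runs one task per request on a fixed-size pool and returns the results
    // in the same order as the requests.
    static List<String> fetchAll(ApiClient client, List<String> requests, int poolSize)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String request : requests) {
                tasks.add(() -> callWithRetry(client, request));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task has completed.
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        ApiClient fake = request -> "response-for-" + request; // fake client for the demo
        System.out.println(fetchAll(fake, List.of("a", "b", "c"), 4));
    }
}
```

With this layout the number of concurrent calls is whatever you pass as poolSize, rather than something the stream framework decides for you.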

Upvotes: 1
