Prasanth Ravi

Reputation: 155

Runtime discrepancy when using CompletableFuture on an IO bound task

My understanding of the JVM multi-threading model is that when a thread executes an IO call, the thread is BLOCKED and put into a waiting queue by the JVM/OS until data is available.

I am trying to emulate this behavior in my code and running a benchmark with various thread sizes, using JMH and CompletableFuture.

However, the results are not what I expected. I was expecting a roughly constant execution time (plus some thread/context-switching overhead) irrespective of the number of threads (within memory limits), since the tasks are IO bound rather than CPU bound.

Async IO results

My CPU is a 4-core/8-thread laptop processor, and even with 1 or 2 threads the results deviate from the expected behavior.

I'm trying to read a 5MB file (separate file for each thread) in the async task. At the start of each iteration, I create a FixedThreadPool with the required number of threads.

@Benchmark
public void readAsyncIO(Blackhole blackhole) throws ExecutionException, InterruptedException {
    List<CompletableFuture<Void>> readers = new ArrayList<>();

    for (int i = 0; i < threadSize; i++) {
        int finalI = i;
        readers.add(CompletableFuture.runAsync(() -> readFile(finalI), threadPool));
    }

    Object result = CompletableFuture
                    .allOf(readers.toArray(new CompletableFuture[0]))
                    .get();
    blackhole.consume(result);
}
@Setup(Level.Iteration)
public void setup() throws IOException {
    threadPool = Executors.newFixedThreadPool(threadSize);
}
@TearDown(Level.Iteration)
public void tearDown() {
    threadPool.shutdownNow();
}
public byte[] readFile(int i) {
    try {
        File file = new File(filePath + "/" + fileName + i);
        // try-with-resources closes the stream; readAllBytes loops until EOF,
        // since a single read() call may return fewer bytes than the buffer holds
        try (InputStream inputStream = new FileInputStream(file)) {
            return inputStream.readAllBytes();
        }
    } catch (Exception e) {
        throw new CompletionException(e);
    }
}
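Note that `runAsync` takes a `Runnable`, so the `byte[]` returned by `readFile` is silently discarded and the file contents never reach the `Blackhole`. A minimal standalone sketch of the `supplyAsync` variant, which keeps each task's result (using small temp files as stand-ins for the 5MB files, with a hypothetical `threadSize` of 4):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SupplyAsyncSketch {
    public static void main(String[] args) throws Exception {
        int threadSize = 4; // hypothetical thread count
        ExecutorService threadPool = Executors.newFixedThreadPool(threadSize);

        // One small temp file per task, standing in for the 5MB benchmark files
        List<Path> files = new ArrayList<>();
        for (int i = 0; i < threadSize; i++) {
            Path p = Files.createTempFile("bench", ".dat");
            Files.write(p, new byte[1024]);
            files.add(p);
        }

        // supplyAsync returns CompletableFuture<byte[]>, so the bytes survive
        // to be consumed (e.g. by blackhole.consume in the real benchmark)
        List<CompletableFuture<byte[]>> readers = new ArrayList<>();
        for (Path p : files) {
            readers.add(CompletableFuture.supplyAsync(() -> {
                try {
                    return Files.readAllBytes(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }, threadPool));
        }

        CompletableFuture.allOf(readers.toArray(new CompletableFuture[0])).join();

        long total = readers.stream().mapToLong(f -> f.join().length).sum();
        System.out.println("total bytes read = " + total);

        threadPool.shutdownNow();
        for (Path p : files) Files.deleteIfExists(p);
    }
}
```

This doesn't change the timing question much, but it keeps the JIT from potentially treating the reads as dead code.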

And the JMH config,

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
@Warmup(iterations = 3)
@Fork(value=1)
@Measurement(iterations = 3)
public class SimpleTest {

    @Param({ "1", "2", "4", "8", "16", "32", "50", "100" })
    public int threadSize;
    .....

}

Any idea what I'm doing wrong? Or are my assumptions incorrect?

Upvotes: 0

Views: 345

Answers (1)

DuncG

Reputation: 15136

It seems reasonable. With a single thread you can see that one file takes ~2 ms to deal with. Adding more threads leads to a longer average per thread: each read(bytesRead) on a very large buffer is likely to perform multiple disk reads, so there is opportunity for IO blocking and thread context switching, plus, depending on the disks, additional seek times.
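On that last point: a single `InputStream.read(byte[])` call is not guaranteed to fill the buffer. A minimal sketch illustrating this, using an artificial stream that caps each read at 512 bytes (real disk behavior varies, so the cap here is just a stand-in for device-level chunking):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class PartialReadDemo {
    // Simulates a device that hands back at most 'chunk' bytes per read call
    static class ChunkedStream extends InputStream {
        private final InputStream delegate;
        private final int chunk;
        ChunkedStream(InputStream delegate, int chunk) {
            this.delegate = delegate;
            this.chunk = chunk;
        }
        @Override public int read() throws IOException {
            return delegate.read();
        }
        @Override public int read(byte[] b, int off, int len) throws IOException {
            return delegate.read(b, off, Math.min(len, chunk));
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[5 * 1024]; // stand-in for the 5MB file
        try (InputStream in = new ChunkedStream(new ByteArrayInputStream(data), 512)) {
            byte[] buf = new byte[data.length];
            int first = in.read(buf); // one read() call, partially fills the buffer
            System.out.println("first read returned " + first + " of " + buf.length);

            // Loop until EOF, counting how many read() calls the full file needs
            int total = first, n, calls = 1;
            while (total < buf.length && (n = in.read(buf, total, buf.length - total)) != -1) {
                total += n;
                calls++;
            }
            System.out.println("needed " + calls + " read() calls for " + total + " bytes");
        }
    }
}
```

Each of those extra calls is a chance for the thread to block and be context-switched out, which is where the per-thread average grows.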

Upvotes: 1
