user697911
user697911

Reputation: 10531

My multi-threading code doesn't run faster

My computer has 8 cores and 64G memory. Is my way of using multi-threading right as below? It processes one line from each document (filePath) and store the result to a list of Document, and finally return.

The problem is that, I didn't see it's running faster when '4' threads are set. The time spent on my testing data is always the same as the time needed by a one-thread run. Any issue with my way of using Callable?

  List<Document> processDocs(String filePath)  {          
          ExecutorService pool = Executors.newFixedThreadPool(4);
          List<Document> processedDocs = new ArrayList<>();
            try {
                br = new BufferedReader(IOUtils.fileReader(filePath));

                String line = null;
                int docNo=0;
                while ((line = br.readLine()) != null) {
                    line = line.trim();

                    Callable<Document> callable = new NLPTextThread(line, ++docNo);
                    Future<Document> future = pool.submit(callable);
                    try {
                        processedDocs.add(future.get());
                    } catch (InterruptedException e) {
                        log.error("InterruptedException " + line);
                    } catch (ExecutionException e) {
                        log.error("ExecutionException: " + line);
                        e.printStackTrace();
                    }
                }
                pool.shutdown();

   return processedDocs;
}

Edited: One more question on the thread safety of the 'docNo' variable. I want to pass the doc serial number to the Callable. In this case, is it thread safe for the "docNo" variable?

Upvotes: 0

Views: 99

Answers (2)

Strelok
Strelok

Reputation: 51451

The fact that you're calling future.get() immediately after submitting your callable is making your code effectively single threaded as get blocks and you're not submitting any more tasks to the pool while future is resolved. Submit everything in one loop and store the futures. Then iterate the futures list to collect results.

This is better:

List<Document> processDocs(String filePath) {
    List<Callable<Document>> tasks = new ArrayList<>();
    try {
        BufferedReader br = new BufferedReader(IOUtils.fileReader(filePath));

        String line = null;
        while ((line = br.readLine()) != null) {
            tasks.add(new NLPTextThread(line.trim());

        }
        ExecutorService executor = Executors.newfixedThreadPool(4);

        return executor.invokeAll(tasks)
                .stream()
                .map(future -> {
                    try {
                        return future.get();
                    } catch (Exception e) {
                        throw new IllegalStateException(e);
                    }
                }).collect(Collectors.toList());
    }

PS. I thought I would also highlight the IO issue commeters raised on the original question.

If your NLPTextThread execution time for each line of file is negligible (compared to the time it takes to read this line from the file) I don't think you will see significant improvements to your run time using a thread pool as IO is effectively a bottleneck in this case as you're reading a single large file on the main thread (single threaded). You would likely see HIGHER performance gains if you split your input (if it is large) into multiple files and processing each file in parallel. Just some food for thought.

Upvotes: 7

David Ehrmann
David Ehrmann

Reputation: 7576

processedDocs.add(future.get());

That line waits for the worker to complete the job before reading another line from the file. You should save the futures to a list, then get them all when the reading is done.

More like

futures.add(future);
...
// while loop exits
for (Future<Document> future : futures) {
    try {
        processedDocs.add(future.get());
    } catch (InterruptedException e) {
    } catch (ExecutionException e) {
        e.printStackTrace();
    }
}

Upvotes: 2

Related Questions