Hunsu

Reputation: 3381

Processes vs threads in Java

In the questions I have read, the advice is to prefer threads over processes because threads are faster. I decided to go with threads for my program, which edits the articles in a Wikipedia category. The program gets the list of articles to edit and then divides the articles between 10 threads. With this I get 6-7 edits per minute, which is the same speed as without threads. When I launch multiple instances of my program and give each instance a category to process, I see that each process can do 6-7 edits per minute (I tested that with 5 processes).

Why are processes so much faster in my case, and why haven't the threads changed anything?

The code (not complete, just to give an idea):

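 import java.util.*;
 import java.util.concurrent.*;

 public static Wiki wiki = new Wiki(); // one instance shared by all threads

 public void process() throws InterruptedException {
       String[] articles = wiki.getArticles(category);
       List<Callable<Object>> list = new ArrayList<>();

       for (int i = 0; i < 10; i++) {
             String[] part = getPart(articles, i, 10); // i-th of 10 slices
             MyThread t = new MyThread(part);
             list.add(Executors.callable(t)); // invokeAll() takes Callables, so wrap the Runnable
       }
       executor.invokeAll(list); // executor: an ExecutorService created elsewhere (not shown)
 }

 public class MyThread extends Thread {
      public String[] articles;

      public MyThread(String[] articles) {
          this.articles = articles;
      }

      @Override
      public void run() {
          //some logic
          wiki.edit(...)
      }
 }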

Upvotes: 5

Views: 6428

Answers (2)

Nitsan Wakart

Reputation: 2909

First of all, you are using threads incorrectly. Threads are indeed Runnables, so you can submit them to an executor, but they will not be run as threads. The executor will run their run() methods on its own threads. The amount of parallel execution in your code above depends on the Executor you are using.
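
A minimal sketch of that point, assuming a fixed-size pool from Executors.newFixedThreadPool (the question does not say which executor is used). The MyThread class here is a hypothetical stand-in that only records the name of the thread that actually executes run():

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ExecutorRunsRunnables {
    // Names of the threads that actually executed run().
    static final List<String> executedOn = new CopyOnWriteArrayList<>();

    // Mirrors the question's pattern: a Thread subclass handed to an executor.
    static class MyThread extends Thread {
        MyThread(String name) { super(name); }

        @Override
        public void run() {
            // currentThread() is the pool worker, not this MyThread object.
            executedOn.add(Thread.currentThread().getName());
        }
    }

    // Submits `tasks` MyThread objects to a pool of `poolSize` workers.
    static List<String> runOnPool(int poolSize, int tasks) {
        executedOn.clear();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < tasks; i++) {
            pool.execute(new MyThread("my-thread-" + i)); // start() is never called
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return executedOn;
    }

    public static void main(String[] args) {
        System.out.println(runOnPool(2, 4));
    }
}
```

With the default thread factory the recorded names follow the pool-N-thread-M pattern, so none of the my-thread-i objects ever runs as a thread of its own. With a pool of 2, at most 2 tasks execute at once, however many MyThread objects are created.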

Secondly, 6-7 edits per minute sounds suspect; I would like to believe more is possible. You might be bottlenecking on a shared resource, as @PeterLawrey suggests, or you might be using blocking IO (or the library you use does), in which case you could increase your throughput by increasing the number of threads. It's hard to say what kind of bottleneck you are facing without some profiling data.
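
To illustrate the blocking IO case, here is a rough sketch in which the blocking call is simulated with Thread.sleep. The fakeBlockingEdit method and its 50 ms latency are made up for the example and say nothing about the real Wiki library:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class BlockingIoThroughput {
    // Hypothetical stand-in for a blocking network call; the latency is invented.
    static void fakeBlockingEdit() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs `tasks` fake edits on a pool of `threads` workers; returns elapsed ms.
    static long elapsedMillis(int threads, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int i = 0; i < tasks; i++) {
            pool.execute(BlockingIoThroughput::fakeBlockingEdit);
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        System.out.println("1 thread:   " + elapsedMillis(1, 10) + " ms");
        System.out.println("10 threads: " + elapsedMillis(10, 10) + " ms");
    }
}
```

If the waits are independent, as they are here, the 10-thread run should finish in roughly the time of a single call, while the 1-thread run pays for all ten in sequence. If your real numbers do not improve like this, the calls are probably not independent.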

Upvotes: 1

Peter Lawrey

Reputation: 533492

Each process has a number of threads to do its work. Whether you have one process with N threads or N processes with 1 thread each makes little difference, except:

  • threads are more lightweight and have slightly less overhead. The difference this makes is in the milliseconds, so you are unlikely to gain here.
  • using more processes indirectly allows your program to use more memory (each process has a limited heap size, which you can change). If you are going to have N processes, a fair comparison is to limit the memory of each process to 1/Nth of what the single process gets.
  • what is more likely to be happening is that you are bottlenecking on a shared resource such as a lock. This means your additional threads add little or no value, as your program cannot use them efficiently. By using multiple processes, you break the connection between the threads.
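
The last point can be sketched as follows. This is only an illustration under an assumption: it supposes that every edit ends up synchronizing on one shared object (here a hypothetical SHARED field, standing in for something like the static wiki instance). Whether the real library locks like this is not known from the question. The sketch counts how many threads are ever inside the slow call at the same time:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SharedLockBottleneck {
    // Hypothetical shared object; all threads in one process see the same one.
    static final Object SHARED = new Object();

    static final AtomicInteger inside = new AtomicInteger();    // threads in slowCall now
    static final AtomicInteger maxInside = new AtomicInteger(); // highest value seen

    // Simulated slow operation that records how many threads overlap in it.
    static void slowCall() throws InterruptedException {
        int now = inside.incrementAndGet();
        maxInside.accumulateAndGet(now, Math::max);
        Thread.sleep(100);
        inside.decrementAndGet();
    }

    static void edit(boolean useSharedLock) throws InterruptedException {
        if (useSharedLock) {
            synchronized (SHARED) { // only one thread at a time gets past this
                slowCall();
            }
        } else {
            slowCall();
        }
    }

    // Runs one edit on each of `threads` workers; returns the peak overlap.
    static int maxConcurrency(boolean useSharedLock, int threads) {
        inside.set(0);
        maxInside.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    edit(useSharedLock);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return maxInside.get();
    }

    public static void main(String[] args) {
        System.out.println("with shared lock: " + maxConcurrency(true, 10));
        System.out.println("without lock:     " + maxConcurrency(false, 10));
    }
}
```

With the shared lock the peak overlap is always 1, so ten threads do no more work per second than one. Separate processes each get their own copy of the static field, and therefore their own lock, which would explain why they scale where the threads do not.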

I see that each process can do 6-7 edits per minute

Each edit taking about 10 seconds sounds pretty long. Perhaps it is worth running your code under a CPU profiler to find out where the time goes and improve your performance.

Upvotes: 7
