which approach will be better - multi-threading example

I would like to know if there is any difference between these two approaches of performing a task using multi-threading in Java 7:

Brief requirement - I have to process 200 records, simultaneously, and each record should be processed once only.

Approach 1:

public class Counter {

    private static int count = -1;
    private static final Object lock = new Object();

    public static int getCount() {
        synchronized (lock) {
            return ++count;
        }
    }
}



public class Task implements Runnable {

    @Override
    public void run() {
        int count = 0;
        while((count = Counter.getCount()) < 200) {
            System.out.println(Thread.currentThread().getName() + " " + count);
            // process record # count
        }
    }
}

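import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class Tester {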

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        ExecutorService service = Executors.newFixedThreadPool(10);
        for (int i = 0; i < 10; i++) {
            service.execute(new Task());
        }
        service.shutdown();
        service.awaitTermination(5, TimeUnit.MINUTES);
        long end = System.currentTimeMillis();
        System.out.println(end - start);
    }
}

Approach 2:

public class Task2 implements Runnable {

    private int i = 0;

    public Task2(int i) {
        super();
        this.i = i;
    }

    @Override
    public void run() {
        System.out.println(Thread.currentThread().getName() + " " + i);
        // process record # i
    }
}

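import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class Tester2 {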

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        ExecutorService service = Executors.newFixedThreadPool(10); 
        for (int i = 0; i < 200; i++) {
            service.execute(new Task2(i));
        }
        service.shutdown();
        service.awaitTermination(5, TimeUnit.MINUTES);
        long end = System.currentTimeMillis();
        System.out.println(end - start);
    }
}

Both run fine, produce the same output, and take almost the same time. However, the 200 used in the example code will be in the billions in the actual scenario. Please suggest which one will be better in terms of memory, CPU, etc.

Thanks.

Edit - adding approach 3 based on Jimmy T.'s suggestion.

Approach 3:

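import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class Tester3 {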

    public static void main(String[] args) throws InterruptedException {

        int processedRecords = 0;
        int totalRecords = 200;
        int recordsPerThread = 10;
        boolean continueProcess = true;

        ExecutorService service = Executors.newFixedThreadPool(10);

        while(continueProcess) {
            int startIndex = processedRecords;
            int endIndex = startIndex + recordsPerThread - 1;
            if (endIndex >= totalRecords - 1) {
                endIndex = totalRecords - 1;
                continueProcess = false;
            }
            processedRecords = processedRecords + recordsPerThread;
            service.submit(new Task3(startIndex, endIndex));
        }
        service.shutdown();
        service.awaitTermination(5, TimeUnit.MINUTES);
    }
}


public class Task3 implements Runnable {

    private int startIndex = 0;
    private int endIndex = 0;

    public Task3(int startIndex, int endIndex) {
        super();
        this.startIndex = startIndex;
        this.endIndex = endIndex;
    }

    @Override
    public void run() {
        System.out.println("processing records from " + startIndex + " to " + endIndex);
    }
}

Upvotes: 2

Answers (3)

maaartinus

Reputation: 46372

Brief requirement - I have to process 200 records, simultaneously, and each record should be processed once only.

However, the 200 used in the example code will be in the billions in the actual scenario.

As always, the answer is "it depends". Unless your tasks are extremely fast, the overhead of either synchronization or object creation shouldn't matter. Note that an ExecutorService has to do some synchronization internally, too.

You could avoid the synchronization and simplify your code by using an AtomicInteger counter. But I'd most probably go for a Queue and a couple of Runnables polling it, as it's simple and gives you a lot of flexibility:

you could use a PriorityQueue (or rather its thread-safe variant, PriorityBlockingQueue, since several threads poll it) if you wanted to prioritize some records
you could create your own queue if storing all records in memory would be too expensive (your queue could fetch them from files or DB or whatever)

You could pre-fill your queue or use a BlockingQueue filled by a producer. The content of the queue could be the record indexes or (probably better) the records themselves.
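For the producer-filled variant, an empty queue does not necessarily mean the work is finished, because the producer may simply be slower than the consumers. A minimal sketch of one way to handle this, assuming a bounded ArrayBlockingQueue and a sentinel value to signal the end of input; the class name ProducerConsumerDemo, the POISON marker and the queue capacity of 100 are illustrative choices, not part of the original code:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ProducerConsumerDemo {

    // Hypothetical end-of-input marker; real record indexes are never negative.
    private static final int POISON = -1;

    // Feeds totalRecords indexes to `workers` consumers; returns the number processed.
    static int processAll(int totalRecords, int workers) throws InterruptedException {
        // Bounded queue: put() blocks when it is full, so memory use stays limited.
        final BlockingQueue<Integer> queue = new ArrayBlockingQueue<Integer>(100);
        final AtomicInteger processed = new AtomicInteger(0); // tally, only for checking

        ExecutorService service = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            service.execute(new Runnable() {
                @Override
                public void run() {
                    try {
                        while (true) {
                            int i = queue.take(); // blocks until an element is available
                            if (i == POISON) {
                                break;            // no more input for this worker
                            }
                            // process record # i
                            processed.incrementAndGet();
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt(); // preserve interrupt status
                    }
                }
            });
        }

        // The calling thread acts as the producer.
        for (int i = 0; i < totalRecords; i++) {
            queue.put(i);
        }
        // One sentinel per worker so that every worker terminates.
        for (int w = 0; w < workers; w++) {
            queue.put(POISON);
        }

        service.shutdown();
        service.awaitTermination(5, TimeUnit.MINUTES);
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(processAll(200, 10));
    }
}
```

Because the queue is bounded, this shape may suit the "billions of records" case better than pre-filling, since only a small window of records is held in memory at any time.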

Every task polls the queue and terminates when it's empty.

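public class Task3 implements Runnable {
    // "queue" is assumed to be a shared, thread-safe Queue<Integer>
    // (e.g. a ConcurrentLinkedQueue) that is visible to all tasks.
    @Override
    public void run() {
        while (true) {
            final Integer i = queue.poll();   // returns null when the queue is empty
            if (i == null) break;
            System.out.println(Thread.currentThread().getName() + " " + i);
            // process record # i
        }
    }
}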

You create just a limited number of them, enough to keep all your cores busy.

ExecutorService service = Executors.newFixedThreadPool(10); 
for (int i = 0; i < 10; i++) service.execute(new Task3());
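Putting the pieces above together, a self-contained version of the pre-filled variant might look like the sketch below. It assumes a ConcurrentLinkedQueue as the shared queue and passes it to each task through the constructor; the names QueuePollingDemo and PollingTask and the processed tally are illustrative additions, not part of the snippets above.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class QueuePollingDemo {

    // Worker that drains the shared queue and stops once it is empty.
    static class PollingTask implements Runnable {
        private final Queue<Integer> queue;
        private final AtomicInteger processed; // tally, only used to check the result

        PollingTask(Queue<Integer> queue, AtomicInteger processed) {
            this.queue = queue;
            this.processed = processed;
        }

        @Override
        public void run() {
            Integer i;
            // poll() is thread-safe on ConcurrentLinkedQueue and returns null when empty,
            // so each index is handed to exactly one worker.
            while ((i = queue.poll()) != null) {
                // process record # i
                processed.incrementAndGet();
            }
        }
    }

    // Pre-fills the queue with record indexes and lets `threads` workers drain it.
    static int processAll(int totalRecords, int threads) throws InterruptedException {
        Queue<Integer> queue = new ConcurrentLinkedQueue<Integer>();
        for (int i = 0; i < totalRecords; i++) {
            queue.add(i);
        }

        AtomicInteger processed = new AtomicInteger(0);
        ExecutorService service = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            service.execute(new PollingTask(queue, processed));
        }
        service.shutdown();
        service.awaitTermination(5, TimeUnit.MINUTES);
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(processAll(200, 10));
    }
}
```

Terminating on an empty queue is only safe here because the queue is completely filled before any worker starts.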

Upvotes: 2

Jimmy T.

Reputation: 4190

The second approach is better because you don't synchronize all threads, but I would recommend that you process more than one record per Task2 instance.

Upvotes: 1

Bruno Franco

Reputation: 2027

Well, all I can see is that in approach 1 there is one more object (Counter) to be handled by the JVM, and inside Counter there is a synchronized block, which always costs the thread scheduler more because it has to manage access to the variable count:

 public static int getCount() {
        synchronized (lock) {
            return ++count;
        }
    }

So, it seems approach 2 is a little better if we talk about processing and performance.
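For comparison, the synchronized block in approach 1 could be replaced by a java.util.concurrent.atomic.AtomicInteger, as another answer in this thread suggests. The following is only a sketch of that idea; the class name AtomicCounterDemo, the processAll method and the processed tally are illustrative and do not appear in the question.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterDemo {

    // Runs `threads` workers that claim record indexes from a shared atomic counter.
    // Returns how many records were processed in total.
    static int processAll(final int totalRecords, int threads) throws InterruptedException {
        final AtomicInteger next = new AtomicInteger(0);      // next unclaimed index
        final AtomicInteger processed = new AtomicInteger(0); // tally, only for checking

        ExecutorService service = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            service.execute(new Runnable() {
                @Override
                public void run() {
                    int i;
                    // getAndIncrement() is atomic, so each index is handed out exactly once
                    // without an explicit lock.
                    while ((i = next.getAndIncrement()) < totalRecords) {
                        // process record # i
                        processed.incrementAndGet();
                    }
                }
            });
        }
        service.shutdown();
        service.awaitTermination(5, TimeUnit.MINUTES);
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(processAll(200, 10));
    }
}
```

If the real record count can exceed Integer.MAX_VALUE (roughly 2.1 billion), an AtomicLong would be needed in place of the AtomicInteger.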

Upvotes: 1
