Reputation: 469
I would like to know if there is any difference between these two approaches to performing a task using multi-threading in Java 7:
Brief requirement - I have to process 200 records, simultaneously, and each record should be processed once only.
Approach 1:
public class Counter {
    private static int count = -1;
    private static final Object lock = new Object();

    // Hands out 0, 1, 2, ...; the lock ensures each value is returned only once.
    public static int getCount() {
        synchronized (lock) {
            return ++count;
        }
    }
}

public class Task implements Runnable {
    @Override
    public void run() {
        int count = 0;
        // Keep claiming record numbers until all 200 have been handed out.
        while ((count = Counter.getCount()) < 200) {
            System.out.println(Thread.currentThread().getName() + " " + count);
            // process record # count
        }
    }
}

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class Tester {
    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        ExecutorService service = Executors.newFixedThreadPool(10);
        // 10 long-running tasks, one per pool thread; each pulls work via Counter.
        for (int i = 0; i < 10; i++) {
            service.execute(new Task());
        }
        service.shutdown();
        service.awaitTermination(5, TimeUnit.MINUTES);
        long end = System.currentTimeMillis();
        System.out.println(end - start);
    }
}
Approach 2:
public class Task2 implements Runnable {
    private int i = 0;

    public Task2(int i) {
        super();
        this.i = i;
    }

    @Override
    public void run() {
        System.out.println(Thread.currentThread().getName() + " " + i);
        // process record # i
    }
}

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class Tester2 {
    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        ExecutorService service = Executors.newFixedThreadPool(10);
        // One short-lived task object per record; the pool's internal queue holds the pending ones.
        for (int i = 0; i < 200; i++) {
            service.execute(new Task2(i));
        }
        service.shutdown();
        service.awaitTermination(5, TimeUnit.MINUTES);
        long end = System.currentTimeMillis();
        System.out.println(end - start);
    }
}
Both run fine, produce the same output, and take almost the same time. However, the number 200 used in the example code will be in the billions in the actual scenario. Please suggest which one is better in terms of memory, CPU, etc.
Thanks.
Edit - adding approach 3 based on Jimmy T.'s suggestion.
Approach 3:
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class Tester3 {
    public static void main(String[] args) throws InterruptedException {
        int processedRecords = 0;
        int totalRecords = 200;
        int recordsPerThread = 10;
        boolean continueProcess = true;
        ExecutorService service = Executors.newFixedThreadPool(10);
        while (continueProcess) {
            // Each task gets an inclusive index range [startIndex, endIndex].
            int startIndex = processedRecords;
            int endIndex = startIndex + recordsPerThread - 1;
            if (endIndex >= totalRecords - 1) {
                // Last chunk: clamp to the final record and stop submitting.
                endIndex = totalRecords - 1;
                continueProcess = false;
            }
            processedRecords = processedRecords + recordsPerThread;
            service.submit(new Task3(startIndex, endIndex));
        }
        service.shutdown();
        service.awaitTermination(5, TimeUnit.MINUTES);
    }
}

public class Task3 implements Runnable {
    private int startIndex = 0;
    private int endIndex = 0;

    public Task3(int startIndex, int endIndex) {
        super();
        this.startIndex = startIndex;
        this.endIndex = endIndex;
    }

    @Override
    public void run() {
        System.out.println("processing records from " + startIndex + " to " + endIndex);
    }
}
Upvotes: 2
Views: 548
Reputation: 46372
Brief requirement - I have to process 200 records, simultaneously, and each record should be processed once only.
However, the number 200 used in the example code will be in the billions in the actual scenario.
As always, the answer is "it depends". Unless your tasks are extremely fast, the overhead of either synchronization or object creation shouldn't matter. Note that an ExecutorService has to do some synchronization internally, too.
You could avoid the synchronization and simplify your code by using an AtomicInteger counter. But I'd most probably go for a Queue and a couple of Runnables polling it, as it's simple and gives you a lot of flexibility:
- You could use a PriorityQueue if you wanted to prioritize some records.
- You could pre-fill your queue or use a BlockingQueue filled by a producer.
- The content of the queue could be the record indexes or (probably better) the records themselves.
Every task polls the queue and terminates when it's empty.
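import java.util.Queue;

public class Task3 implements Runnable {
    // Shared thread-safe queue of record indexes; the original snippet left its declaration out.
    private final Queue<Integer> queue;

    public Task3(Queue<Integer> queue) {
        this.queue = queue;
    }

    @Override
    public void run() {
        while (true) {
            final Integer i = queue.poll();
            if (i == null) break; // queue drained, so this worker is done
            System.out.println(Thread.currentThread().getName() + " " + i);
            // process record # i
        }
    }
}
You create just a limited number of them to keep all your cores busy.
Queue<Integer> queue = new ConcurrentLinkedQueue<Integer>(); // assumed queue type, pre-filled below
for (int i = 0; i < 200; i++) queue.add(i);
ExecutorService service = Executors.newFixedThreadPool(10);
for (int i = 0; i < 10; i++) service.execute(new Task3(queue));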
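Note that terminating on an empty queue only works when the queue is pre-filled. If a producer fills a BlockingQueue while the workers are running, an empty queue may just mean the producer is behind. One common way to handle that is a sentinel ("poison pill") per worker. A minimal sketch, assuming Integer record indexes; the class name ProducerConsumerSketch, the DONE sentinel, the run helper and the queue capacity of 1000 are all made up for illustration:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class ProducerConsumerSketch {
    // Hypothetical sentinel telling a worker to stop; -1 is never a real record index here.
    static final Integer DONE = Integer.valueOf(-1);

    /** Feeds indexes 0..totalRecords-1 to the workers and returns the indexes they processed. */
    static List<Integer> run(int totalRecords, int workers) throws InterruptedException {
        // Bounded queue (capacity is an arbitrary assumption): put() blocks when it is full.
        final BlockingQueue<Integer> queue = new LinkedBlockingQueue<Integer>(1000);
        final List<Integer> processed = Collections.synchronizedList(new ArrayList<Integer>());

        ExecutorService service = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            service.execute(new Runnable() {
                @Override
                public void run() {
                    try {
                        while (true) {
                            Integer i = queue.take(); // blocks until an element is available
                            if (DONE.equals(i)) break; // sentinel seen: this worker stops
                            processed.add(i); // stand-in for "process record # i"
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt(); // restore the flag and exit
                    }
                }
            });
        }

        // The calling thread acts as the producer in this sketch.
        for (int i = 0; i < totalRecords; i++) {
            queue.put(i);
        }
        // One sentinel per worker, since each worker consumes exactly one before stopping.
        for (int w = 0; w < workers; w++) {
            queue.put(DONE);
        }

        service.shutdown();
        service.awaitTermination(5, TimeUnit.MINUTES);
        return processed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(200, 10).size());
    }
}
```

The bounded queue also keeps memory use flat when the record count is in the billions, because the producer blocks in put() whenever the workers fall behind.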
Upvotes: 2
Reputation: 4190
The second approach is better because you don't synchronize all threads, but I would recommend that you process more than one record per Task2 instance.
Upvotes: 1
Reputation: 2027
Well, all I can see is that in approach 1 there is one more object (Counter) for the JVM to handle, and inside Counter there is a synchronized block, which always costs the thread scheduler more because it has to manage access to the variable count:
public static int getCount() {
    synchronized (lock) {
        return ++count;
    }
}
So, it seems approach 2 is a little better if we talk about processing and performance.
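If the cost of the synchronized block is the worry, the lock in Counter could be swapped for an atomic counter, as suggested in another answer. A minimal sketch, not a measured comparison; the class name AtomicCounter is hypothetical, and AtomicLong is used instead of AtomicInteger only on the assumption that billions of records could exceed the int range:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical lock-free stand-in for the question's Counter class.
class AtomicCounter {
    // Starts at -1 so that the first call returns 0, like the original Counter.
    private static final AtomicLong count = new AtomicLong(-1);

    // Atomically increments and returns the new value; no explicit lock object is needed.
    static long getCount() {
        return count.incrementAndGet();
    }

    public static void main(String[] args) {
        System.out.println(getCount());
        System.out.println(getCount());
    }
}
```

Task would then loop on AtomicCounter.getCount() exactly as it does on Counter.getCount(), with the local variable widened to long.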
Upvotes: 1