Reputation: 167
I have a for loop that runs about 1 billion times. Each iteration involves many database queries and computations. The simplified pseudocode looks like this:
for(int i=0; i<1000000000; i++){
    query();
    if(...){
        compute();
    }
}
If I can set up and run multiple threads in parallel, so each iterates millions of times, that would significantly reduce the time.
Without some kind of parallel processing, it would take months to finish. Is it possible to reduce the time by implementing threads in this situation? I'm aware of the new streams features in Java 8, but upgrading to Java 8 is not an option for me.
If there's an easy-to-follow guide somewhere, that would be great too! Thanks in advance.
edit: here's more detailed code. I'm potentially checking the database multiple times for each insertion, and I have to process the data before doing so. Ideally I want multiple threads to share the workload.
for(int i = 1; i<=100000000; i++){
    String pid = ns.findPId(i); //query
    Object g = findObject(pid); //query
    if(g!=null){
        if(g.getSomeProperty()!=null && g.getSomeProperty().matches(EL)){
            int isMatch = checkMatch(pid); //query
            if(isMatch == 0){
                String sampleId = findSampleId(pid); //query
                if(sampleId!=null){
                    Object temp = ns.findMoreProperties(sampleId); //query
                    if(temp!=null){
                        g.setSomeAttribute(temp.getSomeAttribute());
                        g.setSomeOtherProperty(temp.getSomeOtherProperty());
                        insertObject(g); //compute, encapsulate and insert into database table
                    }
                }
            }else{
                //log
            }
        }
    }
}
Upvotes: 0
Views: 803
Reputation: 4284
1) Evaluate and see if you need a ThreadPoolExecutor:
ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(10);
2) Write a Callable for the first part
public class FindObjectCallable implements Callable<Object> {
    ...
    @Override
    public Object call() throws Exception {
        String pid = ns.findPId(i); //query
        return findObject(pid); //query
    }
}
3) Main code to do the following:
ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(10);
List<Future<Object>> futures = new ArrayList<Future<Object>>(0);
for(int i = 1; i<=100000000; i++) {
    FindObjectCallable callable = new FindObjectCallable( ns, i );
    Future<Object> result = executor.submit(callable);
    futures.add(result);
}
for( Future<Object> future: futures )
{
    // call future.get() and do the g processing part here
    // (Java 7 has no lambdas, so use an anonymous class or a helper method)
}
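One thing to watch with the loop above is that it creates one Future per iteration, which means on the order of 100 million objects held in a list. A minimal sketch of an alternative, assuming the iterations are independent of each other, is to give each worker a contiguous range of i values so there is only one task per thread. The class name ChunkedLoop and the processOne method are hypothetical; processOne is a placeholder for the real queries and insert, and the code sticks to Java 7 syntax.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkedLoop {

    // Hypothetical stand-in for one iteration of the original loop body
    // (the queries plus insertObject). Returns true if it "inserted" a row.
    static boolean processOne(int i) {
        return i % 2 == 0; // placeholder logic only
    }

    // Splits the inclusive range [from, to] into at most `threads` contiguous
    // chunks, runs each chunk on the pool, and returns the total number of
    // iterations for which processOne returned true.
    static long run(int from, int to, int threads) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<Future<Long>>();
            long total = (long) to - from + 1;
            long chunk = (total + threads - 1) / threads; // ceiling division
            for (long start = from; start <= to; start += chunk) {
                final int lo = (int) start;
                final int hi = (int) Math.min(to, start + chunk - 1);
                futures.add(executor.submit(new Callable<Long>() {
                    @Override
                    public Long call() {
                        long inserted = 0;
                        for (int i = lo; i <= hi; i++) {
                            if (processOne(i)) {
                                inserted++;
                            }
                        }
                        return inserted;
                    }
                }));
            }
            long sum = 0;
            for (Future<Long> f : futures) {
                sum += f.get(); // blocks until that chunk has finished
            }
            return sum;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(1, 1000, 4));
    }
}
```

Whether this actually helps depends on how many concurrent connections the database can serve; the thread count here is just a parameter to tune, not a recommendation.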
Upvotes: 1
Reputation: 11
Seems like what you need is something like the Parallel.For that exists in C#. This post addresses that issue with an example of someone who implemented their own Parallel.For in Java: Parallel.For implemented with Java
I wouldn't use the example Dang Nguyen suggested, because it just spins up a lot of threads, and since there is no locking, there is no thread safety or proper concurrency control. There is a pretty big chance you would hit an exception thrown by the database when two threads try to write to the same field at the same time.
Even with a parallel for loop, I think you could still run into concurrency problems in the database, since two tasks running in parallel could still access the same database entity.
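As a rough illustration of one way to keep two threads from working on the same entity, here is a minimal sketch in which a thread has to "claim" a key before touching it. The ClaimedKeys class and tryClaim method are hypothetical names, and this only coordinates threads inside a single JVM; it is an assumption that the key (for example the pid) identifies the entity, and database-side transactions or unique constraints would still be needed as the real safeguard.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ClaimedKeys {

    // Thread-safe set backed by a ConcurrentHashMap (available before Java 8).
    private final Set<String> claimed =
            Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());

    // Returns true only for the first caller that claims a given key;
    // any later caller with the same key gets false and should skip it.
    boolean tryClaim(String key) {
        return claimed.add(key);
    }
}
```

A worker would call tryClaim(pid) before the insert and skip or log the item when it returns false. Note that the set grows with the number of distinct keys, so for very large runs partitioning the keys between threads may be the better fit.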
Upvotes: 0