How to use multi-threading to make my application faster

Question

I am iterating through a List of Strings with roughly 1,500 entries. In each iteration I iterate through another List of Strings, this time with roughly 35 million entries. The result is correct, but the application takes a long time (2+ hours) to produce it. How should I structure multi-threading to make my application faster?

The order of the result List is not important.

Should I divide the big List (35 million entries) into smaller blocks and iterate through them in parallel? (How can I determine the ideal number of blocks?)
Should I start a thread for each iteration over the small List? (This would create 1,500 threads, and I guess many of them would run in parallel.)

What are my other options?

Representation of the code:

List<String> result = new ArrayList<>();
for (Iterator<String> i = data1.iterator(); i.hasNext();) { // 1500 entries
  String val = i.next();
  for (Iterator<String> j = data2.iterator(); j.hasNext();) { // 35 million entries
    String test = j.next();
    if (val.equals(test)) {
      result.add(val);
      break; // stop scanning data2 after the first match
    }
  }
}
for (Iterator<String> h = result.iterator(); h.hasNext();) {
  String line = h.next();
  // write line to file
}

UPDATE

After restructuring my code and implementing the answer given by JB Nizet my application now runs a lot faster. It now only takes 20 seconds to get to the same result! Without multi-threading!

JB Nizet · Accepted Answer

You could use a parallel stream:

List<String> result =
    data1.parallelStream()
         .filter(data2::contains)
         .collect(Collectors.toList());
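For reference, here is a minimal self-contained sketch of the parallel-stream variant. The class name ParallelFilterSketch, the method name findMatches and the tiny sample lists are made up for illustration; only data1 and data2 come from the question. Note that each element of data1 still triggers a linear scan of data2, so this spreads the work over the available cores without reducing it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ParallelFilterSketch {

    // Keeps every value of data1 that also occurs somewhere in data2.
    // data2.contains() is a linear scan, so the total work is still
    // data1.size() * data2.size() comparisons in the worst case.
    static List<String> findMatches(List<String> data1, List<String> data2) {
        return data1.parallelStream()
                    .filter(data2::contains)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the real lists.
        List<String> data1 = Arrays.asList("a", "b", "c");
        List<String> data2 = Arrays.asList("c", "x", "a");
        System.out.println(findMatches(data1, data2)); // prints [a, c]
    }
}
```

The stream only reads data2 and never modifies it, so sharing the list between worker threads is safe here.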

But since you call contains() on data2 1500 times, and since contains() is O(N) for a list, transforming it to a HashSet first could make things much faster: contains() on a HashSet is O(1) on average. You might not even need multi-threading anymore:

Set<String> data2Set = new HashSet<>(data2);
List<String> result =
    data1.stream()
         .filter(data2Set::contains)
         .collect(Collectors.toList());
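A minimal end-to-end sketch of the HashSet approach, including the "write to file" step from the question, might look like the following. The class name HashSetLookupSketch, the helper names findMatches and writeMatches, the sample data and the temporary output file are all assumptions for illustration; the question does not say how or where the result is written.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class HashSetLookupSketch {

    // Builds the set once (a single pass over data2), then does one
    // hash lookup per element of data1 instead of one full scan.
    static List<String> findMatches(List<String> data1, List<String> data2) {
        Set<String> data2Set = new HashSet<>(data2);
        return data1.stream()
                    .filter(data2Set::contains)
                    .collect(Collectors.toList());
    }

    // Writes one match per line (UTF-8 by default).
    static void writeMatches(List<String> matches, Path out) throws IOException {
        Files.write(out, matches);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data standing in for the real lists.
        List<String> data1 = Arrays.asList("a", "b", "c");
        List<String> data2 = Arrays.asList("c", "x", "a", "a");

        List<String> result = findMatches(data1, data2);
        Path out = Files.createTempFile("matches", ".txt"); // assumed output location
        writeMatches(result, out);
        System.out.println(result + " written to " + out);
    }
}
```

The trade-off is memory: the set holds a reference to every distinct string in data2 plus hashing overhead, which for 35 million entries can require a noticeably larger heap than the list alone.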
