Reputation: 115
I have an ArrayList (a big one, read from a file) and I want to process its contents with multithreading: call a method repeatedly on each string and print the result to a file. I have given a working structure of what my code looks like. However, I am not able to code what I want without getting tangled in exceptions related to thread synchronization. I am new to the concept of threading and want an efficient way to do this. I have looked at other solutions related to threading and ArrayLists, but they haven't worked out for me. Any suggestions as to how to go about this are appreciated.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.util.ArrayList;
public class ThreadingWithMethod {

    public static void main(String[] args) throws FileNotFoundException, UnsupportedEncodingException {
        ArrayList<String> samples = readurls("path/to/sample.csv");
        PrintStream filewriter = new PrintStream(new File("path/to/result.csv"), "UTF-8");
        for (int i = 0; i < samples.size(); i++) {
            String string = samples.get(i);
            // Need info as to how to process with threading without clashing.
            // sampleProcessString needs to be called repeatedly,
            // i.e. sampleProcessString(filewriter, string), by 2-3 threads.
        }
    }
    public static void sampleProcessString(PrintStream filewriter, String string) {
        filewriter.println(processedString(string));
    }

    private static String processedString(String string) {
        // Intended to generate a new line by using an SQL query.
        // This method will be using a connection to a MySQL database based on the sample.
        return string + "++> done something";
    }
    public static ArrayList<String> readurls(String filename) {
        ArrayList<String> aslink = new ArrayList<String>();
        try (BufferedReader reader = new BufferedReader(new FileReader(filename))) {
            String line;
            while ((line = reader.readLine()) != null) {
                aslink.add(line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return aslink;
    }
}
Upvotes: 0
Views: 414
Reputation: 13195
I created some snippets into which you can try dropping your actual processing code.
My test data looks like this:
try (PrintWriter pw = new PrintWriter("testdata.txt")) {
    for (int i = 0; i < 1000000; i++)
        pw.println(i);
}
So: a text file with a million numbers, one per line.
My "task" was to create a file containing double the value of the same lines, disregarding their order:
pw.println(Integer.parseInt(line) * 2);
where line is a line from the input file, and pw is a PrintWriter for the output.
In actual code:
try (PrintWriter pw = new PrintWriter("testresult.txt");
        BufferedReader br = new BufferedReader(new FileReader("testdata.txt"))) {
    String line;
    while ((line = br.readLine()) != null)
        pw.println(Integer.parseInt(line) * 2);
}
This can be written more concisely, and perhaps a bit more readably, with streams:
try (PrintWriter pw = new PrintWriter("testresult.txt")) {
    Files.lines(Paths.get("testdata.txt")).forEach(
            line -> pw.println(Integer.parseInt(line) * 2));
}
The two snippets produce very similar execution times, around 1.6-1.7 seconds on my machine (measured with the "budget" approach: long start = System.currentTimeMillis(); before and System.out.println(System.currentTimeMillis() - start); after).
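Spelled out, that budget measurement is just:

long start = System.currentTimeMillis();
// ... run the snippet being measured ...
System.out.println(System.currentTimeMillis() - start);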
Then the stream can be parallelized with a single .parallel() inside:
try (PrintWriter pw = new PrintWriter("testresult.txt")) {
    Files.lines(Paths.get("testdata.txt")).parallel().forEach(
            line -> pw.println(Integer.parseInt(line) * 2));
}
This will produce mixed order results.
A side remark on println(int): it is not documented as such, but its actual implementation is thread-safe. However, if you want to be absolutely "safe" and build only on the documented behavior, you should synchronize yourself:
try (PrintWriter pw = new PrintWriter("testresult.txt")) {
    Files.lines(Paths.get("testdata.txt")).parallel().forEach(line -> {
        synchronized (pw) {
            pw.println(Integer.parseInt(line) * 2);
        }
    });
}
Both of them are actually slower than the sequential version (2 and 2.2 seconds respectively; the extra manual synchronization does matter), but of course it matters a lot that this processing step is very simple. So it is important to keep in mind that if file operations eat up the time in your case too, parallelism cannot really help with that.
And for comparison, a complete snippet using a thread pool:
ExecutorService es = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
ExecutorCompletionService<String> ecs = new ExecutorCompletionService<String>(es);
int counter = 0;
// Submit one task per line; the workers only calculate and stringify.
try (BufferedReader br = new BufferedReader(new FileReader("testdata.txt"))) {
    String line;
    while ((line = br.readLine()) != null) {
        final String current = line;
        ecs.submit(new Callable<String>() {
            @Override
            public String call() throws Exception {
                return Integer.toString(Integer.parseInt(current) * 2);
            }
        });
        counter++;
    }
}
// Drain the completion service in the main thread; all file writing happens here.
try (PrintWriter pw = new PrintWriter("testresult.txt")) {
    while (counter > 0) {
        pw.println(ecs.take().get());
        counter--;
    }
}
es.shutdown();
This one is the longest of them for sure; on the other hand it runs for 2 seconds, so it is comparable to the synchronized-less streaming sample, and it is "safe" without the synchronization because the file operations all happen in the main thread (the workers only calculate and stringify). Going for fully manual threads could make things even more verbose, but I don't feel motivated to write such code at the moment.
Upvotes: 0
Reputation: 109547
Reading a large file is fastest done sequentially, because of the physical disk access.
One might use memory mapped byte buffers.
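As a rough sketch of the memory-mapped variant (the file name is a placeholder, and splitting the bytes into lines is left out):

import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Map the whole file into memory and walk the bytes directly.
try (FileChannel ch = FileChannel.open(Paths.get("input.csv"),
        StandardOpenOption.READ)) {
    MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
    while (buf.hasRemaining()) {
        byte b = buf.get(); // consume raw bytes; line splitting is up to you
    }
}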
In your case (processing per line), Files.lines(Path) (default UTF-8) may suffice.
This is fine-grained concurrency. The processing may be done in parallel using a thread pool (ExecutorService, ThreadPoolExecutor), spreading the work over many threads.
You will get the results out of order. If that is a problem, pass a line number along in the Files.lines lambda.
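Files.lines itself does not carry indices, so one workaround (a sketch; process is a hypothetical stand-in for the real per-line work) is to read the lines first and parallelize over their indices, writing the results back in order:

import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.IntStream;

List<String> lines = Files.readAllLines(Paths.get("input.csv"));
String[] results = new String[lines.size()];
// Process in parallel, keyed by line number, so order can be restored.
IntStream.range(0, lines.size()).parallel()
        .forEach(i -> results[i] = process(lines.get(i))); // process(...) is hypothetical
try (PrintWriter pw = new PrintWriter("result.csv")) {
    for (String result : results)
        pw.println(result);
}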
For collecting the results, queueing them in memory, and asynchronously writing them to a file, one could look at whether there is a high-performance logger. Probably one would have to reimplement its functionality (to do away with log formatting). So: a queuing thread and a thread for writing to the file (with a large byte buffer).
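A minimal sketch of that queuing setup, assuming the worker threads call queue.put(...) with finished lines (the poison-pill end marker is an ad-hoc convention, not a library feature):

import java.io.PrintWriter;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

BlockingQueue<String> queue = new LinkedBlockingQueue<>();
final String POISON = "<eof>"; // hypothetical end marker
Thread writer = new Thread(() -> {
    try (PrintWriter pw = new PrintWriter("result.csv")) {
        String s;
        while (!(s = queue.take()).equals(POISON))
            pw.println(s); // only this thread touches the file
    } catch (Exception e) {
        e.printStackTrace();
    }
});
writer.start();
// ... worker threads call queue.put(processedLine) ...
queue.put(POISON); // after all workers are done, tell the writer to stop
writer.join();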
One might consider compressing the output (.csv.gz), which would be a space/time gain on further network transport.
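Compressing the output is mostly a matter of wrapping the stream; a sketch with java.util.zip (file name assumed):

import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Same PrintWriter usage as before, but the bytes land gzip-compressed.
try (PrintWriter pw = new PrintWriter(new OutputStreamWriter(
        new GZIPOutputStream(new FileOutputStream("result.csv.gz")),
        StandardCharsets.UTF_8))) {
    pw.println("some,processed,line");
}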
There are many ways of realizing this, so research the javadoc and its variations (FutureTask, for instance) and look into examples.
ThreadPoolExecutor executor = (ThreadPoolExecutor)
        Executors.newFixedThreadPool(10);
for (;;) { // loop over the work items, breaking when there are no more
    Task task = new Task(...);
    executor.execute(task);
}
executor.shutdown();                     // no new tasks accepted after this
while (!executor.isTerminated()) { ... } // wait for the submitted tasks to finish
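For instance, the FutureTask variation mentioned above could look roughly like this (the computation is a stand-in for the real per-line work):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.FutureTask;

ExecutorService executor = Executors.newFixedThreadPool(10);
FutureTask<String> task = new FutureTask<>(() -> "42" + "++> done something");
executor.execute(task);         // FutureTask is a Runnable, so execute() accepts it
System.out.println(task.get()); // blocks until the result is ready
executor.shutdown();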
Upvotes: 1