Touchdown

Reputation: 504

Java - multithreading on very small image taking a long time

I've been playing around with some multithreaded image manipulation code that reads in an image and converts it to grayscale 2 ways - sequentially, and then in parallel, so I can compare the difference between the two.

One thing I did was make an absolutely tiny image, only 4 x 4 px, of one solid colour. The sequential version usually runs in about 20 ms, and the (4-threaded) parallel version sometimes matches that, but at other times it seems to get "stuck" and takes unusually long, sometimes up to 1.5 seconds. This doesn't seem to happen(?) with fewer than 4 threads, so I was wondering what causes it to slow down so much. I have a few ideas, mainly that the overhead of setting up multiple threads just isn't worth it for very small images, but 1.5 seconds is a very long time to wait, more than any thread creation overhead should account for.

Here is the source code:

PixelsManipulation.java (main class):

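import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import javax.imageio.ImageIO;

public final class PixelsManipulation{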

private static Sequential sequentialGrayscaler = new Sequential();  

public static void main(String[] args) throws FileNotFoundException, IOException, InterruptedException {  

File file = new File("src/pixelsmanipulation/hiresimage.jpg");
FileInputStream fis = new FileInputStream(file);  
BufferedImage image = ImageIO.read(fis); //reading the image file  

int rows = 2; // 2 rows and 2 cols will split the image into quarters
int cols = 2;  
int chunks = rows * cols; // 4 chunks, one for each quarter of the image  
int chunkWidth = image.getWidth() / cols; // determines the chunk width and height  
int chunkHeight = image.getHeight() / rows;  
int count = 0;  
BufferedImage imgs[] = new BufferedImage[chunks]; // Array to hold image chunks  

for (int x = 0; x < rows; x++) {  
    for (int y = 0; y < cols; y++) {  
        //Initialize the image array with image chunks  
        imgs[count] = new BufferedImage(chunkWidth, chunkHeight, image.getType());  
        // draws the image chunk  

        Graphics2D gr = imgs[count++].createGraphics(); // Actually create an image for us to use
        gr.drawImage(image, 0, 0, chunkWidth, chunkHeight, chunkWidth * y, chunkHeight * x, chunkWidth * y + chunkWidth, chunkHeight * x + chunkHeight, null);  
        gr.dispose();

    }  
} 

//writing mini images into image files  
for (int i = 0; i < imgs.length; i++) {  
    ImageIO.write(imgs[i], "jpg", new File("img" + i + ".jpg"));  
}  
System.out.println("Mini images created");  

// Start threads with their respective quarters (chunks) of the image to work on
// I have a quad-core machine, so I can only use 4 threads on my CPU
Parallel parallelGrayscaler = new Parallel("thread-1", imgs[0]);
Parallel parallelGrayscaler2 = new Parallel("thread-2", imgs[1]);
Parallel parallelGrayscaler3 = new Parallel("thread-3", imgs[2]);
Parallel parallelGrayscaler4 = new Parallel("thread-4", imgs[3]);

// Sequential:
long startTime = System.currentTimeMillis();

sequentialGrayscaler.ConvertToGrayscale(image);

long stopTime = System.currentTimeMillis();
long elapsedTime = stopTime - startTime;
System.out.println("Sequential code executed in " + elapsedTime + " ms.");

// Multithreaded (parallel):
startTime = System.currentTimeMillis();

parallelGrayscaler.start();
parallelGrayscaler2.start();
parallelGrayscaler3.start();
parallelGrayscaler4.start();

// Main waits for threads to finish so that the program doesn't "end" (i.e. stop measuring time) before the threads finish
parallelGrayscaler.join();
parallelGrayscaler2.join();
parallelGrayscaler3.join();
parallelGrayscaler4.join();

stopTime = System.currentTimeMillis();
elapsedTime = stopTime - startTime;
System.out.println("Multithreaded (parallel) code executed in " + elapsedTime + " ms.");
}
}

Parallel.java:

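import java.awt.Color;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.imageio.ImageIO;

// Let each of the 4 threads work on a different quarter of the image
public class Parallel extends Thread{//implements Runnable{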

private String threadName;
private BufferedImage myImage; // Calling it "my" image because each thread will have its own unique quarter of the image to work on
private int width, height; // Image params

Parallel(String name, BufferedImage image){
threadName = name;
System.out.println("Creating "+ threadName);
myImage = image;
width = myImage.getWidth();
height = myImage.getHeight();

}

public void run(){
System.out.println("Running " + threadName);

// Pixel by pixel (for our quarter of the image)
for (int j = 0; j < height; j++){
    for (int i = 0; i < width; i++){

        // Traversing the image and converting the RGB values (doing the same thing as the sequential code but on a smaller scale)
        Color c = new Color(myImage.getRGB(i,j));

        int red = (int)(c.getRed() * 0.299);
        int green = (int)(c.getGreen() * 0.587);
        int blue  = (int)(c.getBlue() * 0.114);

        Color newColor = new Color(red + green + blue, red + green + blue, red + green + blue);

        myImage.setRGB(i,j,newColor.getRGB()); // Write the new value for that pixel


    }
}

File output = new File("src/pixelsmanipulation/"+threadName+"grayscale.jpg"); // Put it in a "lower level" folder so we can see it in the project view
try {
    ImageIO.write(myImage, "jpg", output); // write this thread's own chunk ("newImage" was never declared)
} catch (IOException ex) {
    Logger.getLogger(Parallel.class.getName()).log(Level.SEVERE, null, ex);
}
System.out.println("Thread " + threadName + " exiting. ---");
}
}

EDIT: Here is an example log from an execution:

Creating thread-1
Creating thread-2
Creating thread-3
Creating thread-4
Sequential code executed in 5 ms.
Running thread-2
Running thread-1
Running thread-3
Thread thread-1 exiting. ---
Thread thread-2 exiting. ---
Thread thread-3 exiting. ---
Running thread-4
Thread thread-4 exiting. ---
Multithreaded (parallel) code executed in 5 ms.

Weirdly, I can't seem to replicate the delay now that I'm on a different machine from the one I originally worked on. Could it be a difference in the processor somehow (both are quad-core)? I'll try to get a log from the original machine.

EDIT 2: As Gee Bee said, it's most likely because I'm writing to file inside the threads, which is generally slower on a HDD than on an SSD; the slowness only seems to happen on a HDD. Taking out the file-writing code makes the threads run much faster, as does simply running it on an SSD (although I guess writing to file inside threads isn't really optimal anyway and should be avoided).

Upvotes: 1

Views: 1504

Answers (1)

Gee Bee

Reputation: 1794

The problem is quite tricky, and the 1.5 s delay very likely involves a locking issue.

After running your code:

  • sequential: 150 ms
  • parallel: 57 ms (2x2, 4 threads)

Now each processing thread does a lot of things:

  • accesses the image at pixel level (a quite resource-intensive operation, for various reasons)
  • writes a file
  • performs jpeg compression

I suggest isolating the file writing and JPEG encoding from the actual processing and redoing your measurements.
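One way to separate the two costs, as a minimal sketch: time the pixel loop on its own, then time the JPEG encoding into an in-memory buffer so that the disk is not involved at all. The class name `SplitTiming`, the helper `toGrayscale` and the solid-colour 4 x 4 test image are assumptions made for illustration; the luminance weights are the ones from the question.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

public class SplitTiming {

    // Hypothetical helper: same weights as the question's loop,
    // but working on packed ints instead of allocating Color objects.
    static void toGrayscale(BufferedImage img) {
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int rgb = img.getRGB(x, y);
                int red = (int) (((rgb >> 16) & 0xFF) * 0.299);
                int green = (int) (((rgb >> 8) & 0xFF) * 0.587);
                int blue = (int) ((rgb & 0xFF) * 0.114);
                int gray = red + green + blue;
                img.setRGB(x, y, (0xFF << 24) | (gray << 16) | (gray << 8) | gray);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed stand-in for the tiny solid-colour image from the question.
        BufferedImage img = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 4; y++) {
            for (int x = 0; x < 4; x++) {
                img.setRGB(x, y, 0x3366CC);
            }
        }

        // Keep ImageIO from using a temporary cache file on disk.
        ImageIO.setUseCache(false);

        long t0 = System.nanoTime();
        toGrayscale(img);                      // pure computation
        long t1 = System.nanoTime();

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        ImageIO.write(img, "jpg", buffer);     // JPEG encoding only, no file I/O
        long t2 = System.nanoTime();

        System.out.println("Grayscale: " + (t1 - t0) / 1_000 + " us");
        System.out.println("JPEG encode: " + (t2 - t1) / 1_000 + " us");
    }
}
```

If the encoding time dwarfs the grayscale time here, then the parallel measurement in the question is mostly measuring `ImageIO.write`, not the pixel loop.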

With 4 threads, you now get four JPEG encodings and four file writes running in parallel, which can cause problems. I am on an SSD, so the file writing makes no difference, but on a HDD it can have an impact.

Note that using more threads than physical cores does not make the parallel operations faster; it just adds overhead. Also note that if your picture is too small, the 'parallel' threads do not actually run in parallel: the first thread has already completed while you are just starting thread 3.
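A rough sketch of keeping the thread count at or below the number of cores, using a fixed-size pool rather than hand-started `Thread` objects. The names `PooledGrayscale`, `grayscaleAll` and `toGrayscale` are hypothetical. Also note that `Runtime.availableProcessors()` reports logical processors, which can be more than the physical core count on machines with hyper-threading.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PooledGrayscale {

    // Hypothetical helper using the same weights as the question's code.
    static void toGrayscale(BufferedImage img) {
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int rgb = img.getRGB(x, y);
                int gray = (int) (((rgb >> 16) & 0xFF) * 0.299)
                         + (int) (((rgb >> 8) & 0xFF) * 0.587)
                         + (int) ((rgb & 0xFF) * 0.114);
                img.setRGB(x, y, (0xFF << 24) | (gray << 16) | (gray << 8) | gray);
            }
        }
    }

    // Each chunk is a separate BufferedImage, so no two tasks share an image.
    static void grayscaleAll(List<BufferedImage> chunks) {
        if (chunks.isEmpty()) {
            return;
        }
        // Never more workers than chunks or than (logical) processors.
        int threads = Math.min(chunks.size(), Runtime.getRuntime().availableProcessors());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<Void>> tasks = new ArrayList<>();
            for (BufferedImage chunk : chunks) {
                tasks.add(CompletableFuture.runAsync(() -> toGrayscale(chunk), pool));
            }
            // Block until every chunk has been processed.
            CompletableFuture.allOf(tasks.toArray(new CompletableFuture[0])).join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<BufferedImage> chunks = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            chunks.add(new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB));
        }
        grayscaleAll(chunks);
        System.out.println("Processed " + chunks.size() + " chunks");
    }
}
```

For a 4 x 4 image this will still not beat the sequential loop, since submitting the tasks costs more than the work itself. The sketch only shows how to cap the worker count.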

Although AWT imposes a lock on a BufferedImage (see "Are parallel drawing operations possible with Java Graphics2d?"), this does not affect your performance, since you are using four different BufferedImages from four different threads.

So, your idea will work. However, the performance improvement from 4 threads is too small if the calculation is fast. Try not to measure operations you have no control over (file I/O performance, for example, can be anything depending on your hardware and current virtual memory conditions).
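For the part you do control, a single `currentTimeMillis()` pair around one run is also noisy, because the first execution includes class loading and JIT compilation. A minimal sketch of a steadier measurement, assuming a hypothetical `MedianTimer.medianNanos` helper and a stand-in workload (filling a 16-element array) in place of the real grayscale call:

```java
import java.util.Arrays;

public class MedianTimer {

    // Runs the task a few times untimed (warm-up), then returns the
    // median of the timed runs in nanoseconds.
    static long medianNanos(Runnable task, int warmups, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmups; i++) {
            task.run();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    public static void main(String[] args) {
        // Stand-in workload: replace with the grayscale conversion under test.
        int[] pixels = new int[4 * 4];
        long median = medianNanos(() -> Arrays.fill(pixels, 0x808080), 1_000, 101);
        System.out.println("Median: " + median + " ns");
    }
}
```

The median is used rather than the mean so that one "stuck" run, like the 1.5 second outlier in the question, shows up as an outlier instead of skewing the whole result.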

Upvotes: 1
