user622505

Reputation: 773

Parallel class does not provide any speed up

I'm trying to create a method that filters pixels by a given grayscale threshold (pixels outside the threshold range become black, pixels inside it become white). The method works, but it is not as fast as I feel it could be.

I decided to use the Parallel class, but no matter what I set MaxDegreeOfParallelism to, I don't get any speed benefit. I perform some other operations on the bitmap too, and the total time of all operations is always around 170 ms, regardless of MaxDegreeOfParallelism. When debugging, the filtering itself takes around 160 ms, so I would expect a noticeable overall difference.

I'm using an i7 processor, 4 physical cores, 8 logical cores.

The code:

Color black = System.Drawing.Color.FromArgb(0, 0, 0);
Color white = System.Drawing.Color.FromArgb(255, 255, 255);

int lowerBound = (int)((float)lowerBoundPercent * 255.0 / 100.0);
int upperBound = (int)((float)upperBoundPercent * 255.0 / 100.0);

int[][] border = new int[8][];
for (int i=0;i<8;i++)
{
    border[i] = new int[] { i*height/8, (i+1)*height/8-1};
}

Parallel.For(0, 8, new ParallelOptions { MaxDegreeOfParallelism = 8 }, i =>
    {
        for (int k = 0; k < width; k++)
        {
            for (int j = border[i][0]; j <= border[i][1]; j++)
            {
                Color pixelColor;
                int grayscaleValue;
                pixelColor = color[k][j];
                grayscaleValue = (pixelColor.R + pixelColor.G + pixelColor.B) / 3;
                if (grayscaleValue >= lowerBound && grayscaleValue <= upperBound)
                    color[k][j] = white;
                else
                    color[k][j] = black;
            }
        }
    });

color[][] is a jagged array of System.Drawing.Color.

The question: is this normal? If not, what can I do to change it?

EDIT:

Pixel extraction:

Color[][] color;
color = new Color[bitmap.Width][];
for (int i = 0; i < bitmap.Width; i++)
{
    color[i] = new Color[bitmap.Height];
    for (int j = 0; j < bitmap.Height; j++)
    {
        color[i][j] = bitmap.GetOriginalPixel(i, j);
    }
}

Bitmap is an instance of my own class Bitmap:

public class Bitmap
{
    System.Drawing.Bitmap processed;
    //...
    public Color GetOriginalPixel(int x, int y) { return processed.GetPixel(x, y); }
    //...
}

Upvotes: 2

Views: 244

Answers (2)

user622505

Reputation: 773

Using LockBits I managed to cut the time from ~165 ms to ~55 ms per frame. Then I proceeded to do some more research and combined LockBits with pointer operations in an unsafe context and the Parallel.For loop. The resulting code:

Bitmap class:

public class Bitmap
{
    System.Drawing.Bitmap processed;
    public System.Drawing.Bitmap Processed { get { return processed; } set { processed = value; } }
    // ...
}    

The method:

int lowerBound = 3*(int)((float)lowerBoundPercent * 255.0 / 100.0);
int upperBound = 3*(int)((float)upperBoundPercent * 255.0 / 100.0);

System.Drawing.Bitmap bp = bitmap.Processed;

int width = bitmap.Width;
int height = bitmap.Height;

Rectangle rect = new Rectangle(0, 0, width, height);
// Note: the loop below assumes 4 bytes per pixel (a 32bpp format such as Format32bppArgb).
System.Drawing.Imaging.BitmapData bpData = bp.LockBits(rect, System.Drawing.Imaging.ImageLockMode.ReadWrite, bp.PixelFormat);

unsafe
{
    byte* s0 = (byte*)bpData.Scan0.ToPointer();
    int stride = bpData.Stride;

    Parallel.For(0, height, y1 =>
    {
        int posY = y1 * stride;
        byte* cpp = s0 + posY;

        for (int x = 0; x < width; x++)
        {
            int total = cpp[0] + cpp[1] + cpp[2];
            if (total >= lowerBound && total <= upperBound)
            {
                cpp[0] = 255;
                cpp[1] = 255;
                cpp[2] = 255;
                cpp[3] = 255;
            }
            else
            {
                cpp[0] = 0;
                cpp[1] = 0;
                cpp[2] = 0;
                cpp[3] = 255;
            }

            cpp += 4;
        }
    });
}

bp.UnlockBits(bpData);

With this kind of work division in the Parallel.For loop the code executes in 1-5 ms, which means approximately a 70x speed up!

I tried making the chunks for the loop 4x and 8x bigger, and the time range is still 1-5 ms, so I won't pursue that. The loop is fast enough anyway.
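For anyone who wants to repeat the chunk-size experiment, here is a minimal sketch of one way to hand each parallel iteration a block of rows instead of a single row. It is not the code I measured above: it works on a managed byte[] copy of the pixels (so it needs no unsafe context), and the names ThresholdSketch, Apply, and rowsPerChunk are made up for illustration. It assumes a 32bpp buffer in B, G, R, A byte order, with bounds expressed as a sum of the three channels, and height > 0.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ThresholdSketch
{
    // Hypothetical helper: thresholds a 32bpp pixel buffer in place.
    // pixels holds height * stride bytes; lowerSum/upperSum bound B+G+R (0..765).
    // Assumes height > 0 and rowsPerChunk > 0 (Partitioner.Create rejects empty ranges).
    public static void Apply(byte[] pixels, int width, int height, int stride,
                             int lowerSum, int upperSum, int rowsPerChunk)
    {
        // Split [0, height) into ranges of rowsPerChunk rows each.
        var ranges = Partitioner.Create(0, height, rowsPerChunk);

        Parallel.ForEach(ranges, range =>
        {
            // range.Item1 is the first row (inclusive), range.Item2 the last (exclusive).
            for (int y = range.Item1; y < range.Item2; y++)
            {
                int p = y * stride;
                for (int x = 0; x < width; x++, p += 4)
                {
                    int total = pixels[p] + pixels[p + 1] + pixels[p + 2];
                    byte v = (total >= lowerSum && total <= upperSum) ? (byte)255 : (byte)0;
                    pixels[p] = v;        // blue
                    pixels[p + 1] = v;    // green
                    pixels[p + 2] = v;    // red
                    pixels[p + 3] = 255;  // alpha, fully opaque
                }
            }
        });
    }
}
```

Setting rowsPerChunk to 1 behaves like the per-row loop above; larger values give each iteration more work, which is the knob to turn if the per-row version ever turns out to be too fine-grained.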

Thank you very much for your answer, Scott, and thanks everyone for input in the comments.

Upvotes: 3

Scott Chamberlain

Reputation: 127563

To answer your main question about why your parallel method is not any faster: Parallel.For starts out with only one thread, then adds more threads as it detects that more threads may be beneficial in speeding up the work. Note that the parallel option is MaxDegreeOfParallelism, not just DegreeOfParallelism. Quite simply, there are not enough iterations of the loop for it to spin up enough threads to be effective; you need more iterations, each with less work to do.

Try giving the parallel operation more iterations to work with by looping over the width instead of over 8 chunks of the height.

Color black = System.Drawing.Color.FromArgb(0, 0, 0);
Color white = System.Drawing.Color.FromArgb(255, 255, 255);

int lowerBound = (int)((float)lowerBoundPercent * 255.0 / 100.0) * 3;
int upperBound = (int)((float)upperBoundPercent * 255.0 / 100.0) * 3;

Parallel.For(0, width, k =>
    {
        for (int j = 0; j < height; j++)
        {
            Color pixelColor;
            int grayscaleValue;
            pixelColor = color[k][j];
            grayscaleValue = (pixelColor.R + pixelColor.G + pixelColor.B);
            if (grayscaleValue >= lowerBound && grayscaleValue <= upperBound)
                color[k][j] = white;
            else
                color[k][j] = black;
        }
    });

I would not do both width and height in parallel; you will then likely run into the opposite problem of not giving each iteration enough work to do.

I highly recommend you download and read Patterns for Parallel Programming. It goes into this exact example when discussing how much work you should give a Parallel.For. Look at the "Very Small Loop Bodies" and "Too fine-grained, Too coarse-grained" anti-patterns starting at the bottom of page 26 of the C# version to see the exact problems you are running into.

Also, I would look into using LockBits for reading the pixel data in and out instead of GetPixel and SetPixel, like we discussed in the comments.

Upvotes: 3
