Reputation: 1032

C# - Faster Alternatives to SetPixel and GetPixel for Bitmaps for Windows Forms App

I am trying to teach myself C# and have heard from a variety of sources that the functions get and setpixel can be horribly slow. What are some of the alternatives and is the performance improvement really that significant?

A chunk of my code for reference:

public static Bitmap Paint(Bitmap _b, Color f)
{
  Bitmap b = new Bitmap(_b);
  for (int x = 0; x < b.Width; x++) 
  {
    for (int y = 0; y < b.Height; y++) 
    {
      Color c = b.GetPixel(x, y);
      b.SetPixel(x, y, Color.FromArgb(c.A, f.R, f.G, f.B));
    }
  }
  return b;
}

Upvotes: 66

Answers (5)

jdmneon

Reputation: 474

This code should be parallelized, there is a massive performance gain being missed by running this synchronously. Almost no modern Microchip will have less than 4 threads available and some chips will have 40 threads available.

There is absolutely no reason to run that first loop synchronously. You can go through either the width or the length using many, many threads.

        private void TakeApart_Fast(Bitmap processedBitmap)
        {
            
            BitmapData bitmapData = processedBitmap.LockBits(new Rectangle(0, 0, processedBitmap.Width, processedBitmap.Height), ImageLockMode.ReadWrite, PixelFormat.Format24bppRgb);
            ConcurrentBag<byte> points = new ConcurrentBag<byte>();   
            unsafe
            {
                int bytesPerPixel = System.Drawing.Bitmap.GetPixelFormatSize(processedBitmap.PixelFormat) / 8;
                int heightInPixels = bitmapData.Height;
                int widthInBytes = bitmapData.Width * bytesPerPixel;
                _RedMin = byte.MaxValue;
                _RedMax = byte.MinValue;
                byte* PtrFirstPixel = (byte*)bitmapData.Scan0;
              
                Parallel.For(0, heightInPixels, y =>
                {
                    byte* currentLine = PtrFirstPixel + (y * bitmapData.Stride);
                    for (int x = 0; x < widthInBytes; x = x + bytesPerPixel)
                    {
                        // red
                        byte redPixel = currentLine[x + 2];
                        //save information with the concurrentbag
    
                    }
                });
                processedBitmap.UnlockBits(bitmapData);
            }
        }`

a benchmark wouldn't mean much because the answer to how much this will speed up the proccess depends 100% on what hardware you are using, and what else is running in the background, it all depends on how many free threads are available. If your running this on a 4000 series graphics card with thousands of streaming proccessors you may be able to do iterate through every column of the image at the same time.

if your running it with and old quad core you may only have 5 or 6 threads which is still incredibly significant.

Upvotes: 1

user9015994

Reputation:

It's been some time, but I found an example that might be useful.

var btm = new Bitmap("image.png");

BitmapData btmDt = btm.LockBits(
    new Rectangle(0, 0, btm.Width, btm.Height),
    ImageLockMode.ReadWrite,
    btm.PixelFormat
);
IntPtr pointer = btmDt.Scan0;
int size = Math.Abs(btmDt.Stride) * btm.Height;
byte[] pixels = new byte[size];
Marshal.Copy(pointer, pixels, 0, size);
for (int b = 0; b < pixels.Length; b++)
{
    pixels[b] = 255; //Do something here 
}

Marshal.Copy(pixels, 0, pointer, size);
btm.UnlockBits(btmDt);

Upvotes: 9

A.Konzel

Reputation: 1980

The immediately usable code

public class DirectBitmap : IDisposable
{
    public Bitmap Bitmap { get; private set; }
    public Int32[] Bits { get; private set; }
    public bool Disposed { get; private set; }
    public int Height { get; private set; }
    public int Width { get; private set; }

    protected GCHandle BitsHandle { get; private set; }

    public DirectBitmap(int width, int height)
    {
        Width = width;
        Height = height;
        Bits = new Int32[width * height];
        BitsHandle = GCHandle.Alloc(Bits, GCHandleType.Pinned);
        Bitmap = new Bitmap(width, height, width * 4, PixelFormat.Format32bppPArgb, BitsHandle.AddrOfPinnedObject());
    }

    public void SetPixel(int x, int y, Color colour)
    {
        int index = x + (y * Width);
        int col = colour.ToArgb();

        Bits[index] = col;
    }

    public Color GetPixel(int x, int y)
    {
        int index = x + (y * Width);
        int col = Bits[index];
        Color result = Color.FromArgb(col);

        return result;
    }

    public void Dispose()
    {
        if (Disposed) return;
        Disposed = true;
        Bitmap.Dispose();
        BitsHandle.Free();
    }
}

There's no need for LockBits or SetPixel. Use the above class for direct access to bitmap data.

With this class, it is possible to set raw bitmap data as 32-bit data. Notice that it is PARGB, which is premultiplied alpha. See Alpha Compositing on Wikipedia for more information on how this works and examples on the MSDN article for BLENDFUNCTION to find out how to calculate the alpha properly.

If premultiplication might overcomplicate things, use PixelFormat.Format32bppArgb instead. A performance hit occurs when it's drawn, because it's internally being converted to PixelFormat.Format32bppPArgb. If the image doesn't have to change prior to being drawn, the work can be done before premultiplication, drawn to a PixelFormat.Format32bppArgb buffer, and further used from there.

Access to standard Bitmap members is exposed via the Bitmap property. Bitmap data is directly accessed using the Bits property.

Using `byte` instead of `int` for raw pixel data

Change both instances of Int32 to byte, and then change this line:

Bits = new Int32[width * height];

To this:

Bits = new byte[width * height * 4];

When bytes are used, the format is Alpha/Red/Green/Blue in that order. Each pixel takes 4 bytes of data, one for each channel. The GetPixel and SetPixel functions will need to be reworked accordingly or removed.

Benefits to using the above class

Memory allocation for merely manipulating the data is unnecessary; changes made to the raw data are immediately applied to the bitmap.
There are no additional objects to manage. This implements IDisposable just like Bitmap.
It does not require an unsafe block.

Considerations

Pinned memory cannot be moved. It's a required side effect in order for this kind of memory access to work. This reduces the efficiency of the garbage collector (MSDN Article). Do it only with bitmaps where performance is required, and be sure to Dispose them when you're done so the memory can be unpinned.

Access via the `Graphics` object

Because the Bitmap property is actually a .NET Bitmap object, it's straightforward to perform operations using the Graphics class.

var dbm = new DirectBitmap(200, 200);
using (var g = Graphics.FromImage(dbm.Bitmap))
{
    g.DrawRectangle(Pens.Black, new Rectangle(50, 50, 100, 100));
}

Performance comparison

The question asks about performance, so here's a table that should show the relative performance between the three different methods proposed in the answers. This was done using a .NET Standard 2 based application and NUnit.

* Time to fill the entire bitmap with red pixels *
- Not including the time to create and dispose the bitmap
- Best out of 100 runs taken
- Lower is better
- Time is measured in Stopwatch ticks to emphasize magnitude rather than actual time elapsed
- Tests were performed on an Intel Core i7-4790 based workstation

              Bitmap size
Method        4x4   16x16   64x64   256x256   1024x1024   4096x4096
DirectBitmap  <1    2       28      668       8219        178639
LockBits      2     3       33      670       9612        197115
SetPixel      45    371     5920    97477     1563171     25811013

* Test details *

- LockBits test: Bitmap.LockBits is only called once and the benchmark
                 includes Bitmap.UnlockBits. It is expected that this
                 is the absolute best case, adding more lock/unlock calls
                 will increase the time required to complete the operation.

Upvotes: 141

turgay

Reputation: 458

You can use Bitmap.LockBits method. Also if you want to use parallel task execution, you can use the Parallel class in System.Threading.Tasks namespace. Following links have some samples and explanations.

Upvotes: 1

Bort

Reputation: 835

The reason bitmap operations are so slow in C# is due to locking and unlocking. Every operation will perform a lock on the required bits, manipulate the bits, and then unlock the bits.

You can vastly improve the speed by handling the operations yourself. See the following example.

using (var tile = new Bitmap(tilePart.Width, tilePart.Height))
{
  try
  {
      BitmapData srcData = sourceImage.LockBits(tilePart, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);
      BitmapData dstData = tile.LockBits(new Rectangle(0, 0, tile.Width, tile.Height), ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);

      unsafe
      {
          byte* dstPointer = (byte*)dstData.Scan0;
          byte* srcPointer = (byte*)srcData.Scan0;

          for (int i = 0; i < tilePart.Height; i++)
          {
              for (int j = 0; j < tilePart.Width; j++)
              {
                  dstPointer[0] = srcPointer[0]; // Blue
                  dstPointer[1] = srcPointer[1]; // Green
                  dstPointer[2] = srcPointer[2]; // Red
                  dstPointer[3] = srcPointer[3]; // Alpha

                  srcPointer += BytesPerPixel;
                  dstPointer += BytesPerPixel;
              }
              srcPointer += srcStrideOffset + srcTileOffset;
              dstPointer += dstStrideOffset;
          }
      }

      tile.UnlockBits(dstData);
      aSourceImage.UnlockBits(srcData);

      tile.Save(path);
  }
  catch (InvalidOperationException e)
  {

  }
}

Upvotes: 20