Reputation: 1032
I am trying to teach myself C# and have heard from a variety of sources that the functions get and setpixel can be horribly slow. What are some of the alternatives and is the performance improvement really that significant?
A chunk of my code for reference:
public static Bitmap Paint(Bitmap _b, Color f)
{
Bitmap b = new Bitmap(_b);
for (int x = 0; x < b.Width; x++)
{
for (int y = 0; y < b.Height; y++)
{
Color c = b.GetPixel(x, y);
b.SetPixel(x, y, Color.FromArgb(c.A, f.R, f.G, f.B));
}
}
return b;
}
Upvotes: 66
Views: 80382
Reputation: 474
This code should be parallelized, there is a massive performance gain being missed by running this synchronously. Almost no modern Microchip will have less than 4 threads available and some chips will have 40 threads available.
There is absolutely no reason to run that first loop synchronously. You can go through either the width or the length using many, many threads.
private void TakeApart_Fast(Bitmap processedBitmap)
{
BitmapData bitmapData = processedBitmap.LockBits(new Rectangle(0, 0, processedBitmap.Width, processedBitmap.Height), ImageLockMode.ReadWrite, PixelFormat.Format24bppRgb);
ConcurrentBag<byte> points = new ConcurrentBag<byte>();
unsafe
{
int bytesPerPixel = System.Drawing.Bitmap.GetPixelFormatSize(processedBitmap.PixelFormat) / 8;
int heightInPixels = bitmapData.Height;
int widthInBytes = bitmapData.Width * bytesPerPixel;
_RedMin = byte.MaxValue;
_RedMax = byte.MinValue;
byte* PtrFirstPixel = (byte*)bitmapData.Scan0;
Parallel.For(0, heightInPixels, y =>
{
byte* currentLine = PtrFirstPixel + (y * bitmapData.Stride);
for (int x = 0; x < widthInBytes; x = x + bytesPerPixel)
{
// red
byte redPixel = currentLine[x + 2];
//save information with the concurrentbag
}
});
processedBitmap.UnlockBits(bitmapData);
}
}`
a benchmark wouldn't mean much because the answer to how much this will speed up the proccess depends 100% on what hardware you are using, and what else is running in the background, it all depends on how many free threads are available. If your running this on a 4000 series graphics card with thousands of streaming proccessors you may be able to do iterate through every column of the image at the same time.
if your running it with and old quad core you may only have 5 or 6 threads which is still incredibly significant.
Upvotes: 1
Reputation:
It's been some time, but I found an example that might be useful.
var btm = new Bitmap("image.png");
BitmapData btmDt = btm.LockBits(
new Rectangle(0, 0, btm.Width, btm.Height),
ImageLockMode.ReadWrite,
btm.PixelFormat
);
IntPtr pointer = btmDt.Scan0;
int size = Math.Abs(btmDt.Stride) * btm.Height;
byte[] pixels = new byte[size];
Marshal.Copy(pointer, pixels, 0, size);
for (int b = 0; b < pixels.Length; b++)
{
pixels[b] = 255; //Do something here
}
Marshal.Copy(pixels, 0, pointer, size);
btm.UnlockBits(btmDt);
Upvotes: 9
Reputation: 1980
public class DirectBitmap : IDisposable
{
public Bitmap Bitmap { get; private set; }
public Int32[] Bits { get; private set; }
public bool Disposed { get; private set; }
public int Height { get; private set; }
public int Width { get; private set; }
protected GCHandle BitsHandle { get; private set; }
public DirectBitmap(int width, int height)
{
Width = width;
Height = height;
Bits = new Int32[width * height];
BitsHandle = GCHandle.Alloc(Bits, GCHandleType.Pinned);
Bitmap = new Bitmap(width, height, width * 4, PixelFormat.Format32bppPArgb, BitsHandle.AddrOfPinnedObject());
}
public void SetPixel(int x, int y, Color colour)
{
int index = x + (y * Width);
int col = colour.ToArgb();
Bits[index] = col;
}
public Color GetPixel(int x, int y)
{
int index = x + (y * Width);
int col = Bits[index];
Color result = Color.FromArgb(col);
return result;
}
public void Dispose()
{
if (Disposed) return;
Disposed = true;
Bitmap.Dispose();
BitsHandle.Free();
}
}
There's no need for LockBits
or SetPixel
. Use the above class for direct access to bitmap data.
With this class, it is possible to set raw bitmap data as 32-bit data. Notice that it is PARGB, which is premultiplied alpha. See Alpha Compositing on Wikipedia for more information on how this works and examples on the MSDN article for BLENDFUNCTION to find out how to calculate the alpha properly.
If premultiplication might overcomplicate things, use PixelFormat.Format32bppArgb
instead. A performance hit occurs when it's drawn, because it's internally being converted to PixelFormat.Format32bppPArgb
. If the image doesn't have to change prior to being drawn, the work can be done before premultiplication, drawn to a PixelFormat.Format32bppArgb
buffer, and further used from there.
Access to standard Bitmap
members is exposed via the Bitmap
property. Bitmap data is directly accessed using the Bits
property.
byte
instead of int
for raw pixel dataChange both instances of Int32
to byte
, and then change this line:
Bits = new Int32[width * height];
To this:
Bits = new byte[width * height * 4];
When bytes are used, the format is Alpha/Red/Green/Blue in that order. Each pixel takes 4 bytes of data, one for each channel. The GetPixel and SetPixel functions will need to be reworked accordingly or removed.
IDisposable
just like Bitmap
.unsafe
block.Dispose
them when you're done so the memory can be unpinned.Graphics
objectBecause the Bitmap
property is actually a .NET Bitmap
object, it's straightforward to perform operations using the Graphics
class.
var dbm = new DirectBitmap(200, 200);
using (var g = Graphics.FromImage(dbm.Bitmap))
{
g.DrawRectangle(Pens.Black, new Rectangle(50, 50, 100, 100));
}
The question asks about performance, so here's a table that should show the relative performance between the three different methods proposed in the answers. This was done using a .NET Standard 2 based application and NUnit.
* Time to fill the entire bitmap with red pixels *
- Not including the time to create and dispose the bitmap
- Best out of 100 runs taken
- Lower is better
- Time is measured in Stopwatch ticks to emphasize magnitude rather than actual time elapsed
- Tests were performed on an Intel Core i7-4790 based workstation
Bitmap size
Method 4x4 16x16 64x64 256x256 1024x1024 4096x4096
DirectBitmap <1 2 28 668 8219 178639
LockBits 2 3 33 670 9612 197115
SetPixel 45 371 5920 97477 1563171 25811013
* Test details *
- LockBits test: Bitmap.LockBits is only called once and the benchmark
includes Bitmap.UnlockBits. It is expected that this
is the absolute best case, adding more lock/unlock calls
will increase the time required to complete the operation.
Upvotes: 141
Reputation: 458
You can use Bitmap.LockBits method. Also if you want to use parallel task execution, you can use the Parallel class in System.Threading.Tasks namespace. Following links have some samples and explanations.
Upvotes: 1
Reputation: 835
The reason bitmap operations are so slow in C# is due to locking and unlocking. Every operation will perform a lock on the required bits, manipulate the bits, and then unlock the bits.
You can vastly improve the speed by handling the operations yourself. See the following example.
using (var tile = new Bitmap(tilePart.Width, tilePart.Height))
{
try
{
BitmapData srcData = sourceImage.LockBits(tilePart, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);
BitmapData dstData = tile.LockBits(new Rectangle(0, 0, tile.Width, tile.Height), ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);
unsafe
{
byte* dstPointer = (byte*)dstData.Scan0;
byte* srcPointer = (byte*)srcData.Scan0;
for (int i = 0; i < tilePart.Height; i++)
{
for (int j = 0; j < tilePart.Width; j++)
{
dstPointer[0] = srcPointer[0]; // Blue
dstPointer[1] = srcPointer[1]; // Green
dstPointer[2] = srcPointer[2]; // Red
dstPointer[3] = srcPointer[3]; // Alpha
srcPointer += BytesPerPixel;
dstPointer += BytesPerPixel;
}
srcPointer += srcStrideOffset + srcTileOffset;
dstPointer += dstStrideOffset;
}
}
tile.UnlockBits(dstData);
aSourceImage.UnlockBits(srcData);
tile.Save(path);
}
catch (InvalidOperationException e)
{
}
}
Upvotes: 20