Reputation: 1
I have a fairly extensive C++ image processing library, and I have been working on a C# project, but I cannot get the C# code anywhere near as fast as the C++. My C++ code takes 149 ms to set the entire image to white, while the C# code takes 1071 ms to do the same.
Here is my C++ code:
for (int i = 0; i < 100; i++)
{
    for (int y = 0; y < image->height; y++)
    {
        // The pointer is a byte pointer, so the row offset must be in bytes (4 per pixel).
        unsigned char * srcPixel = (unsigned char*)image->mImageData + (y * image->width * 4);
        for (int x = 0; x < image->width; x++)
        {
            srcPixel[0] = 255;
            srcPixel[1] = 255;
            srcPixel[2] = 255;
            srcPixel[3] = 255;
            srcPixel += 4;
        }
    }
}
mImageData is a struct of unsigned chars:
struct mImageData
{
    unsigned char alpha;
    unsigned char red;
    unsigned char green;
    unsigned char blue;
};
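As a side note on the addressing, here is a minimal sketch of how a row start can be computed for a tightly packed 4-byte pixel. The names `Pixel`, `rowByteOffset` and `rowStart` are made up for illustration and are not part of the library in the question; the sketch also assumes rows have no padding.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the struct in the question: four 1-byte channels.
struct Pixel
{
    unsigned char alpha;
    unsigned char red;
    unsigned char green;
    unsigned char blue;
};
static_assert(sizeof(Pixel) == 4, "expected a tightly packed 4-byte pixel");

// Byte offset of the first pixel of row y, assuming rows are not padded.
inline std::size_t rowByteOffset(int y, int width)
{
    assert(y >= 0 && width >= 0);
    return static_cast<std::size_t>(y) * static_cast<std::size_t>(width) * sizeof(Pixel);
}

// Address of row y as a byte pointer. Arithmetic on unsigned char* counts
// bytes, so the offset has to include the factor sizeof(Pixel).
inline unsigned char* rowStart(Pixel* base, int y, int width)
{
    return reinterpret_cast<unsigned char*>(base) + rowByteOffset(y, width);
}
```

Arithmetic on a `Pixel*` is scaled by `sizeof(Pixel)` automatically, so `base + y * width` and `rowStart(base, y, width)` refer to the same address.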
And this is the C# code I am using. It is the fastest version I have managed so far:
frame = new Bitmap(3840, 2160);
BitmapData bitmapData12 = frame.LockBits(new Rectangle(0, 0,
    frame.Width, frame.Height),
    ImageLockMode.ReadWrite,
    PixelFormat.Format32bppArgb);
var stopwatch = Stopwatch.StartNew();
unsafe
{
    int pixelBytes = 4;
    // Unused bytes at the end of each row (0 for a 32bpp bitmap).
    int paddingBytes = bitmapData12.Stride - bitmapData12.Width * pixelBytes;
    byte* location1 = (byte*)bitmapData12.Scan0;
    for (int i = 0; i < 100; i++)
    {
        // Restart at the first pixel for every pass.
        location1 = (byte*)bitmapData12.Scan0;
        for (int y = 0; y < bitmapData12.Height; ++y)
        {
            for (int x = 0; x < bitmapData12.Width; ++x)
            {
                location1[0] = 255;
                location1[1] = 255;
                location1[2] = 255;
                location1[3] = 255;
                location1 += pixelBytes;
            }
            location1 += paddingBytes;
        }
    }
}
stopwatch.Stop();
var milliseconds = stopwatch.ElapsedMilliseconds;
frame.UnlockBits(bitmapData12);
Upvotes: 0
Views: 1712
Reputation: 61512
I ran your code with ResW = 3000 and ResH = 3000 and got a processing time of 900 ms. I ran it in Release mode, with the debugger detached.
Observe that this image contains 9 million pixels, each one 4 bytes long. That's 36 MB to fill. We're filling this 100 times, so a total of 3.6 billion bytes to set. My CPU runs at 4.5 GHz, so it managed to set 3.6 billion bytes in about 4 billion clock cycles (0.9 s at 4.5 GHz).
I'd say that's not too shabby for any language. If I were to shut down all the VMs, background processes and servers on my dev machine (which are currently consuming between 5% and 20% CPU) to run a cleaner measurement, I'd get pretty much exactly one byte set per clock cycle. Of course CPUs can do vastly better - if you ask them to perform the right operation. Setting one byte at a time certainly makes it slower.
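The back-of-the-envelope numbers above can be reproduced with a few lines of C++. This is only a sketch of the arithmetic; the function names are invented for illustration, and it assumes 4 bytes per pixel as in the question.

```cpp
#include <cassert>
#include <cstdint>

// Total bytes written: width * height pixels, 4 bytes each, for `passes` passes.
inline std::uint64_t totalBytes(std::uint64_t width, std::uint64_t height, std::uint64_t passes)
{
    return width * height * 4 * passes;
}

// Clock cycles that elapse at `clockHz` during `elapsedMs` milliseconds.
inline double cyclesElapsed(double clockHz, double elapsedMs)
{
    return clockHz * (elapsedMs / 1000.0);
}

// Average number of bytes written per clock cycle.
inline double bytesPerCycle(std::uint64_t bytes, double cycles)
{
    assert(cycles > 0.0);
    return static_cast<double>(bytes) / cycles;
}
```

With the figures from this answer, `totalBytes(3000, 3000, 100)` is 3.6 billion bytes and `cyclesElapsed(4.5e9, 900.0)` is about 4.05 billion cycles, which works out to roughly 0.89 bytes per cycle.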
So C# is really doing this as fast as possible without modifying the algorithm. It's just that C# refuses to optimize past a literal translation of the code, whereas C++ can and will do that. Simply doing what AdamF suggests (use uint) already shrinks the time to 300ms in my own tests.
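To make the difference concrete, here is a rough C++ sketch of the two store patterns: four 1-byte stores per pixel as in the question, and one 32-bit store per pixel. The function names are hypothetical, and the sketch only shows that both produce the same bytes; it makes no claim about timings on any particular compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Four 1-byte stores per pixel, as in the question.
inline void fillBytewise(unsigned char* data, std::size_t pixelCount)
{
    for (std::size_t i = 0; i < pixelCount; ++i)
    {
        data[0] = 255;
        data[1] = 255;
        data[2] = 255;
        data[3] = 255;
        data += 4;
    }
}

// One 32-bit store per pixel: 0xFFFFFFFF sets all four channels to 255.
inline void fillWordwise(std::uint32_t* pixels, std::size_t pixelCount)
{
    assert(pixels != nullptr || pixelCount == 0);
    for (std::size_t i = 0; i < pixelCount; ++i)
    {
        pixels[i] = 0xFFFFFFFFu;
    }
}
```

Both functions leave the buffer in the same state, so the word-wise version is a drop-in replacement whenever the pixel data is 4-byte aligned.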
I don't think you've specified what your ResW/ResH are (or I'm blind), so it's still possible that you're not running the code the fastest way possible and something interferes with the measurement.
Upvotes: 1
Reputation: 2601
This is a slightly dirty solution, but if the size of your image structure is always the same, you can try optimizing it this way. In my tests the time drops from about 1100 ms to about 750 ms:
unsafe
{
    int width = bitmapData12.Width;
    for (int y = 0; y < bitmapData12.Height; ++y)
    {
        UInt32* location1 = (UInt32*) (bitmapData12.Scan0 + y*bitmapData12.Stride);
        for (int x = 0; x < width; ++x)
            location1[x] = UInt32.MaxValue;
    }
}
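For comparison, the same row-by-row idea can be sketched in C++. The function name and parameters below are invented for illustration; `stride` is assumed to be the positive number of bytes from one row start to the next, which may exceed `width * 4` when rows are padded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Sets every pixel to white, one 32-bit value per pixel, and leaves any
// padding bytes at the end of each row untouched.
inline void fillWhiteWithStride(unsigned char* scan0, int width, int height, int stride)
{
    assert(width >= 0 && height >= 0 && stride >= width * 4);
    const std::uint32_t white = 0xFFFFFFFFu;
    for (int y = 0; y < height; ++y)
    {
        // Row start is computed from the stride, not from the width.
        unsigned char* row = scan0 + static_cast<std::size_t>(y) * stride;
        for (int x = 0; x < width; ++x)
        {
            // memcpy of 4 bytes avoids alignment and aliasing problems;
            // compilers typically turn it into a single 32-bit store.
            std::memcpy(row + static_cast<std::size_t>(x) * 4, &white, sizeof white);
        }
    }
}
```

Computing the row pointer from the stride on every row means the loop stays correct even for pixel formats where rows are padded.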
Upvotes: 0