Reputation: 7029
In my graphics application, I can represent matrices using either SharpDX.Matrix3x2 or System.Numerics.Matrix3x2. However, when I ran both through a performance test, I found that SharpDX's matrices beat System.Numerics.Matrix3x2 by a margin of up to 70% in elapsed time. My test was a simple repeated multiplication. Here's the code:
var times1 = new List<float>();
for (var i = 0; i < 100; i++)
{
    var sw = Stopwatch.StartNew();
    var mat = SharpDX.Matrix3x2.Identity;
    for (var j = 0; j < 10000; j++)
        mat *= SharpDX.Matrix3x2.Rotation(13);
    sw.Stop();
    times1.Add(sw.ElapsedTicks);
}

var times2 = new List<float>();
for (var i = 0; i < 100; i++)
{
    var sw = Stopwatch.StartNew();
    var mat = System.Numerics.Matrix3x2.Identity;
    for (var j = 0; j < 10000; j++)
        mat *= System.Numerics.Matrix3x2.CreateRotation(13);
    sw.Stop();
    times2.Add(sw.ElapsedTicks);
}

TestContext.WriteLine($"SharpDX: {times1.Average()}\nSystem.Numerics: {times2.Average()}");
I ran these tests on an Intel i5-6200U processor.
Now, my question is: how can SharpDX's matrices possibly be faster? Isn't System.Numerics.Matrix3x2 supposed to utilise SIMD instructions to execute faster?
The implementation of SharpDX.Matrix3x2 is available here, and as you can see, it's written in plain C#.
Upvotes: 0
Views: 729
Reputation: 444
There are a few other things to consider with this kind of testing. These are just side notes and won't affect your current results; I've done similar testing myself.
Some functions in SharpDX take their arguments by value rather than by reference, and there are corresponding by-reference versions you might want to try. You've used the operators in your test, which is fine because it keeps the comparison like-for-like. In some situations, though, the operators are slower than the by-reference functions.
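To illustrate the by-value versus by-reference distinction without depending on SharpDX, here is a minimal sketch built on System.Numerics.Matrix3x2. The `MultiplyByRef` helper is hypothetical (it is not part of either library); it only shows the shape of a `ref`/`out` multiply that avoids copying the operands, and whether it is actually faster than the operator depends on the JIT and platform.

```csharp
using System;
using System.Numerics;

public static class ByRefSketch
{
    // Hypothetical helper (not a library API): multiplies a by b using
    // ref/out parameters so that neither 24-byte struct is copied on the call.
    public static void MultiplyByRef(ref Matrix3x2 a, ref Matrix3x2 b, out Matrix3x2 result)
    {
        // All six terms are evaluated before the assignment, so it is safe
        // for result to alias a or b.
        result = new Matrix3x2(
            a.M11 * b.M11 + a.M12 * b.M21,
            a.M11 * b.M12 + a.M12 * b.M22,
            a.M21 * b.M11 + a.M22 * b.M21,
            a.M21 * b.M12 + a.M22 * b.M22,
            a.M31 * b.M11 + a.M32 * b.M21 + b.M31,
            a.M31 * b.M12 + a.M32 * b.M22 + b.M32);
    }

    public static void Main()
    {
        var r = Matrix3x2.CreateRotation(0.5f);
        var byValue = Matrix3x2.Identity;
        var byRef = Matrix3x2.Identity;

        for (var j = 0; j < 100; j++)
        {
            byValue *= r;                               // operator: operands passed by value
            MultiplyByRef(ref byRef, ref r, out byRef); // by-reference variant
        }

        // The two results should agree up to floating-point rounding.
        Console.WriteLine($"operator*: {byValue}");
        Console.WriteLine($"by ref:    {byRef}");
    }
}
```

Both variants can be dropped into the same timing loop to see whether the by-reference form makes a measurable difference on a given runtime.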
Upvotes: 0
Reputation: 7029
It turns out that my testing logic was flawed: I was creating the rotation matrix inside the loop, which meant that I was timing the creation of rotation matrices as well as the multiplication. I revised my testing code to look like this:
var times1 = new List<float>();
for (var i = 0; i < 100; i++)
{
    var sw = Stopwatch.StartNew();
    var mat = SharpDX.Matrix3x2.Identity;
    var s = SharpDX.Matrix3x2.Scaling(13);
    var r = SharpDX.Matrix3x2.Rotation(13);
    var t = SharpDX.Matrix3x2.Translation(13, 13);
    for (var j = 0; j < 10000; j++)
    {
        mat *= s;
        mat *= r;
        mat *= t;
    }
    sw.Stop();
    times1.Add(sw.ElapsedTicks);
}

var times2 = new List<float>();
for (var i = 0; i < 100; i++)
{
    var sw = Stopwatch.StartNew();
    var mat = System.Numerics.Matrix3x2.Identity;
    var s = System.Numerics.Matrix3x2.CreateScale(13);
    var r = System.Numerics.Matrix3x2.CreateRotation(13);
    var t = System.Numerics.Matrix3x2.CreateTranslation(13, 13);
    for (var j = 0; j < 10000; j++)
    {
        mat *= s;
        mat *= r;
        mat *= t;
    }
    sw.Stop();
    times2.Add(sw.ElapsedTicks);
}
With this change, the only thing performed inside the loop was multiplication, and I began to get results indicating better performance from System.Numerics.Matrix3x2.
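For anyone repeating the measurement, a slightly more defensive variant of the System.Numerics half might look like the sketch below. It adds an untimed warm-up pass so JIT compilation is not measured, starts the stopwatch only after the operands are built, reports milliseconds via `Stopwatch.Elapsed`, and returns the final matrix so the loop cannot be optimised away. The class and method names are my own, and the scale factor of 1.0001 is an assumption chosen to keep the values finite (a factor of 13 applied 10000 times overflows a float).

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Numerics;

public static class MultiplyBenchmark
{
    // Runs `inner` rounds of scale * rotate * translate and returns the
    // elapsed time in milliseconds. Only the multiplications are timed.
    public static double TimeOnce(int inner, out Matrix3x2 result)
    {
        var s = Matrix3x2.CreateScale(1.0001f);   // assumed value, keeps results finite
        var r = Matrix3x2.CreateRotation(13);
        var t = Matrix3x2.CreateTranslation(13, 13);
        var mat = Matrix3x2.Identity;

        var sw = Stopwatch.StartNew();
        for (var j = 0; j < inner; j++)
        {
            mat *= s;
            mat *= r;
            mat *= t;
        }
        sw.Stop();

        result = mat; // hand the result back so the work is observable
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        Matrix3x2 last;
        TimeOnce(10000, out last); // warm-up run, not recorded

        var samples = new List<double>();
        for (var i = 0; i < 100; i++)
            samples.Add(TimeOnce(10000, out last));

        Console.WriteLine($"Average: {samples.Average():F4} ms (last M11 = {last.M11})");
    }
}
```

The same structure can be duplicated for SharpDX.Matrix3x2 to keep the two measurements comparable.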
Another point: I didn't pay attention to the fact that SIMD optimisations only take effect in 64-bit code. These are my test results before and after changing the platform to x64:
Platform Target | System.Numerics.Matrix3x2 | SharpDX.Matrix3x2
---------------------------------------------------------------
AnyCPU          | 168 ms                    | 197 ms
x64             | 1.40 ms                   | 1.43 ms
When I check Environment.Is64BitProcess under AnyCPU, it returns false, and the "Prefer 32-bit" box in Visual Studio is greyed out, so I suspect that AnyCPU is effectively an alias for x86 in this case, which would explain why the test is two orders of magnitude faster under x64.
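A quick way to confirm what the test process is actually running as is to print a few runtime properties from inside it. This is only a sketch: the class name is made up, and `Vector.IsHardwareAccelerated` is used as a rough indicator of whether the JIT is emitting SIMD code, not as a direct statement about Matrix3x2.

```csharp
using System;
using System.Numerics;

public static class PlatformCheck
{
    // Summarises the bitness of the current process and whether the JIT
    // reports hardware acceleration for System.Numerics vector types.
    public static string Describe()
    {
        return $"64-bit process: {Environment.Is64BitProcess}, " +
               $"64-bit OS: {Environment.Is64BitOperatingSystem}, " +
               $"pointer size: {IntPtr.Size} bytes, " +
               $"Vector.IsHardwareAccelerated: {Vector.IsHardwareAccelerated}";
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

Calling `Describe()` from the test itself (for example via TestContext.WriteLine) shows the settings of the test host rather than those of a separate console project.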
Upvotes: 1