Matrix3x2 Performance

Question

In my graphics application, I can represent matrices using either SharpDX.Matrix3x2 or System.Numerics.Matrix3x2. However, when I ran both through a performance test, I found that SharpDX's matrices handily beat System.Numerics.Matrix3x2, by a margin of up to 70% in terms of time. My test was a simple repeated multiplication; here's the code:

var times1 = new List<long>();

for (var i = 0; i < 100; i++)
{
    var sw = Stopwatch.StartNew();

    var mat = SharpDX.Matrix3x2.Identity;

    for (var j = 0; j < 10000; j++)
        mat *= SharpDX.Matrix3x2.Rotation(13);

    sw.Stop();

    times1.Add(sw.ElapsedTicks);
}

var times2 = new List<long>();

for (var i = 0; i < 100; i++)
{
    var sw = Stopwatch.StartNew();

    var mat = System.Numerics.Matrix3x2.Identity;

    for (var j = 0; j < 10000; j++)
        mat *= System.Numerics.Matrix3x2.CreateRotation(13);

    sw.Stop();

    times2.Add(sw.ElapsedTicks);
}

TestContext.WriteLine($"SharpDX: {times1.Average()}\nSystem.Numerics: {times2.Average()}");

I ran these tests on an Intel i5-6200U processor.

Now, my question is, how can SharpDX's matrices possibly be faster? Isn't System.Numerics.Matrix3x2 supposed to utilise SIMD instructions to execute faster?

The implementation of SharpDX.Matrix3x2 is available here, and as you can see, it's written in plain C#.

laptou · Accepted Answer

It turns out that my testing logic was flawed - I was creating the rotation matrix inside the loop, which meant that I was testing the creation of rotation matrices and multiplication. I revised my testing code to look like this:

var times1 = new List<long>();

for (var i = 0; i < 100; i++)
{
    var sw = Stopwatch.StartNew();

    var mat = SharpDX.Matrix3x2.Identity;

    var s = SharpDX.Matrix3x2.Scaling(13);
    var r = SharpDX.Matrix3x2.Rotation(13);
    var t = SharpDX.Matrix3x2.Translation(13, 13);

    for (var j = 0; j < 10000; j++)
    {
        mat *= s;
        mat *= r;
        mat *= t;
    }

    sw.Stop();

    times1.Add(sw.ElapsedTicks);
}

var times2 = new List<long>();

for (var i = 0; i < 100; i++)
{
    var sw = Stopwatch.StartNew();

    var mat = System.Numerics.Matrix3x2.Identity;

    var s = System.Numerics.Matrix3x2.CreateScale(13);
    var r = System.Numerics.Matrix3x2.CreateRotation(13);
    var t = System.Numerics.Matrix3x2.CreateTranslation(13, 13);

    for (var j = 0; j < 10000; j++)
    {
        mat *= s;
        mat *= r;
        mat *= t;
    }

    sw.Stop();

    times2.Add(sw.ElapsedTicks);
}

Now the only thing performed inside the loop is multiplication, and with this change I began to receive results indicating better performance from System.Numerics.Matrix3x2.
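For reference, the same idea can be sketched as a small self-contained program. This is only a sketch and not the code that produced my numbers: the names MultiplyBenchmark, TimeMultiplications and Sink are made up for illustration, it uses only System.Numerics so that it runs without SharpDX, it multiplies by a rotation alone so the values stay finite, and the warm-up call before timing is an extra assumption of mine to keep JIT compilation out of the measurement.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Numerics;

public static class MultiplyBenchmark
{
    // Hypothetical sink: storing a result here keeps the JIT from
    // treating the multiplication loop as dead code.
    public static float Sink;

    // Times `iterations` multiplications by a matrix that was built
    // outside the loop, and returns the elapsed time in milliseconds.
    public static double TimeMultiplications(Matrix3x2 factor, int iterations)
    {
        var mat = Matrix3x2.Identity;

        var sw = Stopwatch.StartNew();
        for (var j = 0; j < iterations; j++)
            mat *= factor;
        sw.Stop();

        Sink = mat.M11;
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        // Create the matrix once, so only multiplication is measured.
        var r = Matrix3x2.CreateRotation(13);

        // Warm-up run so that JIT compilation is not included in the timings.
        TimeMultiplications(r, 10000);

        var times = Enumerable.Range(0, 100)
            .Select(_ => TimeMultiplications(r, 10000))
            .ToList();

        Console.WriteLine($"System.Numerics: {times.Average():F4} ms per 10000 multiplications");
    }
}
```

Swapping in the SharpDX types should only require changing the matrix type and the factory call. A dedicated benchmarking harness would likely give more trustworthy numbers than a hand-rolled Stopwatch loop.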

Another point: I didn't pay attention to the fact that SIMD optimisations only take effect in 64-bit code. These are my test results before and after changing the platform to x64:

Platform Target | System.Numerics.Matrix3x2 | SharpDX.Matrix3x2
---------------------------------------------------------------
AnyCPU          | 168ms                     | 197ms
x64             | 1.40ms                    | 1.43ms

When I check Environment.Is64BitProcess under AnyCPU, it returns false, and the "Prefer 32-Bit" box in Visual Studio is greyed out, so I suspect that AnyCPU is effectively behaving as x86 in this case, which would explain why the test is two orders of magnitude faster under x64.
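A quick way to confirm what the process is actually running as is to print a few runtime properties before benchmarking. The sketch below is an illustration, not part of my original test; the class name RuntimeCheck and the method Describe are hypothetical, while Environment.Is64BitProcess, Environment.Is64BitOperatingSystem, IntPtr.Size and System.Numerics.Vector.IsHardwareAccelerated are existing .NET members.

```csharp
using System;
using System.Numerics;

public static class RuntimeCheck
{
    // Builds a short report of the bitness of the current process and
    // whether the JIT reports SIMD acceleration for System.Numerics.
    public static string Describe()
    {
        return
            $"64-bit process: {Environment.Is64BitProcess}\n" +
            $"64-bit OS: {Environment.Is64BitOperatingSystem}\n" +
            $"Pointer size: {IntPtr.Size} bytes\n" +
            $"SIMD accelerated: {Vector.IsHardwareAccelerated}";
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

If the first line reports False on a 64-bit OS, the build is running as a 32-bit process regardless of what the platform target is called.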
