Urben
Urben

Reputation: 65

Custom switch-case as an alternative to foreach

For a game I am currently working on a particle system. Particles are objects with only position data. These particles have a method that update their position by using a vector field they are embedded into. The particles are inside an array

To update all particles in one physical step, I currently use a foreach:

foreach (particle p in ParList) p.update;

Where the update method is just doing two bitshifts and two additions to the original position. Now I wonder how many particles my system can handle and if I can optimize it to raise that number.

Looking into how foreach works, I found it is just a basic for loop, with doing a comparision and an addition to the index number.

I want to do what a for loop does without the check if the index number is >= 0 and without reducing the index number by one.

These two operations are not much usually, but in my case they take roughly 1/3 of the number of operations. So I wonder if I can do it this way:

switch (Parlist.Length)
{
    case 1024:
        ParList[1023].update;
        goto case 1023;
    case 1023:
        ParList[1022].update;
        goto case 1022;
    //and so on until
    case 1:
        ParList[0].update;
        break;
}

While it looks horrible, and I am aware this is not how it should be done, first tests make it look like I actually can rise performance quite much here. I would like to put this away into a class and access it in a more general manner like the foreach syntax is translated into a for loop. I would it to end up like this way:

eachcase (particle p in ParList)
{
    //instructions using p
}

which is translated into this:

switch (Parlist.Length)
{
    case 1024:
        //reference to instructions using Parlist[1023]
        goto case 1023;
    //and so on until
    case 1:
        //reference to instructions using Parlist[0]
        break;
}

How do I build such custom structures? Is this possible in C#?

If I am able to make this working, I also would like to implement a custom break condition like this:

eachcase (particle p in ParList, p.x == 0 && p.z == 0)
{
    //instructions using p
}

which is translated into this:

switch (Parlist.Length)
{
    case 1024:
        if (/*break condition*/) break;
        //reference to instructions using Parlist[1023]
        goto case 1023;
    //and so on until
    case 1:
        if (/*break condition*/) break;
        //reference to instructions using Parlist[0]
        break;
}

I read about Lambda expressions which can help me here but I am not sure how to connect it to a usual object array.

Upvotes: 0

Views: 666

Answers (3)

Benno Straub
Benno Straub

Reputation: 2557

What you are trying to do, is either manual loop unrolling or building a variation of Duff's device.

Both will probably, with modern compilers and CPU architectures, give you very little speedup, and possibly make your code slower on different CPUs.

They will, however, definitely make your code harder to read.

If you, after careful measuring, really need more performance here are some things you can try (ordered by perf. gain per amount of work):

  1. Mark your .update function/getter with [MethodImpl(MethodImplOptions.AggressiveInlining)]

    • This instructs the compiler to inline your function
    • The reason this might help is that it may help the compiler do #4 for you
  2. Make sure your Particle class is a value type (struct) and they are stored in an array or ArrayList

    • This will help the CPU with prefetching. Fetching a piece data from RAM ("cache miss") will take ~300 times as long as a floating point operation. Making sure the data is in the right order will allow the CPU to prefetch this data before it is needed and possibly provide significant speedups.
  3. SIMD (Somewhat Advanced)

    • This uses SIMD instructions which execute operations on 2-8 values at once, providing an almost equivalent (2x-8x) speedup.

    class Particles { // No Idea what your particles actually do, but it should be implementable like this
        Vector<float> positions;
        Vector<float> motion;
    
        Particles(unsigned n) { positions = new Vector(n * 3); motion = new Vector(n*3); }
    
        void update() {
            positions += motion; // Really, really, really efficient.
        }
    };
    
  4. Multithreading

  5. (Run your calculations on a GPU?)

Upvotes: 1

D Stanley
D Stanley

Reputation: 152521

Looking into how foreach works, I found it is just a basic for loop, with doing a comparison and an addition to the index number.

That's how it works for arrays because that's the most efficient way of navigating arrays, but that's not how it works in general. It is optimized by the compiler to use the index for arrays. Other collection types will have a GetEnumerator() method that will return some object that has methods to navigate through the collection. How it does that is completely defined by the enumerator's implementation. For List that enumeration is just incrementing the index and using the indexer (very similar to the array, but with the small overhead of the enumerator object).

I don't see how your "switch" statement is an improvement. All you're doing is hard-coding a loop into N statements. I find it hard to believe that the bounds checking is a significant amount of your program time. You have an enormous amount of duplicate code, the limit is finite (foreach loops could loop forever if the underlying collection is infinite), and it's much harder (IMHO) to understand the intent.

Now I wonder how many particles my system can handle

That's easy. Set it to the maximum size of the underlying collection type (~2 billion elements if Particle is a reference type, possibly less if it's a value type). If your system can handle that, then you're good to go. If not, then reduce the size until you find your limit.

and if I can optimize it to raise that number.

Well before you can optimize you have to figure out where the inefficiencies are. Find the operations that take up the most program time and work on those first.

I would not assume that the foreach has an efficiency problem. If you measure the performance of your program and determine that the foreach itself (the loop, not what each iteration is doing), then start looking for alternatives.

If your process is CPU bound (meaning that it pegs a single logical processor at 100% then you might benefit from parallelization. You might try things like PLinq or TPL and see if the benefit of parallelization outweighs the overhead it creates.

Upvotes: 1

PeterL
PeterL

Reputation: 48

Try Select(), it is more efficient than loops https://www.reddit.com/r/csharp/comments/4xb0d9/why_is_linqs_select_better_than_foreach/

ParList = ParList.Select(s => s.update).ToArray();

Upvotes: -1

Related Questions