Reputation: 5549
I have the following code in an object pool that implements the IEnumerable interface.
public IEnumerable<T> ActiveNodes
{
get
{
for (int i = 0; i < _pool.Count; i++)
{
if (_pool[i].AvailableInPool)
{
yield return _pool[i];
}
}
}
}
As far as I know (according to this question), this will generate garbage as the IEnumerable object will need to be collected. None of the elements in _pool will ever be collected, as the purpose of the pool is to keep references to all of them to prevent garbage creation.
Can anyone suggest a way to allow iteration over _pool so that no garbage is generated?
When iterating over pool, all of the items in pool that have AvailableInPool == true
should be iterated over. Order doesn't matter.
Upvotes: 37
Views: 15791
Reputation: 172606
Iterating items will in any 'normal' design usually result in the creation of a new enumerable object. Creating and disposing objects is very fast, so only in very special scenarios (where low latency is the top most priority) garbage collections could (I say 'could') be a problem.
A design without garbage is possible by returning structures that don't implement IEnumerable
. The C# compiler can still iterate such objects, because the foreach
statement uses duck typing. .NET's List<T>
, for instance, takes this approach.
When using foreach
over both an array and List<T>
, no garbage will be generated. When using foreach
on an array, C# will transform the operation to a for
statement, while List<T>
already implements a struct
enumerator, causing the foreach
to produce no garbage.
Here is a struct enumerable and struct enumerator. When you return the enumerable, the C# compiler can foreach over it:
public struct StructEnumerable<T>
{
private readonly List<T> pool;
public StructEnumerable(List<T> pool)
{
this.pool = pool;
}
public StructEnumerator<T> GetEnumerator()
{
return new StructEnumerator<T>(this.pool);
}
}
Here is the StructEnumerator
:
public struct StructEnumerator<T>
{
private readonly List<T> pool;
private int index;
public StructEnumerator(List<T> pool)
{
this.pool = pool;
this.index = 0;
}
public T Current
{
get
{
if (this.pool == null || this.index == 0)
throw new InvalidOperationException();
return this.pool[this.index - 1];
}
}
public bool MoveNext()
{
this.index++;
return this.pool != null && this.pool.Count >= this.index;
}
public void Reset()
{
this.index = 0;
}
}
You can simply return the StructEnumerable<T>
as follows:
public StructEnumerable<T> Items
{
get { return new StructEnumerable<T>(this.pool); }
}
And C# can iterate over this with a normal foreach:
foreach (var item in pool.Items)
{
Console.WriteLine(item);
}
Note that you can't LINQ over the item using System.Linq.Enumerable
You need the IEnumerable<T>
interface for that, and that involves creating enumerators and, therefore, garbage collection. You could, of course, build your own LINQ extension methods, but that will unlikely help, because that will often still result in new objects being created (when closures are being generated for used delegates).
UPDATE (2024): Newer .NET (Core) versions contain many optimizations to the LINQ methods that in some cases make it extraordinary fast and in some cases produce no garbage, especially when LINQ methods are used to filter array and List<T>
. This means that you should benchmark the performance of your code before reverting from LINQ operations to manual written code, because the LINQ operations might very well be orders of magnitude faster than anything you can reasonably write yourself.
Upvotes: 37
Reputation: 1711
Since XNA for XBox also works over the Compact Framework (and I suspect that's what you're working on given the hints you've given(1)), we can trust the XNA devs to teach us exactly when foreach creates garbage.
To quote the most relevant point (although the entire article's worth reading):
When doing a foreach over a
Collection<T>
an enumerator WILL be allocated.When doing a foreach over most other collections, including as arrays, lists, queues, linked lists, and others:
- if the collections are used explicitly, an enumerator will NOT be allocated.
- if they are cast to interfaces, an enumerator WILL be allocated.
So, if _pool is a List
, array or similar and can afford to, you can either return that type directly or cast the IEnumerable<T>
to the respective type to avoid garbage during the foreach.
As some additional reading, Shawn Hargreaves can have some useful additional information.
(1) 60 calls per second, Compact Framework, can't go down to native code, 1MB of allocation before a GC is triggered.
Upvotes: 6
Reputation: 61331
Also, you can have a pool of preallocated enumerators. Think about how many simultaneous enumerations you want to support.
The garbage collection overhead will go, at the expense of extra memory consumption. Speed vs. memory optimization dilemma in its purest form.
Upvotes: 0
Reputation: 659956
First off, a number of people are pushing back on Olhovsky to suggest that this is worrying about nothing. Avoiding collection pressure is actually very important in some applications on some environments.
The compact framework garbage collector has an unsophisticated policy; it triggers a collection every time 1000KB of memory has been allocated. Now suppose you are writing a game that runs on the compact framework, and the physics engine generates 1KB of garbage every time it runs. Physics engines are typically run on the order of 20 times a second. So that's 1200KB of pressure per minute, and hey, that's already more than one collection per minute just from the physics engine. If the collection causes a noticable stutter in the game then that might be unacceptable. In such a scenario, anything you can do to decrease collection pressure helps.
I am learning this myself the hard way, even though I work on the desktop CLR. We have scenarios in the compiler where we must avoid collection pressure, and we are jumping through all kinds of object pooling hoops to do so. Olhovsky, I feel your pain.
So, to come to your question, how can you iterate over the collection of pooled objects without creating collection pressure?
First, let's think about why collection pressure happens in the typical scenario. Suppose you have
foreach(var node in ActiveNodes) { ... }
Logically this allocates two objects. First, it allocates the enumerable -- the sequence -- that represents the sequence of nodes. Second, it allocates the enumerator -- the cursor -- that represents the current position in the sequence.
In practice sometimes you can cheat a bit and have one object that represents both the sequence and the enumerator, but you still have one object allocated.
How can we avoid this collection pressure? Three things come to mind.
1) Don't make an ActiveNodes method in the first place. Make the caller iterate over the pool by index, and check themselves whether the node is available. The sequence is then the pool, which is already allocated, and the cursor is an integer, neither of which are creating new collection pressure. The price you pay is duplicated code.
2) As Steven suggests, the compiler will take any types that have the right public methods and properties; they don't have to be IEnumerable and IEnumerator. You can make your own mutable-struct sequence and cursor objects, pass those around by value, and avoid collection pressure. It is dangerous to have mutable structs, but it is possible. Note that List<T>
uses this strategy for its enumerator; study its implementation for ideas.
3) Allocate the sequence and the enumerators on the heap normally and pool them too! You're already going with a pooling strategy, so there's no reason why you can't pool an enumerator as well. Enumerators even have a convenient "Reset" method that usually just throws an exception, but you could write a custom enumerator object that used it to reset the enumerator back to the beginning of the sequence when it goes back in the pool.
Most objects are only enumerated once at a time, so the pool can be small in typical cases.
(Now, of course you may have a chicken-and-egg problem here; how are you going to enumerate the pool of enumerators?)
Upvotes: 69
Reputation: 61331
Does it have to be IEnumerable? Will refactoring to an array with good old indexed acecss help?
Upvotes: 1
Reputation: 11477
You could implement your own IEnumerator class which will enumerate over these active nodes.
Now if you can guarantee that only one client will be enumerating over the active nodes at any one time, your class can cache this class so only a single instance exists. Then no garbage will need collecting. The call to ActiveNodes will call reset to start enumerating from the start.
This is a dangerous assumption to make, but if you are optimising you may be able to consider it
If you have multiple clients enumerating over these nodes at any time, then each will need its own instance of IEnumerator to be able to store their current cursor position in the collection. In which case these will need to be created and collected with each call - and you may as well stick with your original design.
Upvotes: 0
Reputation: 4084
Dont be afraid of those tiny garbage objects. The ActiveNodes in the pool will (should) be much more costly. Therefore, if you get rid of recreating them it should be sufficient.
@Edit: if you are made to use a managed platform and really want to archieve a zero-garbage state, disclaim the usage of the pool in a foreach loop and iterate over it in another manner, possibly utilizing an indexer. Or consider creating a list of potential nodes and return that instead.
@Edit2: Of course, implementing IEnumerator und using Current(), Next() and so on, would work as well.
Upvotes: 0