I have the following code in an object pool that implements the IEnumerable interface.
    public IEnumerable<T> ActiveNodes
    {
        get
        {
            for (int i = 0; i < _pool.Count; i++)
            {
                if (_pool[i].AvailableInPool)
                {
                    yield return _pool[i];
                }
            }
        }
    }
As far as I know (according to this question), this will generate garbage as the IEnumerable object will need to be collected. None of the elements in _pool will ever be collected, as the purpose of the pool is to keep references to all of them to prevent garbage creation.
Can anyone suggest a way to allow iteration over _pool so that no garbage is generated?
When iterating over pool, all of the items in pool that have AvailableInPool == true should be iterated over. Order doesn't matter.
First off, a number of people are pushing back on Olhovsky to suggest that this is worrying about nothing. Avoiding collection pressure is actually very important in some applications in some environments.
The compact framework garbage collector has an unsophisticated policy; it triggers a collection every time 1000KB of memory has been allocated. Now suppose you are writing a game that runs on the compact framework, and the physics engine generates 1KB of garbage every time it runs. Physics engines are typically run on the order of 20 times a second. So that's 1200KB of pressure per minute, and hey, that's already more than one collection per minute just from the physics engine. If the collection causes a noticeable stutter in the game then that might be unacceptable. In such a scenario, anything you can do to decrease collection pressure helps.
I am learning this myself the hard way, even though I work on the desktop CLR. We have scenarios in the compiler where we must avoid collection pressure, and we are jumping through all kinds of object pooling hoops to do so. Olhovsky, I feel your pain.
So, to come to your question, how can you iterate over the collection of pooled objects without creating collection pressure?
First, let's think about why collection pressure happens in the typical scenario. Suppose you have
    foreach (var node in ActiveNodes) { ... }
Logically this allocates two objects. First, it allocates the enumerable -- the sequence -- that represents the sequence of nodes. Second, it allocates the enumerator -- the cursor -- that represents the current position in the sequence.
In practice sometimes you can cheat a bit and have one object that represents both the sequence and the enumerator, but you still have one object allocated.
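To make the allocation visible, here is roughly what that loop turns into. This is a sketch of the expansion, not the exact code the compiler emits; the `ActiveNodes()` method below is a hypothetical stand-in for the pool's iterator so that the example is self-contained.

```csharp
using System;
using System.Collections.Generic;

static class ForeachExpansion
{
    // Hypothetical stand-in for the pool's iterator property.
    static IEnumerable<int> ActiveNodes()
    {
        yield return 1;
        yield return 2;
    }

    // The loop as you would normally write it.
    public static int SumWithForeach()
    {
        int sum = 0;
        foreach (var node in ActiveNodes())
        {
            sum += node;
        }
        return sum;
    }

    // Approximately what the loop above means.
    public static int SumExpanded()
    {
        int sum = 0;

        // Calling an iterator block allocates a compiler-generated object on the heap.
        IEnumerable<int> sequence = ActiveNodes();

        // For iterator blocks the first GetEnumerator call typically hands back
        // that same object, which is the "cheat" that keeps it to one allocation.
        IEnumerator<int> cursor = sequence.GetEnumerator();
        try
        {
            while (cursor.MoveNext())
            {
                int node = cursor.Current;
                sum += node;
            }
        }
        finally
        {
            cursor.Dispose();
        }
        return sum;
    }
}
```

Either way, every pass through the loop leaves at least one short-lived object behind for the collector.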
How can we avoid this collection pressure? Three things come to mind.
1) Don't make an ActiveNodes property in the first place. Make the caller iterate over the pool by index, and check themselves whether the node is available. The sequence is then the pool, which is already allocated, and the cursor is an integer, neither of which creates new collection pressure. The price you pay is duplicated code.
2) As Steven suggests, the compiler will take any types that have the right public methods and properties; they don't have to be IEnumerable and IEnumerator. You can make your own mutable-struct sequence and cursor objects, pass those around by value, and avoid collection pressure. It is dangerous to have mutable structs, but it is possible. Note that List<T> uses this strategy for its enumerator; study its implementation for ideas.
3) Allocate the sequence and the enumerators on the heap normally and pool them too! You're already going with a pooling strategy, so there's no reason why you can't pool an enumerator as well. Enumerators even have a convenient "Reset" method that usually just throws an exception, but you could write a custom enumerator object that used it to reset the enumerator back to the beginning of the sequence when it goes back in the pool.
Most objects are only enumerated once at a time, so the pool can be small in typical cases.
(Now, of course you may have a chicken-and-egg problem here; how are you going to enumerate the pool of enumerators?)
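For concreteness, here is a minimal sketch of options 1 and 2 together. Only `_pool`, `ActiveNodes` and `AvailableInPool` come from the question; `Node`, `NodePool`, `ActiveNodeSequence`, `Enumerator` and `PoolDemo` are hypothetical names made up for illustration, and I am assuming the pool is backed by a `List<Node>`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pooled item; only AvailableInPool comes from the question.
public sealed class Node
{
    public bool AvailableInPool;
}

public sealed class NodePool
{
    private readonly List<Node> _pool = new List<Node>();

    public void Add(Node node) { _pool.Add(node); }

    // Option 1: expose the count and an indexer so callers can loop by index.
    public int Count { get { return _pool.Count; } }
    public Node this[int index] { get { return _pool[index]; } }

    // Option 2: hand back a struct. foreach binds to its public GetEnumerator
    // method by pattern, so no interface and no boxing are involved.
    public ActiveNodeSequence ActiveNodes
    {
        get { return new ActiveNodeSequence(_pool); }
    }

    public struct ActiveNodeSequence
    {
        private readonly List<Node> _pool;

        internal ActiveNodeSequence(List<Node> pool) { _pool = pool; }

        public Enumerator GetEnumerator() { return new Enumerator(_pool); }
    }

    // Mutable struct cursor: lives on the stack of the method running the foreach.
    public struct Enumerator
    {
        private readonly List<Node> _pool;
        private int _index;

        internal Enumerator(List<Node> pool)
        {
            _pool = pool;
            _index = -1;
        }

        public Node Current { get { return _pool[_index]; } }

        // Advance to the next node whose AvailableInPool flag is set.
        public bool MoveNext()
        {
            while (++_index < _pool.Count)
            {
                if (_pool[_index].AvailableInPool) return true;
            }
            return false;
        }
    }
}

public static class PoolDemo
{
    // Option 1 at the call site: the cursor is just an int.
    public static int CountByIndex(NodePool pool)
    {
        int count = 0;
        for (int i = 0; i < pool.Count; i++)
        {
            if (pool[i].AvailableInPool) count++;
        }
        return count;
    }

    // Option 2 at the call site: ordinary foreach syntax over the struct sequence.
    public static int CountByForeach(NodePool pool)
    {
        int count = 0;
        foreach (Node node in pool.ActiveNodes)
        {
            count++;
        }
        return count;
    }
}
```

Note that the structs here deliberately do not implement IEnumerable<T> or IEnumerator<T>. That means they cannot be accidentally boxed by being passed to something that wants the interface, but it also means LINQ and other interface-based code will not accept them. Whether that trade-off is acceptable depends on how the pool is used.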