Serguei Fedorov

Reputation: 7923

Clear object variables after use (pooling) C#

I am working on a pooling algorithm which takes an allocated object and recycles its allocation for later use. However, recycling an object raises an issue. For instance:

 someObject obj = pool.alloc(); //gives me a new object if no previous allocations, If an allocation has been recycled, returns a previous allocation
 obj.someVariable = "foo";
 pool.recycle(obj);

The above code takes an existing allocation and saves it so that I don't have to allocate any extra RAM if I need another someObject later. However, the following creates an issue:

 someObject obj = pool.alloc(); //gives me the above allocation
 obj.otherVariable = "bar";
 obj.dump();

As a result, I get the following output:

  someVariable = foo
  otherVariable = bar

The above approach creates an issue. If I (or someone else) have an algorithm that doesn't set certain variables of an object, the stale values can cause unexpected behavior. I have poked around a little to see whether I can call the default constructor again (bad idea), and C# (thankfully) doesn't seem to allow it. However, I was wondering whether there is some way to use reflection to do this. Also, does clearing the variables in an object defeat the purpose of avoiding malloc (new)? In other words, if I spend the time clearing variables, does the performance gain become minimal? I am trying to teach myself pooling, so any criticism and advice is greatly appreciated!

Upvotes: 0

Views: 970

Answers (1)

Ted Spence

Reputation: 2678

Here's one way of looking at pooling:

Pooling is only really necessary when you are reusing some complicated setup work that you can avoid by retrieving an already existing object. For example, one of the best known applications for object pooling is in the form of "Database Connections".

Pooling works for Database Connections (DBConns) because 1) a DBConn can be trivially identified by a connection string; and 2) a DBConn takes a lot of work and time to establish. The trivial identification is done by matching connection strings: if two connection strings are identical, the connections they establish should also be identical. Also, once you have a connection string, it can take hundreds of milliseconds to look up the server address, open a socket to it, authenticate, and establish the connection. This means that pooling works well for database connections because when a connection is released, it can be reused the next time a connection is requested for the same connection string.

The .NET runtime is exceptionally good at allocating objects quickly and releasing them, so you don't run into memory issues. If all you're worried about is memory use or the speed of allocation, don't; you are unlikely to beat the runtime's allocator by zeroing out memory on your own. However, if your objects have some complex, lengthy setup that can be avoided by retrieving an existing object from a pool, you can find some benefit.

Another good example of pooling is a particle system in a videogame. You have hundreds of particles that have to be created, go through a lifecycle, and be destroyed, only for new particles to be created after the old ones die. A typical particle system will create X number of objects as an array, and have a Reset() function that returns a dead object to life as a newly created particle in the original location. The reason this works is again because particles can be trivially identified, and the setup work (giving the graphics system the texture, setting locations, etc.) does not have to be repeated for each new particle.
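As a rough illustration of the particle example, here is a minimal sketch of a fixed-size array of particles with a Reset() method. All names here (Particle, ParticleSystem, Emit, Update, the lifetime parameter) are hypothetical and chosen only for this example; the motion update is a placeholder and a real system would also deal with textures and rendering.

```csharp
using System;

// Hypothetical particle type; fields are illustrative only.
public class Particle
{
    public float X, Y, Age;
    public bool Alive;

    // Brings a dead particle back to life at the emitter position.
    public void Reset(float originX, float originY)
    {
        X = originX;
        Y = originY;
        Age = 0f;
        Alive = true;
    }
}

// Hypothetical particle system that allocates all particles once, up front.
public class ParticleSystem
{
    private readonly Particle[] _particles;
    private readonly float _originX, _originY, _lifetime;

    public ParticleSystem(int capacity, float originX, float originY, float lifetime)
    {
        _particles = new Particle[capacity];
        for (int i = 0; i < capacity; i++)
        {
            _particles[i] = new Particle();
        }
        _originX = originX;
        _originY = originY;
        _lifetime = lifetime;
    }

    // Revives the first dead particle; returns false if every particle is alive.
    public bool Emit()
    {
        foreach (Particle p in _particles)
        {
            if (!p.Alive)
            {
                p.Reset(_originX, _originY);
                return true;
            }
        }
        return false;
    }

    // Ages live particles and marks them dead once they reach the lifetime.
    public void Update(float dt)
    {
        foreach (Particle p in _particles)
        {
            if (!p.Alive) continue;
            p.Age += dt;
            p.Y += dt; // placeholder motion, not a real physics step
            if (p.Age >= _lifetime) p.Alive = false;
        }
    }

    // Number of particles currently alive.
    public int AliveCount
    {
        get
        {
            int count = 0;
            foreach (Particle p in _particles)
            {
                if (p.Alive) count++;
            }
            return count;
        }
    }
}
```

No particle is allocated after the constructor runs; Emit() only reuses slots whose particle has died.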

Does your application have both a "trivial match identification" capability and a lengthy setup process that can be avoided? If not, just allocate the objects new each time - I would wager you won't see any performance degradation.

EDIT: From a performance perspective, let's look at this here:

Resetting variables = O(N) assignment statements; using reflection can increase that significantly

Instantiate a new object = one allocation (a malloc-style call) plus one constructor call; the cost depends on the constructor code and on memory fragmentation.

Using reflection to reset variables can work, but you have to know in advance that O(N) for assignments is faster than your malloc and constructor. For a database connection, I know that condition is satisfied; but is that true for your pool?
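For reference, a reflection-based reset might look roughly like the sketch below. The names ReflectionReset, Clear and Sample are hypothetical. Note the assumptions: it sets each field to the default value of its type (null, 0, false), not to whatever the constructor or field initializers would have assigned, and it does not reach private fields declared on base classes.

```csharp
using System;
using System.Reflection;

// Hypothetical helper; a sketch, not a tuned implementation.
public static class ReflectionReset
{
    // Sets every writable instance field visible on the object's type to its type default.
    public static void Clear(object target)
    {
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

        foreach (FieldInfo field in target.GetType().GetFields(flags))
        {
            if (field.IsInitOnly) continue; // leave readonly fields alone

            Type type = field.FieldType;
            // Value types get a zeroed instance; reference types get null.
            object defaultValue = type.IsValueType ? Activator.CreateInstance(type) : null;
            field.SetValue(target, defaultValue);
        }
    }
}

// Hypothetical class used only to demonstrate the helper.
public class Sample
{
    public string Name = "initial";
    public int Count = 5;
    private double _ratio = 1.5;

    public double Ratio { get { return _ratio; } }
}
```

Each call walks the field list and boxes value-type defaults, which is the extra per-reset cost referred to above.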

EDIT: From your comments below, you may indeed have found a case where pooling is appropriate. If that's the case, I would suggest the ideal approach is to create a Reset() function for any class that you are pooling. Try creating an IPoolable interface that defines a function Reset(). Then, for each class you pool, define the Reset() function so that it zeroes out all key variables. Because Reset() is ordinary compiled code, it won't incur reflection overhead, and you can maintain object-specific optimizations that wouldn't be possible with dynamic code.

For the pool, define the pool class generically, for example MyPool<T> with T constrained to IPoolable; then whenever a previously recycled object is retrieved, the pool can call Reset() on it before handing it back to the caller.
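A minimal sketch of this idea follows. IPoolable, Reset() and MyPool come from the description above; the generic constraint, the Stack-based storage, the Alloc/Recycle method names and the SomeObject class (modeled on the question's example) are assumptions made for illustration. The sketch is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Anything the pool hands out must know how to reset itself.
public interface IPoolable
{
    void Reset();
}

// Hypothetical pooled class mirroring the fields in the question.
public class SomeObject : IPoolable
{
    public string SomeVariable;
    public string OtherVariable;

    // Hand-written reset: plain assignments, no reflection.
    public void Reset()
    {
        SomeVariable = null;
        OtherVariable = null;
    }
}

// Assumed pool shape: a stack of recycled instances.
public class MyPool<T> where T : IPoolable, new()
{
    private readonly Stack<T> _free = new Stack<T>();

    // Returns a recycled instance (after Reset) if one exists, otherwise a new one.
    public T Alloc()
    {
        if (_free.Count == 0)
        {
            return new T();
        }

        T item = _free.Pop();
        item.Reset();
        return item;
    }

    // Stores the instance for later reuse; the caller must stop using it.
    public void Recycle(T item)
    {
        _free.Push(item);
    }
}
```

With this in place, the scenario from the question no longer leaks old state: after recycling an object whose SomeVariable was "foo", the next Alloc() returns the same instance with SomeVariable cleared.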

Upvotes: 1
