Jonas B

Reputation: 2391

Efficiency of the .NET garbage collector

OK here's the deal. There are some people who put their lives in the hands of .NET's garbage collector and some who simply won't trust it.

I am one of those who partially trusts it, as long as it's not extremely performance critical (I know, I know... performance critical + .NET is not the favored combination), in which case I prefer to manually dispose of my objects and resources.

What I am asking is whether there are any facts as to how efficient or inefficient, performance-wise, the garbage collector really is.

Please don't share any personal opinions or likely-assumptions-based-on-experience; I want unbiased facts. I also don't want any pro/con discussions, because they won't answer the question.

Thanks

Edit: To clarify, I'm basically asking: no matter what application we write, resource-critical or not, can we just forget about everything and let the GC handle it, or can't we?

I'm trying to get an answer on what the GC does and doesn't do in reality, and where it might fail where manual memory management would succeed, IF there are such scenarios. Does it have LIMITATIONS? I don't know how I could possibly explain my question further.

I don't have any issues with any application; it's a theoretical question.

Upvotes: 12

Views: 1374

Answers (5)

J D

Reputation: 48707

I can tell you about some of the problems I've had with .NET's garbage collector.

If you're running an app that uses the server GC (e.g. an ASP.NET app) then your latency will be truly awful, with pauses of around a second when none of your threads can make any progress at all. This is because the .NET 4 server GC is a stop-the-world GC. Apparently, .NET 4.5 will introduce Microsoft's first mostly-concurrent server GC.
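If you are not sure which GC flavour your process actually ended up with, a quick runtime check will tell you (a minimal sketch; GCSettings.IsServerGC requires .NET 4.5 or later):

    using System;
    using System.Runtime;

    class GcInfo
    {
        static void Main()
        {
            // Reports which collector the runtime chose for this process
            // and the latency mode currently in effect.
            Console.WriteLine("Server GC:    " + GCSettings.IsServerGC);
            Console.WriteLine("Latency mode: " + GCSettings.LatencyMode);
        }
    }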

I once wrote some instrumentation code to measure latencies in a concurrent system using built-in collections like ConcurrentBag, and I kept running out of memory on 32-bit due to massive heap fragmentation, because the .NET GC doesn't compact the Large Object Heap (LOH). I had to replace the array-based data structures with purely functional data structures that are scattered into millions of tiny pieces in order to keep everything off the LOH, which was causing the fragmentation.
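To make the LOH point concrete, here is a small sketch (the 85,000-byte threshold is the documented LOH cutoff; the sizes are only illustrative): arrays at or above the threshold go straight onto the Large Object Heap, which that GC never compacts, while the same data split into small chunks stays on the ordinary compacting heaps.

    using System;

    class LohDemo
    {
        static void Main()
        {
            // Objects of 85,000 bytes or more are allocated on the Large
            // Object Heap and are reported as generation 2 immediately.
            var large = new byte[100 * 1000];
            Console.WriteLine(GC.GetGeneration(large));     // 2 -> lives on the LOH

            // The same data split into small chunks stays on the normal
            // small-object heaps, which the GC does compact.
            var chunks = new byte[1000][];
            for (int i = 0; i < chunks.Length; i++)
                chunks[i] = new byte[100];
            Console.WriteLine(GC.GetGeneration(chunks[0])); // typically 0
        }
    }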

I have found bugs in the GC, like this one, that cause the GC to leak memory until all system memory is exhausted, whereupon the heap is cleared out in one huge GC cycle that pauses not only all of your threads but even other processes (because the system has gone to swap) for up to several minutes!

Although there is a "low latency" setting in the latest .NET GC, it actually just turns garbage collection off, so your program leaks memory until you get one massive GC pause. Microsoft seem to prefer workarounds like this, which are tantamount to saying "write your own garbage collector if you want usable latency".
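For reference, the setting in question is GCSettings.LatencyMode; the documented usage pattern scopes it to the critical region and restores the previous mode afterwards (a sketch; DoCriticalWork is a placeholder):

    using System;
    using System.Runtime;

    class LowLatencyDemo
    {
        static void Main()
        {
            GCLatencyMode oldMode = GCSettings.LatencyMode;
            try
            {
                // Suppress blocking collections while the critical section runs
                // (LowLatency applies to the workstation GC).
                GCSettings.LatencyMode = GCLatencyMode.LowLatency;
                DoCriticalWork();   // placeholder for the latency-sensitive work
            }
            finally
            {
                // Always restore the previous mode, even on exceptions.
                GCSettings.LatencyMode = oldMode;
            }
        }

        static void DoCriticalWork() { /* ... */ }
    }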

However, the .NET GC is generally very good and, when used carefully, it is possible to get good results from it. For example, I recently wrote a fault-tolerant server that achieves 114µs door-to-door latency on average with 95% of latencies below 0.5ms. That is impressively close to the state-of-the-art (see here and here) given that I wrote the entire platform in F# by myself in just a few months. The network actually contributed more to the latency than the .NET GC did.

Upvotes: 5

Thomas Pornin

Reputation: 74522

You cannot always forget about memory allocation, regardless of whether you use a GC or not. What a good GC implementation buys you is that most of the time you can afford not to think about memory allocation. However, there is no ultimate memory allocator. For something critical, you have to be aware of how memory is managed, and this implies knowing how things are done internally. This is true for GC and for manual heap allocation alike.

There are some GCs which offer real-time guarantees. "Real-time" does not mean "fast"; it means that the allocator response time can be bounded. This is the kind of guarantee that is needed for embedded systems such as those which drive the electric flight controls in a plane. Strangely enough, it is easier to have real-time guarantees with garbage collectors than with manual allocators.

The GCs in the current .NET implementations are not real-time; they are heuristically efficient and fast. Note that the same can be said about manual allocation with malloc() in C (or new in C++), so if you are after real-time guarantees you already need to use something special. If you do not, then I do not want you to design the embedded electronics for the cars and planes I use!

Upvotes: 2

Daniel Earwicker

Reputation: 116764

You do not need to worry about this.

The reason is that if you ever find an edge case where the GC is taking up a significant amount of time, you will then be able to deal with it by making spot optimisations. This won't be the end of the world - it will probably be pretty easy.

And you are unlikely to find such edge cases. It really performs amazingly well. If you've only experienced the heap allocators in typical C and C++ implementations, the .NET GC is a completely different animal. I was so amazed by it that I wrote this blog post to try to get the point across.

Upvotes: 3

David

Reputation: 2795

Any GC algorithm will favor certain activity (i.e., optimization). You will have to test the GC against your usage pattern to see how efficient it is for you. Even if someone else studied particular behavior of the .NET GC and produced "facts" and "numbers", your results could be wildly different.

I think the only reasonable answer to this question is anecdotal. Most people don't have a problem with GC efficiency, even in large-scale situations. It is considered at least as efficient as, or more efficient than, the GCs of other managed languages. If you are still concerned, you probably should not be using a managed language.
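If you do want numbers for your own workload, the measurement is cheap to set up. A sketch of the idea (RunRepresentativeWorkload is a placeholder for your usage pattern):

    using System;
    using System.Diagnostics;

    class GcProfile
    {
        static void Main()
        {
            // Snapshot collection counts before the run.
            int gen0 = GC.CollectionCount(0);
            int gen1 = GC.CollectionCount(1);
            int gen2 = GC.CollectionCount(2);

            var sw = Stopwatch.StartNew();
            RunRepresentativeWorkload();
            sw.Stop();

            Console.WriteLine("Elapsed: {0} ms", sw.ElapsedMilliseconds);
            Console.WriteLine("Gen0 collections: {0}", GC.CollectionCount(0) - gen0);
            Console.WriteLine("Gen1 collections: {0}", GC.CollectionCount(1) - gen1);
            Console.WriteLine("Gen2 collections: {0}", GC.CollectionCount(2) - gen2);
        }

        static void RunRepresentativeWorkload()
        {
            // Simulate an allocation-heavy pattern; replace with your own code.
            for (int i = 0; i < 1000000; i++)
            {
                var s = new string('x', 32);
                GC.KeepAlive(s);
            }
        }
    }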

Upvotes: 1

Remus Rusanu

Reputation: 294487

It is efficient enough for most applications, and you don't have to live in fear of the GC. But on really hot systems with low-latency requirements, you should program in a fashion that avoids triggering it altogether. I suggest you look at this Rapid Addition white paper:

Although GC is performed quite rapidly, it does take time to perform, and thus garbage collection in your continuous operating mode can introduce both undesirable latency and variation in latency in those applications which are highly sensitive to delay. As an illustration, if you are processing 100,000 messages per second and each message uses a small temporary 2 character string, around 8 bytes (this is a function of string encoding and the implementation of the string object) is allocated for each message. Thus you are creating almost 1MB of garbage per second. For a system which may need to deliver constant performance over a 16 hour period this means that you will have to clean up 16 hours x 60 minutes x 60 seconds x 1MB of memory, or approximately 56 GB of memory. The best you can expect from the garbage collector is that it will clean this up entirely in either Generation 0 or 1 collections and cause jitter, the worst is that it will cause a Generation 2 garbage collection with the associated larger latency spike.

But be warned, pulling off such tricks as avoiding GC impact is really hard. You really need to ponder whether you are at that point in your perf requirements where you need to consider the impact of GC.
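As a taste of what programming around the GC looks like, here is a sketch of the kind of trick the white paper is hinting at: preallocate and reuse buffers so the steady-state message loop allocates nothing per message (the frame format and names are made up for illustration):

    using System;

    class ZeroAllocLoop
    {
        // Preallocated once and reused for every message: no per-message garbage.
        static readonly char[] SymbolBuffer = new char[2];

        static void Main()
        {
            byte[] incoming = { (byte)'G', (byte)'C' };   // stand-in for a network frame
            for (int i = 0; i < 5; i++)
                HandleMessage(incoming);
        }

        static void HandleMessage(byte[] frame)
        {
            // Decode into the reused buffer instead of creating a temporary
            // string, so this path allocates nothing after warm-up.
            SymbolBuffer[0] = (char)frame[0];
            SymbolBuffer[1] = (char)frame[1];
            // ... dispatch on SymbolBuffer without creating strings or boxing ...
        }
    }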

Upvotes: 9
