Mehraj Malik

Reputation: 15854

Why does the garbage collector stop all the threads before reclaiming the memory?

I was reading a performance-related post, and I came across the sentence "Java's garbage collector stops all the threads before reclaiming the memory, which is also a performance issue". I tried to find an explanation on Google, but I could not.

Could someone please share something so that I can be clear about this?

Upvotes: 6

Views: 3666

Answers (2)

Dimitar Dimitrov

Reputation: 16357

The short and not very informative answer is "because it's damn hard not to", so let's elaborate.

There are many built-in collectors in the HotSpot JVM (see https://blogs.oracle.com/jonthecollector/entry/our_collectors). The generational collectors have evolved significantly, but to this day they still cannot achieve full concurrency. They can concurrently mark which objects are dead and which are not, and they can concurrently sweep the dead objects, but they still cannot compact the fragmented living objects without stopping your program (*).

This is mostly because it's really, really hard to ensure that you are not breaking someone's program by changing an object's heap location and updating all references to it, without stopping the world and doing it all in one fell swoop. It's also hard to ensure that the living objects you are moving are not being changed under your nose.
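You can actually observe these stop-the-world phases from inside a running JVM. As an illustrative sketch (the class name is mine), the standard `java.lang.management` API exposes, per collector, the cumulative number of collections and the total time spent in them:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcPauses {
    public static void main(String[] args) {
        // Churn through some short-lived allocations to encourage a collection.
        for (int i = 0; i < 1_000_000; i++) {
            byte[] junk = new byte[128];
        }
        // Each MXBean covers one collector (e.g. young-gen and old-gen)
        // and reports cumulative collection counts and elapsed time,
        // most of which is stop-the-world work.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                + ": collections=" + gc.getCollectionCount()
                + ", timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Exactly which bean names show up depends on the collector the JVM was started with.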

Generational collectors can still run for significant stretches of time without stopping the world, but their algorithms only delay the inevitable; they do not guarantee fully concurrent GC. Notice how phrases like mostly concurrent (i.e. not always concurrent) are used when describing many GC algorithms.

There are also region-based collectors like G1GC, and they can show awesome results (to the point that G1GC became the default collector in HotSpot with JDK 9), but they still cannot guarantee that there will be no stop-the-world pauses. The problem is again compaction, but for G1 it is somewhat less painful, because it compacts incrementally, region by region, in short evacuation pauses rather than in one long pause.

To say that this scratches the surface would be generous - the area is gigantic, and I'd recommend going over some of the accessible materials on the topic, like Gil Tene's Understanding Garbage Collection for some of the theory behind it, or Emad Benjamin's Virtualizing and Tuning Large Scale JVMs for some practical problems and solutions.

(*) This is not an entirely unsolvable problem, though. Azul's Zing JVM with its C4 garbage collector claims fully concurrent collection (there's a whitepaper about it, but you may find the details here more interesting). OpenJDK's Shenandoah project also shows very promising results. Still, as the8472 has explained, you pay some price in throughput and a significant price in complexity. The G1GC team considered going for a fully concurrent algorithm, but decided that the benefits of a STW collector outweigh those of a fully concurrent one in the context of G1GC.

Upvotes: 9

the8472

Reputation: 43052

In principle they don't have to. But a fully concurrent garbage collector

  • is incredibly complex to write and use
  • requires more breathing room (spare heap) to operate efficiently, which may not be acceptable on memory-constrained devices
  • incurs significant performance overhead. You end up trading throughput (CPU cycles spent in mutators vs. CPU cycles spent in collector threads) for improved (zeroish) pause times. This may be an acceptable deal on very large heaps and many-core machines that provide interactive services, but may not be acceptable on small devices or services that do batch processing.

Conversely, an implementation using stop-the-world pauses is simpler and more efficient in terms of CPU utilization, with the downside that there are pauses.
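One way to see this trade-off from the mutator's point of view is a jHiccup-style stall detector - a sketch, where the 1 ms interval and the class name are my choices: a thread repeatedly sleeps for a fixed interval and records how much longer it actually slept. Any large excess beyond normal scheduling jitter is time the thread was stalled, for example by a stop-the-world pause.

```java
public class PauseDetector {
    // Sleep for 1 ms repeatedly and record the worst overshoot in nanoseconds.
    static long maxStallNanos(int iterations) throws InterruptedException {
        long worst = 0;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            Thread.sleep(1);                                     // request a 1 ms nap
            long excess = System.nanoTime() - start - 1_000_000; // time beyond the 1 ms asked for
            if (excess > worst) worst = excess;
        }
        return worst;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.printf("max observed stall: %.2f ms%n",
                maxStallNanos(100) / 1_000_000.0);
    }
}
```

On an idle heap this mostly measures OS scheduling jitter; run it alongside an allocation-heavy workload and the worst-case number starts to reflect GC pauses too.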

Additionally, you have to consider that pause times can be fairly low on small heaps if we're using humans as the yardstick. So low-pause/pauseless collectors are generally only worth it if you either have a system with tighter latency tolerances than humans or larger heaps - something that was historically rare and has only become more common as computers kept growing.

To avoid diving into the details, let's consider reference counting instead of mark-sweep-compact collectors. Fully concurrent reference counting incurs the overhead of atomic memory accesses, and potentially more complex counting schemes with higher memory footprints.
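A minimal sketch of what that atomic bookkeeping looks like (the class and method names are hypothetical): every time a reference is created or dropped, the counter needs an atomic read-modify-write, which is exactly the per-operation cost mentioned above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a concurrently reference-counted object.
class RefCounted {
    private final AtomicInteger refs = new AtomicInteger(1); // creator holds one reference

    void acquire() {
        refs.incrementAndGet();        // atomic RMW on every new reference
    }

    void release() {
        if (refs.decrementAndGet() == 0) {
            free();                    // last reference dropped: reclaim immediately
        }
    }

    int count() {
        return refs.get();
    }

    private void free() {
        // Resource cleanup would go here - no tracing phase, no pause needed.
    }
}
```

Note that this sketch side-steps the hard parts: under heavy sharing the atomic operations contend on the object's cache line, and reference cycles are never reclaimed without a backup tracing collector.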

Upvotes: 6
