Oleg Golovkov

Reputation: 333

dotnet-gcdump unexpected dump size and impact

I'm running a simple CRUD app built with ASP.NET Core and EF Core 3.1 in a Docker Swarm cluster on Ubuntu. I'm only using managed code. The container has a 10 GB memory limit specified. I can inspect the running container and verify that this limit is actually set, and I also see that DOTNET_RUNNING_IN_CONTAINER is set to true. When the app starts, memory consumption is about 700 MB, and it slowly builds up. Once it reaches 7 GB (I see it in the generic container stats), I start getting OutOfMemoryExceptions, and it stays at this level for days. So the first question is

  1. Why doesn't it go up to 10 GB?

Anyway, I suspect a memory leak, so I have the dotnet-gcdump tool installed in this same container, and I go ahead and collect a dump for future analysis with dotnet-gcdump collect. Once I execute this command, the memory consumption of the running container drops from 7 GB to 3 GB and stays at this level. The resulting .gcdump file itself, though, is only ~200 MB in size, with nothing suspicious in it. So the next questions are

  2. How does collecting a dump reduce memory consumption? I'd assume it performs a GC with LOH compaction, but the docs don't mention it.
  3. Why isn't this memory freed automatically if the tool is able to free it?
  4. Why is the resulting dump only ~200 MB in size?
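
For reference, the collection step looked roughly like this (the PID is illustrative; dotnet-gcdump ps shows the real ones):

    # list candidate .NET processes in the container
    dotnet-gcdump ps

    # trigger the collection; -p is the target PID, -o the output path
    dotnet-gcdump collect -p 1 -o /tmp/app.gcdump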

Upvotes: 2

Views: 1044

Answers (1)

Konrad Kokosa

Reputation: 16898

As the gcdump documentation explains: "GC dumps are created by triggering a GC in the target process, turning on special events, and regenerating the graph of object roots from the event stream".

Thus, it directly answers your question 2 - it triggers a full GC, which may or may not be compacting, but it collects gen2 for sure. It also answers question 4 - it is not a "memory dump" but a special kind of diagnostic data about the object graph (dependencies and type names), without the object data itself.
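
You can verify this yourself: a .gcdump opens in Visual Studio or PerfView, and the tool can also print a per-type summary of object counts and sizes to stdout (file name hypothetical):

    dotnet-gcdump report /tmp/app.gcdump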

As regards questions 1 and 3 - this is an example of the GC not being "aggressive" enough. It is a kind of "living on the edge" problem: the process is running close to the container's limit, and the GC is sometimes unable to react to that correctly. In other words, it thinks it has enough space when it doesn't (be warned that this is a huge oversimplification). In such a case, full GCs may not happen, or happen too late. I would confirm that by observing the process with dotnet-trace using the gc-collect profile.
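
A minimal sketch of that, assuming dotnet-trace is installed in the same container (the PID is illustrative):

    # gc-collect is a built-in low-overhead profile that records GC collection events
    dotnet-trace collect -p 1 --profile gc-collect

The resulting .nettrace file can then be opened in PerfView or Visual Studio to see when collections of each generation actually ran.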

As a solution, consider setting the limit manually with GCHeapHardLimit, to some clearly smaller value like 8 GB.
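
As a sketch, the limit can be set in the app's runtimeconfig.json (the value is in bytes; 8 GB is 8589934592):

    {
      "runtimeOptions": {
        "configProperties": {
          "System.GC.HeapHardLimit": 8589934592
        }
      }
    }

or, on .NET Core 3.1, via an environment variable, which takes a hexadecimal value (0x200000000 is the same 8 GB):

    COMPlus_GCHeapHardLimit=0x200000000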

Upvotes: 3
