Reputation: 675
Why is garbage collection so much slower when a large number of mutable structs are loaded in memory as compared with non-mutable structs? The object tree should have the same size in both cases.
julia> struct Foo
a::Float64
b::Float64
c::Float64
end
julia> mutable struct Bar
a::Float64
b::Float64
c::Float64
end
julia> @time dat1 = [Foo(0.0, 0.0, 0.0) for i in 1:1e9];
9.706709 seconds (371.88 k allocations: 22.371 GiB, 0.14% gc time)
julia> @time GC.gc(true)
0.104186 seconds, 100.00% gc time
julia> @time GC.gc(true)
0.124675 seconds, 100.00% gc time
julia> @time dat2 = [Bar(0.0, 0.0, 0.0) for i in 1:1e9];
71.870870 seconds (1.00 G allocations: 37.256 GiB, 73.88% gc time)
julia> @time GC.gc(true)
47.695473 seconds, 100.00% gc time
julia> @time GC.gc(true)
41.809898 seconds, 100.00% gc time
Upvotes: 3
Views: 581
Reputation: 76
Non-mutable structs may be stored directly inside an Array. This will never happen for mutable structs. In your case, the Foo
objects are all stored directly in dat1
, so there is effectively just a single (albeit very large) allocation reachable after creating the Arary.
In the case of dat2
, every Bar
object will have its own piece of memory allocated for it and the Array will contain references to these objects. So with dat2
, you end up with 1G + 1 reachable allocations.
You can also see this using Base.sizeof
:
julia> sizeof(dat1)
24000000000
julia> sizeof(dat2)
8000000000
You'll see that dat1
is 3 times as large, as every array entry contains the 3 Float64
s directly, while the entries in dat2
take up just the space for a pointer each.
As a side note: For these kinds of tests, it is a good idea to use BenchmarkTools.@btime
instead of the built-in @time
. As it removes the compilation overhead from the result and also runs your code multiple times in order to give you a more representative result:
@btime dat1 = [Foo(0.0, 0.0, 0.0) for i in 1:1e6]
2.237 ms (2 allocations: 22.89 MiB)
@btime dat2 = [Bar(0.0, 0.0, 0.0) for i in 1:1e6]
6.878 ms (1000002 allocations: 38.15 MiB)
As seen above, this is particularly useful to debug allocations. For dat1
we get 2 allocations (one for the Array
instance itself and one for the chunk of memory where the array stores its data), while with dat2
we have an additional allocation per element.
Upvotes: 5