Reputation: 53
I have a simple question. I have defined a struct, and I need to inititate a lot (in the order of millions) of them and loop over them.
I am initiating one at a time and going through the loop as follows:
using Distributions
mutable struct help_me{Z<:Bool}
can_you_help_me::Z
millions_of_thanks::Z
end
for i in 1:max_iter
tmp_help = help_me(rand(Bernoulli(0.5),1)[1],rand(Bernoulli(0.99),1)[1])
# many follow-up processes
end
The memory allocation scales up in max_iter. For my purpose, I do not need to save each struct. Is there a way to "re-use" the memory allocation used by the struct?
Upvotes: 5
Views: 586
Reputation: 12654
Your main problem lies here:
rand(Bernoulli(0.5),1)[1], rand(Bernoulli(0.99),1)[1]
You are creating a length-1 array and then reading the first element from that array. This allocates unnecessary memory and takes time. Don't create an array here. Instead, write
rand(Bernoulli(0.5)), rand(Bernoulli(0.99))
This will just create random scalar numbers, no array.
Compare timings here:
julia> using BenchmarkTools
julia> @btime rand(Bernoulli(0.5),1)[1]
36.290 ns (1 allocation: 96 bytes)
false
julia> @btime rand(Bernoulli(0.5))
6.708 ns (0 allocations: 0 bytes)
false
6 times as fast, and no memory allocation.
This seems to be a general issue. Very often I see people writing rand(1)[1]
, when they should be using just rand()
.
Also, consider whether you actually need to make the struct mutable
, as others have mentioned.
Upvotes: 4
Reputation: 1158
If the structure is not needed anymore (i.e. not referenced anywhere outside the current loop iteration), the Garbage Collector will free up its memory automatically if required.
Otherwise, I agree with the suggestions of Oscar Smith: memory allocation and garbage collection take time, avoid it for performance reasons if possible.
Upvotes: 1