Reputation: 1532
Consider the following simple program in Julia:
function foo_time(x)
    @time x.^2
    return nothing
end
n = 1000;
foo_time(collect(1:n));
If I run that in my console, then @time reports 1 allocation, which is what I expect. However, if I change n to 10000, then @time reports 2 allocations.
What is more, if I chain together functions without syntactic loop fusion (in other words, without dots), then I seem to get double the expected allocations. For example, writing (x + x).^2 + x instead of x.^2 yields 3 allocations with n = 1000, but 6 allocations with n = 10000. (The pattern does not strictly continue, though: for instance, (x + x + x).^2 only yields 5 allocations for n = 10000.)
Why should the size of the vector affect how many allocations occur? What is going on under the hood here?
This occurs both in the JupyterLab console and in the normal Julia REPL.
Upvotes: 4
Views: 175
Reputation: 1446
I agree with Matt: for this simple task, the number of allocations is not a good indicator.
If you want to dive into the details and understand precisely how your code is compiled and executed, I suggest these macros: @code_llvm, @code_lowered, @code_native, @code_typed and @code_warntype. The subtleties that distinguish these macros are explained in detail in the Julia docs here and there.
julia> f(x) = x.^2
f (generic function with 1 method)
julia> @code_lowered f(randn(10000))
CodeInfo(
1 ─ %1 = (Core.apply_type)(Base.Val, 2)
│   %2 = (%1)()
│   %3 = (Base.broadcasted)(Base.literal_pow, Main.:^, x, %2)
│   %4 = (Base.materialize)(%3)
└──      return %4
)
julia> f2(x) = (x + x).^2 + x
f2 (generic function with 1 method)
julia> @code_lowered f2(randn(10000))
CodeInfo(
1 ─ %1 = x + x
│   %2 = (Core.apply_type)(Base.Val, 2)
│   %3 = (%2)()
│   %4 = (Base.broadcasted)(Base.literal_pow, Main.:^, %1, %3)
│   %5 = (Base.materialize)(%4)
│   %6 = %5 + x
└──      return %6
)
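The lowered code for f2 shows three separate array-producing steps: x + x, the materialized .^ broadcast, and the final + x. As a rough sketch of what that implies, the snippet below compares that undotted expression with a fully dotted version using @allocated. The names unfused, fused, and alloc_bytes are hypothetical helpers introduced only for this illustration, and the exact byte counts depend on the Julia version and platform, so none are quoted here.

```julia
# Undotted arithmetic: each step produces its own array.
unfused(x) = (x + x).^2 + x

# Fully dotted: @. fuses the whole expression into one loop with one result array.
fused(x) = @. (x + x)^2 + x

# Hypothetical helper: call f once to compile it, then measure the bytes
# allocated by a second call.
function alloc_bytes(f, x)
    f(x)
    return @allocated f(x)
end

x = collect(1:10_000)
println("unfused: ", alloc_bytes(unfused, x), " bytes")
println("fused:   ", alloc_bytes(fused, x), " bytes")
```

Both functions return the same values; the fused version should simply report fewer bytes because it does not materialize the intermediate arrays.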
Upvotes: 1
Reputation: 31362
Why is there one allocation with small vectors and two allocations with big vectors?
Really, this doesn't matter; it is an internal detail of how arrays work. Essentially, there are two parts of a Julia Array: the internal header (which keeps track of the array's dimensionality, element type, and such) and the data itself. When arrays are small, there's an advantage in bundling these two segments together, but when arrays are big, there's an advantage in keeping them separate. This isn't a broadcasting thing; it's just an Array allocation thing:
julia> f(n) = (@time Vector{Int}(undef, n); nothing)
f (generic function with 1 method)
julia> f(2048)
0.000003 seconds (1 allocation: 16.125 KiB)
julia> f(2049)
0.000003 seconds (2 allocations: 16.141 KiB)
Then hopefully you can see why this leads to double the number of allocations for large arrays when there are temporaries involved: there's one for each array's header and one for each array's data.
In short, don't worry too much about the number of allocations. There are times when allocations can actually improve performance. The time to be concerned is when you see a huge number of allocations, especially if they are proportional to the number of elements in the array.
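To make the "proportional to the number of elements" warning sign concrete, here is a minimal sketch. The functions sum_pairs_alloc, sum_pairs_noalloc, and measure are hypothetical names made up for this illustration. The first builds a small temporary vector on every loop iteration; the second does the same arithmetic on scalars. Whether the compiler can elide the temporary may vary between Julia versions, so treat the printed numbers as something to observe rather than a guarantee.

```julia
# Hypothetical example: creates a fresh 2-element vector on every iteration,
# so the allocation count can grow with length(x).
function sum_pairs_alloc(x)
    s = zero(eltype(x))
    for i in 1:length(x)-1
        pair = [x[i], x[i+1]]   # temporary array inside the loop
        s += sum(pair)
    end
    return s
end

# Same computation on scalars, with no temporary arrays.
function sum_pairs_noalloc(x)
    s = zero(eltype(x))
    for i in 1:length(x)-1
        s += x[i] + x[i+1]
    end
    return s
end

# Hypothetical helper: warm up first so compilation is not measured.
function measure(f, x)
    f(x)
    return @allocated f(x)
end

x = collect(1:10_000)
println("with temporaries:    ", measure(sum_pairs_alloc, x), " bytes")
println("without temporaries: ", measure(sum_pairs_noalloc, x), " bytes")
```

Running @time on the two functions for several values of n is a quick way to check whether the allocation count scales with the input size.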
Upvotes: 6