Reputation: 7326
I tried to do some quick benchmarking using Benchfella:
defmodule ConcatListBench do
  use Benchfella

  @a1 Enum.to_list(1..10_000)
  @a2 Enum.to_list(10_000..20_000)

  bench "++" do
    @a1 ++ @a2
  end

  bench "Enum.concat" do
    Enum.concat(@a1, @a2)
  end
end
And here is the output from running it:
$ elixir -v
Erlang/OTP 19 [erts-8.0.2] [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]
Elixir 1.4.0-dev (762e7de)
$ mix bench
Settings:
duration: 1.0 s
## ConcatListBench
[10:01:09] 1/2: ++
[10:01:20] 2/2: Enum.concat
Finished in 14.03 seconds
## ConcatListBench
benchmark name    iterations   average time
++                1000000000   0.01 µs/op
Enum.concat            50000   45.03 µs/op
The question is: how can Enum.concat be over 4,000 times slower if it uses the ++ operator internally for lists? I understand that the guard clauses and pattern matching in Enum.concat cost some time, but the benchmark shows a huge difference, doesn't it?
UPDATE: This happens because of constant folding: the concatenation with ++ is evaluated at compile time, so it takes practically no time at runtime. The benchmark is therefore not realistic.
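One way to double-check this outside Benchfella (a quick sketch, not part of the original benchmark) is to build the lists at runtime, so nothing can be folded, and time single calls with :timer.tc:
a = Enum.to_list(1..10_000)
b = Enum.to_list(10_000..20_000)

# :timer.tc/1 runs the fun once and returns {microseconds, result};
# single runs are noisy, but they show that ++ is no longer "free".
{plus_us, _} = :timer.tc(fn -> a ++ b end)
{concat_us, _} = :timer.tc(fn -> Enum.concat(a, b) end)

IO.puts("++: #{plus_us} µs, Enum.concat: #{concat_us} µs")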
Upvotes: 10
Views: 1885
Reputation: 222158
Short answer: Constant Folding.
Longer answer: Module attributes in Elixir are replaced with their literal values when the code is compiled to BEAM files. For example, the following code:
defmodule ConcatListBench do
  @a1 Enum.to_list(1..10)
  @a2 Enum.to_list(10..20)

  def plusplus, do: @a1 ++ @a2
  def concat, do: Enum.concat(@a1, @a2)
end
compiles to:
-module('Elixir.ConcatListBench').
...
concat() ->
    'Elixir.Enum':concat([1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                         [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]).

plusplus() ->
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] ++
      [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20].
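If you want to see this output yourself, one way (a sketch, and it assumes the module was compiled with debug info, which mix does by default in dev) is to pretty-print the abstract code stored in the .beam file:
# Sketch: pretty-print a compiled module's Erlang abstract code. Requires the
# :abstract_code chunk to be available, i.e. the module was compiled with
# debug info.
beam = :code.which(ConcatListBench)

{:ok, {_module, [abstract_code: {_version, forms}]}} =
  :beam_lib.chunks(beam, [:abstract_code])

IO.puts(:erl_prettypr.format(:erl_syntax.form_list(forms)))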
The Erlang compiler's sys_core_fold module, which performs constant folding, evaluates ++ operations at compile time whenever it can. Since both lists here are literals, it can eliminate the call entirely and replace it with the resulting list. So in your benchmark, the ++ function is just returning a list that already exists in the VM. It's as fast as doing 1 + 2 (which is also constant folded to 3):
...
bench "1 + 2" do
  1 + 2
end
...
## ConcatListBench
benchmark name    iterations   average time
1 + 2             1000000000   0.01 µs/op
++                1000000000   0.01 µs/op
Enum.concat            50000   37.89 µs/op
A more realistic benchmark is to call ++ indirectly, which the Erlang compiler does not fold:
def plus_plus(a, b), do: a ++ b

bench "++" do
  plus_plus(@a1, @a2)
end
These are the outputs of 3 runs:
## ConcatListBench
benchmark name    iterations   average time
Enum.concat            50000   37.44 µs/op
++                     50000   41.65 µs/op

## ConcatListBench
benchmark name    iterations   average time
++                     50000   36.07 µs/op
Enum.concat            50000   38.58 µs/op

## ConcatListBench
benchmark name    iterations   average time
Enum.concat            50000   39.34 µs/op
++                     50000   40.74 µs/op
So really, if your lists aren't constant at compile time, both ways are just as fast. I would expect Enum.concat to be a tiny bit slower (especially for small lists) as it does a bit more work than ++.
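For reference, the extra work is small: when both arguments are lists, Enum.concat/2 essentially just checks a guard and delegates to ++. Roughly (paraphrased; the exact source differs between Elixir versions):
# Paraphrased sketch of Enum.concat/2's list clause: a guard check, then plain ++.
def concat(left, right) when is_list(left) and is_list(right) do
  left ++ right
end

# Anything else goes through the generic Enumerable-based path.
def concat(left, right) do
  concat([left, right])
end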
Upvotes: 21