Reputation: 5462
I've found that counting with Enum.map |> Enum.sum
works much faster than Enum.count
. But why is not the built-in count function efficient?
Compare these two lines:
elements |> Enum.count(&check/1)
elements |> Enum.map(&(if check(&1), do: 1, else: 0)) |> Enum.sum
The results of tests with different length of list:
len | map(), µs | count(), µs
===============================
10 | 0.67 | 2.55
100 | 5.52 | 8.91
1000 | 59.00 | 73.12
10000 | 642.50 | 734.60
Source code: https://gist.github.com/artemrizhov/fc146f7ab390f7a807be833099e5cb83
$ elixir --version
Erlang/OTP 19 [erts-8.1] [source-e7be63d] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false]
Elixir 1.3.4
Upvotes: 10
Views: 3872
Reputation: 11278
This is because both Enum.map/2
and Enum.reduce/3
(that is used by sum
) are heavily optimised for lists, while Enum.count/2
has only a generic enumerable version.
To add to confusion, there's even a faster implementation:
elements |> Enum.filter(&check/1) |> Enum.count
On my machine, using modified benchmark you provided I get a consistent result of:
len = 10
map: 2.1706 μs
count: 7.0754 μs
filter: 1.9765 μs
len = 100
map: 19.921 μs
count: 27.579 μs
filter: 14.591 μs
len = 1000
map: 168.93 μs
count: 178.93 μs
filter: 163.43 μs
len = 10000
map: 2128.9 μs
count: 1822.1 μs
filter: 1664.6 μs
That said both the "map" and the "filter" version produce more garbage when they operate, so some of the time gains might be consumed by the extended garbage collection time. This is already visible in the benchmarks, as the relative gains between the versions decrease as the length of the length of the list (and the amount of intermediate garbage produced) increase.
EDIT: I submitted a PR that optimises the Enum.count/2
function to be the fastest one https://github.com/elixir-lang/elixir/pull/5567
Upvotes: 20