Reputation: 74
When all you've got is a hammer, everything looks like a nail. So it can be said of the Array#each method in Ruby, before discovering the utility, elegance, and syntactic pleasure of Array#map, Array#select, and the other iteration methods. What I'm curious about is:
Why is there an actual increase in performance when using a more precise iteration method? Is this true in general?
For example, in this benchmark:
require 'benchmark'
array = (1..100000).to_a
puts Benchmark.measure {
  100.times do
    array.map { |el| el.even? }
  end
}
puts Benchmark.measure {
  100.times do
    new_array = []
    array.each do |el|
      new_array << el.even?
    end
  end
}
# ruby bench.rb
# 0.450598 0.015524 0.466122 ( 0.466802)
# 0.496796 0.018525 0.515321 ( 0.516196)
Benchmark always shows a timing difference in favor of Array#map. And in the following code:
puts Benchmark.measure {
  100.times do
    array.select { |el| el.even? }
  end
}
puts Benchmark.measure {
  100.times do
    new_array = []
    array.each do |el|
      if el.even?
        new_array << el
      end
    end
  end
}
# ruby bench.rb
# 0.405254 0.007965 0.413219 ( 0.413733)
# 0.471416 0.008875 0.480291 ( 0.481079)
Array#select beats a jerry-rigged Array#each every time.
So why do these more precise methods produce notably better performance? And is this a general axiom in Ruby and/or all languages?
Upvotes: 2
Views: 1446
Reputation: 11
When analyzing algorithms, we mostly consider time complexity and space complexity. Before comparing different algorithms for a specific task, the first and foremost thing is to design them so that they perform the same task and return the same desired output.
Let's write a program that performs the same task (iterating through the array 100 times, nothing else) without storing any result (because I am not sure what kind of output you want).
Here is the code snippet for the bench.rb file:
require 'benchmark'
array = (1..100000).to_a
puts Benchmark.measure {
  100.times do
    array.map { |el| el.even? }
  end
}
puts Benchmark.measure {
  100.times do
    array.each { |el| el.even? }
  end
}
puts Benchmark.measure {
  100.times do
    array.select { |el| el.even? }
  end
}
I have run this code 3 times and the results are as follows:
Output:
Attempt 1:
0.548562 0.021844 0.570406 ( 0.571088)
0.457079 0.000345 0.457424 ( 0.457774)
0.516487 0.010758 0.527245 ( 0.527843)
Attempt 2:
0.544863 0.021756 0.566619 ( 0.568487)
0.458062 0.000514 0.458576 ( 0.459249)
0.508665 0.010847 0.519512 ( 0.520401)
Attempt 3:
0.583084 0.022554 0.605638 ( 0.606023)
0.509447 0.000665 0.510112 ( 0.511088)
0.548483 0.012212 0.560695 ( 0.561534)
Based on this example, I can see Array#each as the clear winner. The output may vary based on your requirements, but the ground rule stays the same: the algorithms being compared should return the same desired output.
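As a quick illustration of that ground rule, one way to check equivalence before timing anything is a sketch like the following (the variable names are just illustrative):
array = (1..100000).to_a

# Build the result both ways once, outside any timing loop.
mapped = array.map { |el| el.even? }

pushed = []
array.each { |el| pushed << el.even? }

# Only proceed to benchmarking if both approaches really produce the same output.
raise "the two versions are not equivalent" unless mapped == pushed
puts "both versions return the same #{mapped.size}-element array"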
Upvotes: 1
Reputation: 369458
In both of your examples, the second piece of code allocates 100 times as much memory as the first piece of code. It also performs approximately log_1.5(100) resizes of the array (assuming a standard textbook implementation of a dynamic array with a growth factor of 1.5). Resizing an array is expensive: allocating a new chunk of memory, then an O(n) copy of all elements into the new chunk of memory. More generally, garbage collectors hate mutation; they are much more efficient at collecting lots of small short-lived objects than at keeping alive a few large long-lived objects.
In other words, in the first example you are measuring Array#map and Array#select, respectively, whereas in the second example you are not only measuring Array#each, but also Array#<< as well as array resizing and memory allocation. It is impossible to tell from the benchmarking results which of those contributes how much. As Zed Shaw once put it: "If you want to measure something, then don't measure other shit".
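One rough way to start separating those contributions (only a sketch, subject to all the caveats about microbenchmarks below; the labels are mine) is to time the pieces individually:
require 'benchmark'

array = (1..100000).to_a

Benchmark.bm(12) do |x|
  # Iteration plus the block call, with the result thrown away.
  x.report("each only") do
    100.times { array.each { |el| el.even? } }
  end

  # Iteration plus the block call plus Array#<< plus resizing and allocation.
  x.report("each + <<") do
    100.times do
      new_array = []
      array.each { |el| new_array << el.even? }
    end
  end

  # The specialized method doing the same overall job.
  x.report("map") do
    100.times { array.map { |el| el.even? } }
  end
end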
But even if you fix that bug in your benchmark, generally speaking, more specialized operations have more information available than general ones, so the more general operations typically cannot be faster than the specialized ones.
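As a rough illustration of what that extra information can buy: Array#map knows up front that the result has exactly as many elements as the receiver, whereas a hand-rolled each with << has to keep growing an empty array. Here is a sketch of handing that size information to the general version (illustrative only; each_with_index adds its own block overhead, so don't read too much into the absolute numbers):
require 'benchmark'

array = (1..100000).to_a

# Growing an empty array with << forces repeated capacity resizes.
puts Benchmark.measure {
  100.times do
    new_array = []
    array.each { |el| new_array << el.even? }
  end
}

# Pre-sizing the destination array removes the resizes, because the final
# size is known up front (the same information Array#map has available).
puts Benchmark.measure {
  100.times do
    new_array = Array.new(array.size)
    array.each_with_index { |el, i| new_array[i] = el.even? }
  end
}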
In your specific example, it may just be something very simple such as: you are using a Ruby implementation that is not very good at optimizing Ruby code (such as YARV, and unlike e.g. TruffleRuby) while at the same time having optimized native implementations of Array#map and Array#select (again, take YARV as an example, which has C implementations for both of those, and is generally not capable of optimizing Ruby code very well).
And lastly, writing correct microbenchmarks is hard. Really, really, really hard. I encourage you to read and understand this entire discussion thread on the mechanical-sympathy mailing list: JMH vs Caliper: reference thread. While it is specifically about Java benchmarking (actually about JVM benchmarking), many of the arguments apply to any modern high-performance OO execution engine such as Rubinius, TruffleRuby, etc., and to a lesser extent also to YARV. Note that most of the discussion is about writing microbenchmark harnesses, not writing microbenchmarks per se, i.e. it is about writing frameworks that allow developers to write correct microbenchmarks without having to know about all of that. But unfortunately, even with the best microbenchmark harnesses (and Ruby's Benchmark is actually not a very good one), you still need a very deep understanding of modern compilers, garbage collectors, execution engines, CPUs, and hardware architectures, as well as statistics.
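For what it's worth, on the Ruby side the benchmark-ips gem is a commonly used harness that at least adds a warmup phase and reports results with an error margin; a minimal sketch (assuming the gem is installed) might look like this. It does not remove the need for the understanding described above, it just measures a bit less naively:
# gem install benchmark-ips
require 'benchmark/ips'

array = (1..100000).to_a

Benchmark.ips do |x|
  x.report("map") { array.map { |el| el.even? } }

  x.report("each + <<") do
    new_array = []
    array.each { |el| new_array << el.even? }
  end

  # Prints iterations per second for each report, including warmup,
  # and a relative comparison between them.
  x.compare!
end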
Here is a good example of a failed benchmark that may not be obvious to the untrained benchmark writer: Why is printing “B” dramatically slower than printing “#”?.
Upvotes: 5
Reputation: 81
Well, in the second case of both examples, there's an assignment (building up new_array) during each iteration. The first one isn't assigning anything.
Upvotes: 0