Reputation: 2444
I am curious to know how sort_by
and sort_by!
are implemented. I looked at some of the implementation code, but wasn't able to parse through the immense number of macros (that in turn yielded more macros) to figure out what was actually happening.
Here are the results from my benchmarking:
sort_by -a[:bar] 8.830000 0.010000 8.840000 ( 8.847953)
sort_by a[:bar]*-1 8.620000 0.010000 8.630000 ( 8.628056)
sort_by(&:bar).reverse! 8.550000 0.000000 8.550000 ( 8.558410)
vs:
sort_by! -a[:bar] 2.800000 0.000000 2.800000 ( 2.800778)
sort_by! a[:bar]*-1 2.690000 0.000000 2.690000 ( 2.692756)
sort_by!(&:bar).reverse! 2.470000 0.010000 2.480000 ( 2.480710)
I'm curious to know why there's such a significant difference. One hypothesis I had is around memory allocation that sort_by
has to do. But here's my benchmarking code, you can see that it's an array I'm sorting (i.e. allocation can happen once, array size is known).
#!/usr/bin/ruby
require 'benchmark'
foo = []
for i in 0..10000
foo << {:bar => rand(10000)}
end
Benchmark.bm(20) do |b|
b.report("sort_by! -a[:bar]") { 1000.times { foo.sort_by!{ |a| -a[:bar] } } }
b.report("sort_by! a[:bar]*-1") { 1000.times { foo.sort_by!{ |a| a[:bar]*-1 } } }
b.report("sort_by!(&:bar).reverse!") { 1000.times { foo.sort_by!{ |a| a[:bar] }.reverse! } }
end
Upvotes: 1
Views: 104
Reputation: 160551
Your benchmarks don't reflect the outputs, plus they are in need of simplification. When you test keep it simple and test only what you want to know about:
require 'benchmark'
BAR = (1..10_000).to_a.shuffle
n = 1000
Benchmark.bmbm(8) do |b|
b.report("sort_by") { n.times { foo = BAR.dup; foo = foo.sort_by{ |a| a }; foo } }
b.report("sort_by!") { n.times { foo = BAR.dup; foo.sort_by!{ |a| a }; foo } }
end
Resulting in:
Rehearsal --------------------------------------------
sort_by 6.350000 0.010000 6.360000 ( 6.364429)
sort_by! 6.620000 0.000000 6.620000 ( 6.615428)
---------------------------------- total: 12.980000sec
user system total real
sort_by 6.350000 0.000000 6.350000 ( 6.353188)
sort_by! 6.840000 0.010000 6.850000 ( 6.842521)
The environment probably hadn't stabilized in your tests, plus you were throwing in changes that confuse the tests.
It's important to duplicate everything, with only one thing different, to test just the thing you want to know about. sort_by!
mutates the array. sort_by
returns a new array so, to have the same result, you should assign the result to the array also.
This test bears repeating:
require 'benchmark'
BAR = (1..10_000).to_a.shuffle
n = 1000
Benchmark.bmbm(12) do |b|
b.report("sort (no block)") { n.times { foo = BAR.dup; foo = foo.sort; foo } }
b.report("sort! (no block)") { n.times { foo = BAR.dup; foo.sort!; foo } }
b.report("sort (block)") { n.times { foo = BAR.dup; foo = foo.sort{ |a, b| a <=> b }; foo } }
b.report("sort! (block)") { n.times { foo = BAR.dup; foo.sort!{ |a, b| a <=> b }; foo } }
b.report("sort_by") { n.times { foo = BAR.dup; foo = foo.sort_by{ |a| a }; foo } }
b.report("sort_by!") { n.times { foo = BAR.dup; foo.sort_by!{ |a| a }; foo } }
end
Resulting in:
Rehearsal ----------------------------------------------------
sort (no block) 1.250000 0.010000 1.260000 ( 1.253412)
sort! (no block) 1.240000 0.010000 1.250000 ( 1.254230)
sort (block) 12.380000 0.010000 12.390000 ( 12.378503)
sort! (block) 12.390000 0.000000 12.390000 ( 12.399870)
sort_by 6.410000 0.010000 6.420000 ( 6.408380)
sort_by! 6.720000 0.000000 6.720000 ( 6.727324)
------------------------------------------ total: 40.430000sec
user system total real
sort (no block) 1.240000 0.010000 1.250000 ( 1.249624)
sort! (no block) 1.230000 0.020000 1.250000 ( 1.241353)
sort (block) 12.320000 0.010000 12.330000 ( 12.341552)
sort! (block) 12.390000 0.010000 12.400000 ( 12.397626)
sort_by 6.410000 0.010000 6.420000 ( 6.411413)
sort_by! 6.940000 0.000000 6.940000 ( 6.943647)
There are two "take-aways" from this:
sort
outruns sort_by
when comparing basic objects without a block so use the sort
family without a block if you can.sort_by
family.Upvotes: 1
Reputation: 80065
I am guessing here: sort_by!
is defined on Array and thus can be optimized for arrays, while sort_by
is defined on Enumerable, so it has to be able to handle all kinds of enumerables.
Upvotes: 0