I have the following algorithm (it searches for palindromes whose binary representation is also a palindrome): #!/usr/bin/env ruby class Runner attr_accessor :min, :max, :sum def initialize(min, max) @sum = 0 @min, @max = min, max end def run current = @min while current <= @max string = current.to_s if string == string.reverse && current.to_s(2) == current.to_s(2).reverse @sum += current end current += 1 end puts @sum end end t1 = Time.now Runner.new(*ARGV.first.split(/\.+/).map(&:to_i)).run puts "Time spent: #{(Time.now - t1).round(2)} s." I run it for a 10..10_000_000 range. The version I provided above results in: Time spent: 2.55 s. But if I change while current <= @max string = current.to_s if string == string.reverse && current.to_s(2) == current.to_s(2).reverse to while current <= @max string = current.to_s string_b = current.to_s(2) if string == string.reverse && string_b == string_b.reverse then I get Time spent: 6.25 s. While I would expect it to be faster (less calculations required) it is twice slower. At the same time #to_s without an argument works as expected and if I remove the variable and perform calculations twice: while current <= @max if current.to_s == current.to_s.reverse && current.to_s(2) == current.to_s(2).reverse then it works slower Time spent: 4.17 s. Looking at the source of #to_s doesn't help to understand it except the problem is related to the else branch of condition. But I can't find an issue in rb_scan_args or NUM2INT either. Can anyone explain why it works this way (I use MRI 2.1.1p76)?

Reputation: 2677

Fixnum#to_s performance issue

I have the following algorithm (it searches for palindromes whose binary representation is also a palindrome):

#!/usr/bin/env ruby
class Runner
  attr_accessor :min, :max, :sum

  def initialize(min, max)
    @sum = 0
    @min, @max = min, max
  end

  def run
    current = @min
    while current <= @max
      string = current.to_s
      if string == string.reverse && current.to_s(2) == current.to_s(2).reverse
        @sum += current
      end
      current += 1
    end
    puts @sum
  end
end

t1 = Time.now
Runner.new(*ARGV.first.split(/\.+/).map(&:to_i)).run
puts "Time spent: #{(Time.now - t1).round(2)} s."

I run it for a 10..10_000_000 range. The version I provided above results in:

Time spent: 2.55 s.

But if I change

    while current <= @max
      string = current.to_s
      if string == string.reverse && current.to_s(2) == current.to_s(2).reverse

    while current <= @max
      string = current.to_s
      string_b = current.to_s(2)
      if string == string.reverse && string_b == string_b.reverse

then I get

Time spent: 6.25 s.

While I would expect it to be faster (less calculations required) it is twice slower. At the same time #to_s without an argument works as expected and if I remove the variable and perform calculations twice:

    while current <= @max
      if current.to_s == current.to_s.reverse && current.to_s(2) == current.to_s(2).reverse

then it works slower

Time spent: 4.17 s.

Looking at the source of #to_s doesn't help to understand it except the problem is related to the else branch of condition. But I can't find an issue in rb_scan_args or NUM2INT either. Can anyone explain why it works this way (I use MRI 2.1.1p76)?

Upvotes: 2

Answers (3)

ianks

Reputation: 1768

Analyzing this code in a profiler, it seems the the majority of the time is spent inside of the fix_to_s() C call.

fix_to_s

Jumping into the assembly code of fix_to_s() we see that is spends the majority of its time on a single instruction movslq %edx, %rdx. Apparently, movslq is moves a long (32bit) into a quad (64bit).

movslq

In the version of your code with string2 being memoized, we have 3027 samples of the movslq, while in the non-memoized version we have only 700 samples! What gives?

It seems that in the memoized version, MORE string objects are being allocated, and thus freed by the ruby interpreter. Notice the time spent creating and freeing string objects.

allocating-strings

I ran some stats on object allocation using the allocation_stats gem, and they show that there are INDEED less strings being allocated in the non-memoized version. About 2/3 the amount.

test.rb String 20032954

vs.

test.rb String 30010967

When you explicitly set string2 = current.to_s(2) they have to allocate memory for you, as you may need it. However, in your code it will not necessarily ever have to execute the 2nd part of the && statement, so you avoid that allocation altogether.

Obviously, increasing the amount of allocations will slow things down in a number ways, including make cache misses more likely, more garbage collection, and more free()ing and malloc()ing which explains why the relationship of allocations/time is not linear.

Upvotes: 2

Todd Agulnick

Reputation: 1995

In Ruby when you have an expression like

a && b

a is evaluated first and, if false, b is not evaluated.

Base10 palindromes are rare, so for the vast majority of your cycles through the loop, the original code checks only that string == string.reverse is false and then continues to the next iteration of the loop. Thus you only do the to_s once and the pair of to_s(2) almost never.

The other sets of changes all call to_s more frequently, and thus do more work and consequently are slower. Doesn't seem so surprising to me.

Upvotes: 3

Ry-

Reputation: 225075

In your original version, current.to_s(2) is only calculated if string == string.reverse, because && short-circuits. The equivalent would be:

while current <= @max
  string = current.to_s
  if string == string.reverse
    string_b = current.to_s(2)
    if string_b == string_b.reverse

Upvotes: 3

Fixnum#to_s performance issue

Answers (3)

Related Questions