Automatico
Automatico

Reputation: 12916

Looping through bits in an integer, ruby

I am making a program where one of the problems is that I need to do some analysis of the bit pattern in some integers.

Because of this I would like to be able to do something like this:

#Does **NOT** work:
num.each_bit do |i|
   #do something with i
end

I was able to make something that works, by doing:

num.to_s(2).each_char do |c|
   #do something with c as a char
end

This however does not have the performance I would like.

I have found that you can do this:

0.upto(num/2) do |i|
   #do something with n[i]
end

This have even worse performance than the each_char method

This loop is going to be executed millions of times, or more, so I would like it to be as fast as possible.

For reference, here is the entirety of the function

@@aHashMap = Hash.new(-1)

#The method finds the length of the longes continuous chain of ones, minus one 
#(101110 = 2, 11 = 1, 101010101 = 0, 10111110 = 4)

def afunc(n) 
if @@aHashMap[n] != -1
    return @@aHashMap[n]
end

num = 0
tempnum = 0
prev = false

(n.to_s(2)).each_char do |i|
    if i
        if prev
            tempnum += 1
            if tempnum > num
                num = tempnum
            end
        else
            prev = true
        end
    else
        prev = false
        tempnum = 0
    end
end

@@aHashMap[n] = num
return num
end

Upvotes: 12

Views: 5063

Answers (8)

Cary Swoveland
Cary Swoveland

Reputation: 110685

Here are a couple more straightforward methods (though I expect @Steven's answer and my other answer would be more efficient).

#1

def max_string_of_ones(n)
  max_length = 0
  cand = 0
  (0..n.bit_length).reduce(0) do |max_length, i|
    if n[i] == 1
      cand += 1
    else
      max_length = cand if cand > max_length
      cand = 0
    end
    max_length
  end
end

Note n[n.bit_length] #=> 0.

#2

This second method uses Ruby's flip-flop operator. In addition, I thought, "Wouldn't it be handy if Integer had an each_bit method that returned an enumerator?", so I added one.

class Integer
  def each_bit
    Enumerator.new do |yielder|
      if block_given?      
        bit_length.times { |i| yielder << yield(self[i]) }
      else
        bit_length.times { |i| yielder << self[i] }
      end
    end
  end
end

def max_string_of_ones(n)
  n.each_bit.slice_before { |b| true if b==0 .. b==1 }.max_by(&:size).size
end

max_string_of_ones(0b1100011101111011100)
  #=> 4

Note the following intermediate calculation.

def max_string_of_ones(n)
  n.each_bit.slice_before { |b| true if b==0 .. b==1 }
end

enum = max_string_of_ones(0b1100011101111011100)
  #=> #<Enumerator: #<Enumerator::Generator:0x00000000019a2f80>:each>
enum.to_a
  #=> [[0], [0], [1, 1, 1], [0], [1, 1, 1, 1], [0],
  #    [1, 1, 1], [0], [0], [0], [1, 1]]

Upvotes: 0

Cary Swoveland
Cary Swoveland

Reputation: 110685

Algorithm

One might consider using a binary search. I have implemented the following algorithm to determine the length of the longest string of 1-bits in a given non-negative integer n. The computational complexity is O(n), but for many values of n it will approach O(log2(n)).

The steps of the algorithm are below, but the reader many find them easier to follow by reading the following section ("Illustration") first.

  1. Set longest to 0.
  2. Set len equal to the number of bits in n (len = n.bit_length).
  3. If len <= 2, return the answer (0, 1 or 2).
  4. Determine the index mid of the middle bit of n (mid = len/2 or mid = len >> 1).
  5. Set ri = mid+1 and li = mid-1.
  6. If the value of the middle bit (n[mid]) is zero go to Step 10.
  7. (n[mid] = 1 to reach this step.) Determine the largest index li >= mid and the smallest index ri <= mid such that n[i] = 1 for all ri <= i <= li.
  8. Set longest = [longest, ri-li+1].max, li += 1 and ri -= 1.
  9. If li == len go to Step 10; else set ln equal to the number comprised of bits at indices li (least significant) to len-1 (most significant). Recursively set to n to ln and go to step 2. This returns the longest string of 1-bits in the number ln (cand). Set longest = [longest, cand].max.
  10. If ri < 0 go to Step 11; else set rn equal to the number comprised of bits at indices 0 (least significant) to ri (most significant). Recursively set to n to rn and go to step 2. This returns the longest string of 1-bits in thet number rn (cand). Setlongest = [longest, cand].max`.
  11. Return longest in the recursion.

Illustration

Suppose

n = 0b1010011011101011110
  #=> 341854

Then

len = n.bit_length
  #=> 19
mid = len >> 1
  #=> 9
n[mid]
  #=> 1
longest = 0

We may picture n as follows

101001101 1 101011110

where the middle bit 1 stands out. Since it is a 1, we look left and right for sequences of 1's. We obtain the following.

10100110 111 01011110

As we have found three 1's, we update longest.

longest = [longest, 3].max
  #=> [0, 3].max => 3

We must now examine 0b10100110 and 0b1011110 (the leading zero in the second number drops out).

For the left number we compute the following.

n = 0b10100110
len = n.bit_length
  #=> 8
mid = len >> 1
  #=> 4
n[mid]
  #=> 0

so we have

101 0 0110

(Note n[0] is the least-significant bit). We can rule out both 0b101 and 0b110, since both have 3 bits, which does not exceed the current valule of longest, which is 3.

Now we consider the right half,

n = 0b1011110
len = n.bit_length
  #=> 7
mid = len >> 1
  #=> 3
n[mid]
  #=>1

so we have

101 1 110

As n[mid] #=> 1, we look left and right for strings of 1's and obtain

10 1111 0

As we have discovered a sting of 4 1's, we set

longest = [longest, 4].max
  #=> [3, 4].max = 4

Lastly we see that the numbers of bits on the left (2) and on the right (1) are both less than 3, so we are finished, and return longest #=> 4. (The OP actually wants longest - 1 #=> 3, which we regard as a side-calculation.)

Code

def longest_run(n, longest=0)
  len = n.bit_length
  return [longest, (n & 1) + (n >> 1)].max if len <= 2
  mid = len >> 1
  ri = mid-1
  li = mid+1
  if n[mid] == 1
    until n[ri] == 0
      ri -= 1
    end
    until n[li] == 0
      li += 1
    end
    cand = li-ri-1
    longest = cand if cand > longest
  end
  if ri >= 0
    shift = ri+1
    m = n ^ ((n >> shift) << shift)
    if m.bit_length > longest 
      cand = longest_run(m, longest) 
      longest = cand if cand > longest
    end
  end
  if li <= len - 1
    m = n >> li
    if m.bit_length > longest 
      cand = longest_run(m, longest) 
      longest = cand if cand > longest
    end
  end
  longest
end

Examples

longest_run 341854
  #=> 4
longest_run 0b1011110011111000000111100011101111011110
  #=> 5
longest_run 0b101111001111100000011110001110111111011111
  #=> 6
longest_run 2**500_000-1
  #=> 500000
longest_run ("10"*100_000).to_i(2)
  #=> 1

Upvotes: 0

Stefan
Stefan

Reputation: 114188

To determine the length of the longest sequence of consecutive 1's, this is more efficient:

def longest_one_chain(n)
  c = 0
  while n != 0
    n &= n >> 1
    c += 1
  end
  c
end

The method simply counts how many times you can "bitwise AND" the number with itself shifted 1 bit to the right until it is zero.

Example:

                 ______ <-- longest chain
    01011011100001111110011110101010 c=0
AND  0101101110000111111001111010101
        1001100000111110001110000000 c=1, 1’s deleted
AND      100110000011111000111000000
            100000011110000110000000 c=2, 11’s deleted
AND          10000001111000011000000
                    1110000010000000 c=3, 111’s deleted
AND                  111000001000000
                     110000000000000 c=4, 1111’s deleted
AND                   11000000000000
                      10000000000000 c=5, 11111’s deleted
AND                    1000000000000
                                   0 c=6, 111111’s deleted

Upvotes: 11

tadman
tadman

Reputation: 211610

Sometimes using strings is the most obvious method, and the performance is tolerable:

def oneseq(n)
  n.to_s(2).split(/0+/).sort_by(&:length).last.to_s.length
end

Upvotes: 1

J. Holmes
J. Holmes

Reputation: 18546

If you are looking for performance, then building a look-up table will probably be the most performant way. Especially if you are doing these in a tight loop:

class BitCounter
    def initialize
        @lookup_table = (0..65535).map { |d| count_bits(d) }
    end

    def count(val)
        a,b,c = @lookup_table[val & 65535]
        d,e,f = @lookup_table[val >> 16]
        [a,b,c+d,e,f].max
    end

private

    def count_bits(val)
        lsb = lsb_bits(val)
        msb = msb_bits(val)
        [lsb, inner_bits(val, lsb, msb), msb]
    end

    def lsb_bits(val)
        len = 0
        while (val & 1 == 1) do
            val >>= 1
            len += 1
        end
        len
    end

    def msb_bits(val)
        len = 0
        while (val & (1<<15) == (1<<15)) do
            val <<= 1
            len += 1
        end
        len
    end

    def inner_bits(val, lsb, msb)
        lens = []
        ndx = lsb

        len = 0
        (lsb+1..(15-msb)).each do |x|
            if ((val & (1<<x)) == 0)
                if(len > 0)
                    lens << len
                    len = 0
                end
            else
                len += 1
            end
        end
        lens.max || 0
    end
end

And then an example:

counter = BitCounter.new
p counter.count 0b01011011100001111110011110101010  // 6

This basically creates a loopup table for all 16 bit values, and then calculates the largest result from those cached values.

You could even combine the more expressive form of n.to_s(2).scan(/1+/).sort.last.length - 1 rather than doing bitwise logic in your table initialization, since it is no longer the bottleneck point -- although I would stick with bitwise math just for clarity of expression rather than string parsing. Each look up only costs 2 table look ups, one addition and a max

Upvotes: 1

J&#246;rg W Mittag
J&#246;rg W Mittag

Reputation: 369468

In Ruby, Integers (i.e. both Bignums and Fixnums) can already be indexed as if they were bit arrays. They are, however, not Enumerable.

But you can fix that, of course:

class Integer
  include Enumerable

  def each
    return to_enum unless block_given?      
    (size*8).times {|i| yield self[i] }
  end
end

A slightly less intrusive way might be to represent the Integer as an array:

class Integer
  def to_a
    Array.new(size*8, &method(:[]))
  end
end

Then you can use Ruby's nifty Enumerable methods:

0b10111110.chunk {|b| true if b == 1 }.map(&:last).max_by(&:size).size - 1

(Or 0b10111110.to_a.chunk … if you prefer the less intrusive method.)

If you are worried about performance, the execution engine you choose makes a big difference. Rubinius's or JRuby's optimizing compiler may be able to inline and optimize away many method calls that YARV's rather simple compiler can't, for example. YARV's special treatment of Fixnum may give it an advantage over MRI.

As you can see from the examples, I am a big fan of point-free style and functional programming. If you can prove via profiling that you have a bottleneck at a specific point in the code, you may need to replace it with a slightly less elegant or impure version, or you may want to hand-fuse the map and max_by.

class Integer
  def to_a
    Array.new(size*8) {|i| self[i] }
  end
end

0b10111110.chunk {|b| true if 1 == b }.map {|key, chunk| chunk.size }.max - 1

or

0b10111110.chunk {|b| true if 1 == b }.max_by {|key, chunk| chunk.size }.last.size - 1

Upvotes: 3

Meier
Meier

Reputation: 3880

Be aware that o and "0" all have a boolean value of true in ruby, so "if i" will not give the result you intended.

Converting each number to a string is of course something one should avoid.

Fixnum has a method [] to access bits of the number, so this has the chance to be faster.

If you have tried this with

0.upto(num/2) do |i|
   #do something with n[i]
end

then num/2 is probably much too big, so you loop much too often.

For 32 bit integers you should use

0.upto(31) do |i|
   if n[i] == 1
     ...
   end
end

Upvotes: 3

pguardiario
pguardiario

Reputation: 54984

Ruby might not be a good choice for your project. The strength of ruby is not it's performance but that it lets you do things like:

n.to_s(2).scan(/1+/).sort.last.length - 1

instead of writing mountains of code. Really just about any other language is likely to perform better if you don't mind writing complex code (which you don't seem to).

Upvotes: 4

Related Questions