Max Williams
Max Williams

Reputation: 32945

Algorithm to spread selection over a fixed size array

It's not specifically a ruby problem: more of a general question about algorithms. But there might be some ruby-specific array methods which are helpful.

I have an array with 30 items. I ask for a number of items between 15 and 30, and I want to select a given number of items from the whole array as evenly distributed as possible. The selection needs to be non-random, returning the same result every time.

Let's say someone asks for 16 items. If I return the first 16, that would be a massive fail. Instead, I could return all the odd numbered ones plus the last one; If I had the numbers 1 to 30 stored in the array, I could give back

myArr.spread(16)
=> [1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,30]

If someone asks for 20 items, its a bit tricker: I can't immediately think of a nice programmatic way of doing this. I feel like it must have been solved already by someone. Any suggestions?

Upvotes: 2

Views: 1922

Answers (5)

marxarelli
marxarelli

Reputation: 101

Having recently implemented this method—although I called it keep—for use in a backup retention application, I thought I'd share my solution. It's similar to Alex D's answer with two major differences in the algorithm:

  1. The "stride" is calculated using (length + (length / n) - 1).to_f / n where n is the number of items desired. Calculating an offset in terms of the number of times n goes into length ensures that the last item is always included.

  2. It uses a modulo operation instead of incrementing: If the element's index divided by the "stride" has a remainder between 0 and 1 (inclusive of 0, exclusive of 1), the element is included in the result. The fact that 0 % x is always 0 ensures that the first element is always returned.

  3. Edge cases, such as when the number of elements is less than the number desired, are accounted for.


class Array
  def keep(n)
    if n < 1
      []
    elsif length <= n
      self.clone
    else
      stride = (length + (length / n) - 1).to_f / n
      select.with_index do |_, i|
        remainder = i % stride
        (0 <= remainder && remainder < 1)
      end
    end
  end
end

Upvotes: 2

Max Williams
Max Williams

Reputation: 32945

I ended up doing this, inspired by Alex D: i step through n-1 times and then always add the last element to the end.

class Array
  def spread(n)
    step = self.length.to_f / (n -1) 
    (0..(n-2)).to_a.collect{|i| self[i * step]} + [self.last]
  end
end

 > (1..30).to_a.spread(3)
 => [1, 16, 30] 
 > (1..30).to_a.spread(4)
 => [1, 11, 21, 30] 
 > (1..30).to_a.spread(5)
 => [1, 8, 16, 23, 30] 
 > (1..30).to_a.spread(15)
 => [1, 3, 5, 7, 9, 11, 13, 16, 18, 20, 22, 24, 26, 28, 30] 

Upvotes: 3

Justin Ko
Justin Ko

Reputation: 46836

There was a similar question for this here, but the solution is in python.

In Ruby, it would be:

class Array
    def spread( count)
        length = self.length
        result = Array.new
        0.upto(count-1) do |i|
            result << self[(i * length.to_f / count).ceil]
        end
        return result
    end
end

arr = Array(1..30)
puts arr.spread(20)
#=> [1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28, 30]

Upvotes: 1

Alex D
Alex D

Reputation: 30455

Divide the size of the array by the number of items you want to select (DON'T use truncating division) -- this will be your "stride" as you walk over the array, selecting items. Keep adding the "stride" to a running total until it equals or exceeds the size of the array. Each time you add the "stride", take the integral part and use it as an index into the array to select an item.

Say you have 100 items and you want to select 30. Then your "stride" will be 3.3333... so you start with a "running total" of 3.3333, and select item 3. Then 6.66666 -- so you select item 6. Next is 10.0 -- so you select item 10. And so on...

Test to make sure you don't get "off by one" errors, and also that you don't divide by zero if the array size or number of items to select is zero. Also use a guard clause to ensure that the number of items to select is not greater than the number in the array.

Upvotes: 1

Baldrick
Baldrick

Reputation: 24340

You could try to use a Random (doc) with a fixed seed:

  • with the Random object, you can pick the elements of the array randomly
  • the fixed seed ensure that every call to the function will generate the list of random numbers.

For example with Array#sample

def spread(arr, count) do
    arr.sample(count, Random.new(0))
end

Upvotes: -1

Related Questions