Nikolay D
Nikolay D

Reputation: 347

Ruby recursive shuffle with rand

I am quite new to Ruby and programming in general, and I am currently trying to study with Chris Pine's book. In chapter 10 there is a task that requires you to return a shuffled version of an array.

I tried the following code for the shuffle method itself, and it worked:

def shuffle array
  recursive_shuffle array, []   
end

def recursive_shuffle unshuffled, shuffled
  if unshuffled.length == 1
    shuffled.push unshuffled[0]
    shuffled.pop                      
    return shuffled
  end

  rand_index = rand(unshuffled.length - 1)
  shuffled.push unshuffled[rand_index]
  unshuffled.delete_at(rand_index)
  recursive_shuffle(unshuffled, shuffled)
end

puts(shuffle(array))

I'm wondering about two things.

First I tried doing it without shuffled.pop and with rand_index = rand(unshuffled.length) instead of length - 1.

However, in the answer the element before the last one was always blank, otherwise it was shuffled and no elements were missing. Then I thought the problem should be in the rand and added the - 1.

At this point the array was, again, shuffled properly, but this time there was always one blank last element of the shuffled array. That's why I added shuffled.pop.

It works fine like this, but I don't understand why it adds a blank element in the first two instances.

Also, if I have an array of 2 elements, its length should be 2 and rand_index should be rand(1), but isn't rand(1) supposed to always be 0? The aforementioned code nevertheless shuffles properly arrays with 2 elements.

I looked at various other topics on Ruby, shuffle, and rand, and couldn't find a similar problem.

Upvotes: 3

Views: 306

Answers (3)

dinjas
dinjas

Reputation: 2125

The first block in your shuffle method:

if unshuffled.length == 1
  shuffled.push unshuffled[0]
  shuffled.pop                      
  return shuffled
end

Doesn't do what you want it to be doing.

If unshuffled.length is 1, then there is only one element in the array, so you can just add it to your shuffled array and return the shuffled array with something like:

if unshuffled.length == 1
  return shuffled.push(unshuffled.pop)
end

or, sticking with your original approach:

if unshuffled.length == 1
  shuffled.push unshuffled[0]                   
  return shuffled
end

Also, calling push pushes a new element onto the end of the array that you're calling push on. E.g. if a = [1, 2], then a.push(3) yields [1, 2, 3].

Calling pop on an array, "pops" the last element off of the array. E.g. if a = [1, 2, 3] and you call a.pop, then a == [1, 2].

Putting these two concepts together, you can see that if you push something onto an array and then pop it right back off, you're not accomplishing much. E.g. if you call a = [1, 2] and then a.push(3), then a looks like [1, 2, 3]. Calling a.pop puts you back at [1, 2].

The other issue is the call to rand (as @pjs mentioned in the comments). rand will return a random integer less than the (non-zero) integer you supply, so there's no need to decrement the length you're passing to it. The rand line can be changed to: rand_index = rand(unshuffled.length - 1)

Integrating this change should leave you with something similar to this:

def recursive_shuffle unshuffled, shuffled
  if unshuffled.length == 1
    return shuffled.push(unshuffled.pop)
  end

  rand_index = rand(unshuffled.length)
  shuffled.push unshuffled[rand_index]
  unshuffled.delete_at(rand_index)
  recursive_shuffle(unshuffled, shuffled)
end

Which should work fine.

This could be shortened further to something like:

def recursive_shuffle(unshuffled, shuffled = [])
  return shuffled.push(unshuffled.pop) if unshuffled.length == 1
  shuffled.push(unshuffled.delete_at(rand(unshuffled.length)))
  recursive_shuffle(unshuffled, shuffled)
end

Or even:

def recursive_shuffle(arr)
  return arr if arr.length == 1
  shuffle = [arr.sample]
  shuffle + recursive_shuffle(arr - shuffle)
end

Upvotes: 2

Cary Swoveland
Cary Swoveland

Reputation: 110685

Here's a more Ruby-like way you could implement your own versions of shuffle. The steps are as follows:

  • create the instance method my_shuffle on the class Array, to mimic the function of the instance method Array#shuffle (but not bother with implementing the alternative way shuffle can be used: shuffle(random: rng) → new_ary--see the doc).
  • operate on a duplicate of self (the array to be shuffled) so that self is not "mutated" (altered).
  • iterate on the enumerator enum = arr.size.times, which generates the sequence of numbers 0,1,...,arr.size-1. The values of the sequence are not used in the block; all that is important is that the block is executed arr.size times.
  • build an array represented by the block variable a by randomly deleting an element of arr and pushing it onto a.

Code

class Array
  def my_shuffle
    arr = self.dup
    arr.size.times.with_object([]) { |_,a| a << arr.delete_at(rand(arr.size)) }
  end
end

Example

[1,3,2,7,4,1].my_shuffle
  #=> [2, 3, 4, 1, 1, 7]

Explanation

Firstly, I've used Enumerator#each_object merely to save two steps (a = [] and a below). It produces the same results as the following:

  def my_shuffle
    arr = self.dup
    a = []
    arr.size.times { a << arr.delete_at(rand(arr.size)) }
    a
  end

The array to be shuffled is:

ar = [1,3,2,7,4,1]

then

arr = ar.dup
  #=> [1, 3, 2, 7, 4, 1] 
n = arr.size
  #=> 6 
enum0 = n.times
  #=> #<Enumerator: 6:times> 
enum1 = enum0.with_object([])
  #=> #<Enumerator: #<Enumerator: 6:times>:with_object([])>

Carefully examine the return value of enum1. It may be thought of as a "compound enumerator". We can see the elements that will generated by enum1 by converting it to an array:

enum1.to_a
  #=> [[0, []], [1, []], [2, []], [3, []], [4, []], [5, []]]

The second element of each array corresponds to the block variable a. Initially it is an empty array because Enumerable#each_with_object's argument is [].

We can send Enumerator#next to enum1 to generate the first element of the enumerator and set the block variables equal to that value:

b,a = enum1.next
  #=> [0, []] 

The values of b and a are obtained by the use of parallel assignment (aka multiple assignment):

b #=> 0 
a #=> []

We can then perform the block calculation:

c = arr.delete_at(rand(ar.size))
  #=> arr.delete_at(rand(6))
  #=> arr.delete_at(4) 
  #=> 4 
a << c
  #=> [4]

a is now [4], arr equals [1, 3, 2, 7, 1] and ar is unchanged.

As is customary, I have replaced the block variable b with an underscore because it is not used in the block calculation.

Next we have

b,a = enum1.next
  #=> [1, [4]] 
c = arr.delete_at(rand(ar.size))
  #=> arr.delete_at(rand(5))
  #=> arr.delete_at(0) 
  #=> 1 
a << c
  #=> [4, 1] 
arr
  #=> [3, 2, 7, 1]

The remaining calculations are similar.

Upvotes: 2

pjs
pjs

Reputation: 19855

If your goal to get the array shuffled you should use the builtin shuffle method, it's both correct and faster than anything you'll write.

If your goal is to implement your own version, this is not well-suited to recursion and you should implement a Fisher-Yates shuffle to avoid bias:

def fy_shuffle(arr)
  a_copy = arr.dup  # don't mutate original array!
  a_copy.each_index do |i|
    j = i + rand(a_copy.length - i)
    a_copy[i], a_copy[j] = a_copy[j], a_copy[i] if i != j
  end
end

If your goal is to learn about recursion, dinjas' answer fixes the problems in your original implementation and works as long as the arrays aren't large enough to blow out the stack.

Upvotes: 2

Related Questions