Reputation: 13
New to coding, and trying to study independently. Working on a problem on Ruby, where we create a method (filter_out!) that mutates the original array to remove true elements when the array is passed through a proc. The only stipulation is that we aren't allowed to use Array#reject!.
Can anyone explain to me why
arr.uniq.each { |ele| arr.delete(ele) if prc.call(ele)}
works and
arr.each { |ele| arr.delete(ele) if prc.call(ele)}
does not? Here are two example problems:
arr_2 = [1, 7, 3, 5 ]
filter_out!(arr_2) { |x| x.odd? }
p arr_2 # []
arr_3 = [10, 6, 3, 2, 5 ]
filter_out!(arr_3) { |x| x.even? }
p arr_3 # [3, 5]
I've looked it up and understand #uniq removes duplicate values, right?
Any insight would be greatly appreciated-thank you!
edit: Looked into it more, is it because uniq will use the return value of the proc for comparison? Still not sure why this is needed for some numbers, but not all.
Upvotes: 1
Views: 112
Reputation: 110685
The behaviour you have witnessed has nothing to do with the fact that a proc is present. The problem is that you are invoking a method (Array#each
) on a receiver (the array) that is altered (mutated) in the method's block.
Suppose:
arr = [true, false, true]
arr.each do |x| puts "x = #{x}, arr = #{arr}"
arr.delete(x)
puts "arr = #{arr}"
end
#=> [false]
The following is printed:
x = true, arr = [true, false, true]
arr = [false]
You were perhaps expecting to see an empty array returned, not [false]
.
As you see from the printed output, only arr[0] #=> true
was passed to each
's block. That caused arr.delete(x)
to delete both instances of true
from arr
, causing arr
to then equal [false]
. Ruby then attempts to pass arr[1]
to the block but finds that arr
has only one element so she concludes she is finished and returns arr #=> [false]
.
Had arr = [true, true, true]
the return value (an empty array) would be correct, but only because all elements are removed when the first element of arr
is passed to the block. This is analogous to the example in the question of removing all odd elements of [1, 7, 3, 5]
.
Even when the code is technically correct it is considered bad practice to invoke a method on a receiver that alters the receiver in its block.
If we first invoke uniq
we are no longer iterating over the array we are modifying, so all is well:
a = arr.uniq
#=> [true, false]
a.each do |x| puts "x = #{x}, arr = #{arr}"
arr.delete(x)
puts "arr = #{arr}"
end
#=> [true, false]
The following is printed:
x = true, arr = [true, false, true]
arr = [false]
x = false, arr = [false]
arr = []
Similarly,
a = arr.dup
#=> [true, false]
a.each do |x| puts "x = #{x}, arr = #{arr}"
arr.delete(x)
puts "arr = #{arr}"
end
#=> [true, false, true]
The following is printed:
x = true, arr = [true, false, true]
arr = [false]
x = false, arr = [false]
arr = []
x = true, arr = []
arr = []
Upvotes: 1
Reputation: 369468
The only stipulation is that we aren't allowed to use Array#reject!.
This is an incredibly braindead stipulation. You could just use Array#select!
and invert the condition, or use Array#reject
together with Array#replace
, for example.
Can anyone explain to me why
arr.uniq.each { |ele| arr.delete(ele) if prc.call(ele)}
works and
arr.each { |ele| arr.delete(ele) if prc.call(ele)}
Whenever you don't quite understand what a piece of code does, a good idea is to just run it yourself with a pen and a piece of paper.
So, let's just do that. Assume arr = [1, 2, 4, 5, 2]
and prc = -> e { e.even? }
.
Array#each
iterates through the array, we don't exactly know how it does it (that is the whole idea of abstraction), but we can imagine that it keeps some kind of index to remember which part of the array it is currently at.
So, during the first iteration of the array, the index is at the first element, and the array looks like this:
[1, 2, 4, 5, 2]
# ↑
Array#each
passes 1
to the block, which in turn passes it to prc
, which returns false
, thus the block doesn't do anything.
Array#each
increases the index, so now our array looks like this:
[1, 2, 4, 5, 2]
# ↑
Array#each
passes 2
to the block, which in turn passes it to prc
, which returns true
. As a result, the block now passes 2
to Array#delete
, which deletes every element from the array that is equal to 2
. So, now the array looks like this:
[1, 4, 5]
# ↑
Array#each
increases the index, so now our array looks like this:
[1, 4, 5]
# ↑
Array#each
passes 5
to the block, which in turn passes it to prc
, which returns false
, thus the block doesn't do anything.
Array#each
increases the index, so now our array looks like this:
[1, 4, 5]
# ↑
Since index is past the end of the array, the iteration is done, and the result is [1, 4, 5]
.
As you can see, the problem is that you are mutating the array while iterating over it.
I've looked it up and understand #uniq removes duplicate values, right?
Correct, but that has nothing to do with why it makes your code work. Or, more precisely, makes it seem to work because there are actually a lot of other problems with your code as well.
It is not the fact that Array#uniq
removes duplicate values that is relevant, it is the fact that Array#uniq
returns a new array. Since Array#uniq
returns a new array, you are no longer mutating an array while iterating over it, you are mutating one array while iterating over another.
You could have used any method of Array
or one of its ancestors that returns a new array, for example Object#clone
, Object#dup
, Array#map
, or Array#select
, or even something really creative such as arr + []
:
arr.select { true }.each {|ele| arr.delete(ele) if prc.call(ele) }
In fact, you don't even need to return an Array
, you only need to return some kind of Enumerable
, such as an Enumerator
, for example using Array#each
:
arr.each.each {|ele| arr.delete(ele) if prc.call(ele) }
edit: Looked into it more, is it because uniq will use the return value of the proc for comparison? Still not sure why this is needed for some numbers, but not all.
Since you are not passing the proc to Array#uniq
, it cannot possibly use it for anything, so it is clearly impossible for this to be the explanation.
Note that, as I explained above, there are more problems with your code than just mutating an array while iterating over it.
Even if your original code did work, it would actually still be broken. You are using Array#delete
to delete the element for which prc
returns true
. The problem is that Array#delete
deletes all elements in the array that are #==
to the element you are passing as an argument, even the ones for which the prc
might return false
.
Here is a trivial example:
a = [2, 2, 2]
i = -1
filter_out!(a) { (i += 1) == 1 }
This should only filter out the second element, but actually deletes all of them, so the result is []
when it actually should be [2, 2]
.
Your version with Array#uniq
makes this problem even worse, because after running Array#uniq
, the array doesn't even have a second element any more! So, the result is [2]
, when it actually should be [2, 2]
.
The next problem is the name of the method: filter_out!
. Bang methods (i.e. methods whose name ends with an exclamation mark !
) are used to mark the more surprising method of a pair of methods. You should never have a bang method on its own. The name filter_out!
should only be used if and only if there is also a filter_out
method. So, the method should be named filter_out
.
And the last problem is that you are mutating the argument that is passed into the method. That is an absolute no-no. You never, ever mutate an argument that is passed into a method. Never. You never break somebody else's toys, you never touch somebody else's privates.
Mutation, in general, should be avoided as much as possible, since it can lead to confusing and hard to track down bugs, as you so beautifully have discovered yourself. Only if absolutely necessary should you mutate state, and only state that you yourself own. Never mutate somebody else's state. Instead, filter_out
should return a new array with the results.
Here are a couple of examples what filter_out
could look like:
def filter_out(enum)
enum.select {|el| !yield el }
end
def filter_out(enum)
enum.reduce([]) {|acc, el| if yield el then acc else acc + [el] end }
end
def filter_out(enum)
enum.each_with_object([]) {|el, acc| acc << el unless yield el }
end
Note: Personally, I am not a big fan of Enumerable#each_with_object
, because it relies on mutation, so I would avoid that solution as much as possible.
However, the real problem seems to be that whatever course or book or tutorial or class you are following seems to be pretty terrible, both a teaching Ruby, and at teaching good Software Engineering practices, such as testing and debugging.
Upvotes: 3