Reputation: 1325
I am trying to remove all 7
s from an array:
a = [1,2,3,4,5,7,7,7,7,7,7,7,7,7,9,10,11,12,13]
I did:
a.each_with_index do |item, index|
if item == 7
a.delete_at(index)
end
end
a # => [1, 2, 3, 4, 5, 7, 7, 7, 7, 9, 10, 11, 12, 13]
How did this happen?
Upvotes: 4
Views: 224
Reputation: 881503
The fact that only about half (5/9) of your items have disappeared is a dead giveaway that the problem is deleting while iterating over the collection.
The iteration will be processing indexes 1, 2, 3, 4 and so on. If you delete index 2 while processing it, that will shift all later indexes down by one.
So, when you then move on to index 3 in the next iteration, you will skip the original index 3 because that will have been shifted down to index 2.
In other words, let's start with a simpler example with two consecutive items to remove:
index | 0 | 1 | 2 | 3 |
value | 1 | 7 | 7 | 9 |
You check the first index and the value is 1
so you do nothing. You then check the second index and the value is 7
so you delete it, giving:
index | 0 | 1 | 2 |
value | 1 | 7 | 9 |
You then check the third index and the value is 9
so you do nothing. You've also reached the end so it stops.
So you can see there that you actually skipped the second item you wanted to delete, because you shifted things around while iterating. This isn't a Ruby-specific problem, a great many languages have this same issue.
In general, each complete pair of adjacent items will only have the first of the pair deleted while an item on its own (not followed by another of the same value) will be deleted normally. That's why only 5/9
of your 7
s are deleted, one for each of the four pairs and the final standalone one.
The correct way (in Ruby) to delete all items of a single given value is to use the array delete method:
a.delete(7)
You can also use conditional delete for more complex conditions such as deleting everything greater than 7
:
a.delete_if {|val| val > 7}
And, if you really want to do it yourself (as an educational exercise), you just have to realise that the problem is because you process the array in a forward manner - when you do that, changes beyond where you're deleting may cause issues.
If you were to find some way to process the array in a backwards manner, this issue would not occur. Luckily, Ruby has such a beast:
a.to_enum.with_index.reverse_each do |item, index|
That line will process the array in such a way that deletions will not affect future operations. Just be aware that deleting while iterating can still be a problem if the data structure you're processing is not a simple indexed array.
I still warrant that delete
and delete_if
are the correct way to go since they're baked into Ruby already, and therefore incredibly unlikely to have bugs.
Upvotes: 7
Reputation: 164829
In general, deleting from the Array you're iterating over, in any language, is going to get weird. To see why, look at the code for Array#each_index
.
VALUE rb_ary_each_index(VALUE ary)
{
long i;
RETURN_SIZED_ENUMERATOR(ary, 0, 0, ary_enum_length);
for (i=0; i<RARRAY_LEN(ary); i++) {
rb_yield(LONG2NUM(i));
}
return ary;
}
Note that it's really a for
loop from 0 to the length of the array. The loop will just count from 0 to the original length of the array.
When you delete an item from the array, everything after it shifts over 1. If i = 5
and you call a.delete_at(i)
that means a[6]
is now a[5]
. a[7]
is now a[6]
. And so on. Next iteration will have i = 6
which means you effectively skipped an element.
To illustrate, and assuming you want to delete 2.
i = 0
a = [1,2,2,2,3,4,5]
^
-------------------
i = 1
a = [1,2,2,2,3,4,5]
^
a.delete_at(i)
a = [1,2,2,3,4,5]
-------------------
i = 2
a = [1,2,2,3,4,5]
^
a.delete_at(i)
a = [1,2,3,4,5]
-------------------
i = 3
a = [1,2,3,4,5]
^
-------------------
i = 4
a = [1,2,3,4,5]
^
-------------------
i = 5
a = [1,2,3,4,5]
-------------------
i = 6
a = [1,2,3,4,5]
Note that the last two iterations have walked off the array, because the array is now two elements smaller than it was before.
Upvotes: 2
Reputation: 1624
This happens because in each iteration you are updating/deleting the same array which you are looping through.
For this purpose, you should use this delete method:
a = [1,2,3,4,5,7,7,7,7,7,7,7,7,7,9,10,11,12,13]
a.delete(7)
# [1, 2, 3, 4, 5, 9, 10, 11, 12, 13]
You can see the problem by running this program:
a = [1,2,3,4,5,6]
a.each_with_index do |item, index|
puts "#{index} : #{item}"
if item == 4
a.delete_at(index)
end
end
Output:
0 : 1
1 : 2
2 : 3
3 : 4 # made delete here
4 : 6 # See the problem !
Hope it helps :)
Upvotes: 3