Reputation: 5928
I have an array
a = ['A', 'B', 'B', 'C', 'D', 'D']
and I have to go thru all the elements, do something depending on whether the is the last occurance or not, and remove the element after processing it.
The elements are already sorted if that matters.
I'm looking for something efficient. Any suggestions?
Her what I have until now. THIS WORKS AS EXPECTED but not sure it is very efficient.
a = ['A', 'B', 'B', 'C', 'D', 'D']
while !a.empty?
b = a.shift
unless a.count(b) > 0
p "unique #{b}"
else
p "duplicate #{b}"
end
end
and it produces
"unique A"
"duplicate B"
"unique B"
"unique C"
"duplicate D"
"unique D"
Thanks
Upvotes: 1
Views: 102
Reputation: 110675
The OP stated that the elements of a
are sorted, but that is not required by the method I propose. It also maintains array-order, which could be important for the "do something" code performed for each element to be removed. It does so with no performance penalty over the case where the array is already sorted.
For the array
['A', 'B', 'D', 'C', 'B', 'D']
I assume that some code is to be executed for 'A'
, 'C'
the second 'B'
and the second 'D'
, in that order, after which a new array
['B', 'D']
is returned.
Code
def do_something(e) end
def process_last_dup(a)
a.dup.
tap do |b|
b.each_with_index.
reverse_each.
uniq(&:first).
reverse_each { |_,i| do_something(a[i]) }.
each { |_,i| b.delete_at(i) }
end
end
Example
a = ['A', 'B', 'B', 'C', 'D', 'D']
process_last_dup(a)
#=> ["B", "D"]
Explanation
The steps are as follows.
b = a.dup
#=> ["A", "B", "B", "C", "D", "D"]
c = b.each_with_index
#=> #<Enumerator: ["A", "B", "B", "C", "D", "D"]:each_with_index>
d = c.reverse_each
#=> #<Enumerator: #<Enumerator: ["A",..., "D"]:each_with_index>:reverse_each>
Notice that d
can be thought of as a "compound" enumerator. We can convert it to an array to see the elements it will generate and pass to uniq
.
d.to_a
#=> [["D", 5], ["D", 4], ["C", 3], ["B", 2], ["B", 1], ["A", 0]]
Continuing,
e = d.uniq(&:first)
#=> [["D", 5], ["C", 3], ["B", 2], ["A", 0]]
e.reverse_each { |_,i| do_something(a[i]) }
reverse_each
is used so that do_something
is first executed for 'A'
, then for the second 'B'
, and so on.
e.each { |_,i| b.delete_at(i) }
b #=> ["B", "D"]
If a
is to be modified in place replace a.dup.
with a.
.
Readers may have noticed that the code I gave at the beginning used Object#tap so that tap
's block variable b
, which initially equals a.dup
, will be returned after it has been modified within tap
's block, rather than explicitly setting b = a.sup
at the beginning and b
at the end, as I've done in my step-by-step explanation. Both approaches yield the same result, of course.
The doc for Enumerable#uniq does not specify whether the first element is kept, but it does reference Array.uniq, which does keep the first. If there is any uneasiness about that one could always replace reverse_each
with reverse
so that Array.uniq
would be used.
Upvotes: 1
Reputation: 7685
Quite similar to the answer of @GaganGami but using chunk_while
.
a.chunk_while { |a,b| a == b }
.each do |*list,last|
list.each { |e| puts "duplicate #{e}" }
puts "unique #{last}"
end
chunk_while
splits the array into sub arrays when the element changes.
['A', 'B', 'B', 'C', 'D', 'D'].chunk_while { |a,b| a == b }.to_a
# => [["A"], ["B", "B"], ["C"], ["D", "D"]]
Upvotes: 1
Reputation: 10251
Simple way:
array = ["A", "B", "B", "C", "D", "D"]
array.group_by{|e| e}.each do |key,value|
*duplicate, uniq = value
duplicate.map do |e|
puts "Duplicate #{e}"
end
puts "Unique #{uniq}"
end
As per Stefan's comment and suggestion, shorter way is:
array.chunk_while(&:==).each do |*duplicate, uniq|
duplicate.map do |e|
puts "Duplicate #{e}"
end
puts "Unique #{uniq}"
end
# Above both will give the same Output:
---------------------------------------
Unique A
Duplicate B
Unique B
Unique C
Duplicate D
Unique D
Upvotes: 4
Reputation: 60143
Based on your code and expected output, I think this is an efficient way to do what you're looking for:
a = ['A', 'B', 'B', 'C', 'D', 'D']
a.each_index do |i|
if i < a.length - 1 && a[i+1] == a[i]
puts "This is not the last occurrence of #{a[i]}"
else
puts "This is the last occurrence of #{a[i]}"
end
end
# Output:
# This is the last occurrence of A
# This is not the last occurrence of B
# This is the last occurrence of B
# This is the last occurrence of C
# This is not the last occurrence of D
# This is the last occurrence of D
But I want to reiterate the importance of the wording in my output versus yours. This is not about whether the value is unique or not in the input. It seems to be about whether the value is the last occurrence within the input or not.
Upvotes: 1