Robin Wood

Reputation: 41

Removing duplicates as well as the corresponding values from array in Ruby

I'm using Ruby 1.9.3 and I want to remove values from an array that appear more than once. I have the following:

arr = [1,2,2,3,4,5,6,6,7,8,9]

and the result should be:

arr = [1,3,4,5,7,8,9]

What would be the simplest, shortest Ruby code to accomplish this?

Upvotes: 0

Views: 90

Answers (5)

Cary Swoveland

Reputation: 110755

I would be inclined to use a counting hash.

Code

def single_instances(arr) 
  arr.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }.
      select { |_,v| v == 1 }.
      keys
end

Example

single_instances [1,2,2,3,4,5,6,6,7,8,9]
  #=> [1, 3, 4, 5, 7, 8, 9]

Explanation

The steps are as follows.

arr = [1,2,2,3,4,5,6,6,7,8,9]

f = Hash.new(0)
  #=> {}

f is created with the method Hash::new with an argument of zero. That means that if f does not have a key k, f[k] returns zero (and does not alter f).
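That default-value behaviour can be checked with a minimal sketch. The method names read_missing_key and increment_missing_key are hypothetical and exist only for illustration.

```ruby
# Hypothetical helper: reads a missing key from a zero-default hash and
# reports the value returned together with the hash's size afterwards.
def read_missing_key
  counts = Hash.new(0)
  value = counts[:absent]   # falls back to the default, 0
  [value, counts.size]      # the read did not store the key
end

# Hypothetical helper: increments a missing key. += calls Hash#[]=,
# so this time the key is stored.
def increment_missing_key
  counts = Hash.new(0)
  counts[:absent] += 1
  counts
end

read_missing_key                  #=> [0, 0]
increment_missing_key[:absent]    #=> 1
```

With a plain {} the read would return nil instead, and the += would raise NoMethodError.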

enum = arr.each_with_object(f)
  #=> #<Enumerator: [1, 2, 2, 3, 4, 5, 6, 6, 7, 8, 9]:each_with_object({})>
h = enum.each { |e,h| h[e] += 1 }
  #=> {1=>1, 2=>2, 3=>1, 4=>1, 5=>1, 6=>2, 7=>1, 8=>1, 9=>1}
g = h.select { |_,v| v == 1 }
  #=> {1=>1, 3=>1, 4=>1, 5=>1, 7=>1, 8=>1, 9=>1}
g.keys
  #=> [1, 3, 4, 5, 7, 8, 9]

In calculating g, Hash#select (which returns a hash), not Enumerable#select (which returns an array), is executed. I've used an underscore for the first block variable (a key in h) to signify that it is not used in the block calculation.
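The difference between the two select methods can be seen in a small sketch. The helper names singles_as_hash and singles_as_pairs are made up for this illustration.

```ruby
# Hypothetical helper: Hash#select keeps the matching key-value pairs in a hash.
def singles_as_hash(counts)
  counts.select { |_, v| v == 1 }
end

# Hypothetical helper: converting to an array of [key, value] pairs first
# means Array#select is used, which returns an array of pairs.
def singles_as_pairs(counts)
  counts.to_a.select { |_, v| v == 1 }
end

counts = { 1 => 1, 2 => 2, 3 => 1 }
singles_as_hash(counts).keys    #=> [1, 3]
singles_as_pairs(counts)        #=> [[1, 1], [3, 1]]
```

Because Hash#select returns a hash, keys can be called directly on its result, as in single_instances.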

Let's look more carefully at the calculation of h. The first value is generated by the enumerator enum and passed to the block, and the block variables are assigned values using a process called disambiguation or decomposition.

e, h = enum.next
  #=> [1, {}]
e #=> 1
h #=> {}

so the block calculation is

h[e] += 1
  #=> h[e] = h[e] + 1 => 0 + 1 => 1

h[e] on the right side of the equality (using the method Hash#[], as contrasted with Hash#[]= on the left side of the equality) returns zero, the hash's default value, because h has no key e #=> 1.

The next two elements of enum are passed to the block and the following calculations are performed.

e, h = enum.next
  #=> [2, {1=>1}]
h[e] += 1
  #=> h[e] = h[2] + 1 => 0 + 1 => 1

Notice that h has been updated.

e, h = enum.next
  #=> [2, {1=>1, 2=>1}]
h[e] += 1
  #=> h[e] = h[e] + 1 => h[2] + 1 => 1 + 1 => 2 
h #=> {1=>1, 2=>2}

This time, because h already has a key e #=> 2, the hash's default value is not used.

The remaining calculations are similar.
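To follow the remaining steps without calling enum.next by hand, the walk-through can be sketched as a helper that records the hash after every element. The name counting_steps is hypothetical.

```ruby
# Hypothetical helper: returns a snapshot of the counting hash after each
# element of arr has been processed.
def counting_steps(arr)
  counts = Hash.new(0)
  arr.map do |e|
    counts[e] += 1
    counts.dup   # copy, so later increments do not change earlier snapshots
  end
end

steps = counting_steps([1, 2, 2])
steps[0]  #=> {1=>1}
steps[1]  #=> {1=>1, 2=>1}
steps[2]  #=> {1=>1, 2=>2}
```

The three snapshots match the values of h shown in the steps above.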

Use Array#difference instead

A simpler way is to use a method Array#difference. It is not built in to Ruby 1.9.3, so I define it as follows.

class Array
  def difference(other)
    h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
    reject { |e| h[e] > 0 && h[e] -= 1 }
  end
end

Suppose

arr = [1,2,2,3,4,2,5,6,6,7,8,9]

Note the addition of a third 2.

arr - arr.difference(arr.uniq)
  # => [1, 3, 4, 5, 7, 8, 9]

The three steps are as follows.

a = arr.uniq
  #=> [1, 2, 3, 4, 5, 6, 7, 8, 9]
b = arr.difference(a)
  #=> [2, 2, 6] (elements that appear more than once)
arr - b
  # => [1, 3, 4, 5, 7, 8, 9]

I've proposed that Array#difference be added to the Ruby core, but there seems to be little interest in doing so.
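To see how this differs from Array#-, here is a minimal sketch written as a standalone method rather than by reopening Array. The name multiset_difference is hypothetical. A separate name also avoids a clash on newer Rubies, where a built-in Array#difference exists and behaves like Array#-.

```ruby
# Hypothetical standalone version of the difference method above: removes
# from arr one occurrence for each occurrence in other, working left to right.
def multiset_difference(arr, other)
  remaining = other.each_with_object(Hash.new(0)) { |e, h| h[e] += 1 }
  arr.reject do |e|
    if remaining[e] > 0
      remaining[e] -= 1   # use up one occurrence from other
      true                # and drop this element
    else
      false               # nothing left to cancel, so keep it
    end
  end
end

arr = [1, 2, 2, 3, 2]
multiset_difference(arr, [2])   #=> [1, 2, 3, 2]
arr - [2]                       #=> [1, 3]
```

Array#- removes every 2, whereas the difference removes only as many as appear in the argument. That is why arr.difference(arr.uniq) leaves exactly the surplus copies.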

Upvotes: 0

Gagan Gami

Reputation: 10251

I want to remove values from an array that appear more than once.

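Below is an example. It uses reject rather than delete_if, because delete_if modifies arr while arr.count is still reading it, which can give wrong results when the duplicates are not adjacent:

> arr.reject{|e| arr.count(e) > 1}
#=> [1, 3, 4, 5, 7, 8, 9]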

Option 2:

> arr.group_by{|e| e}.delete_if{|_,v| v.size > 1}.keys
#=> [1, 3, 4, 5, 7, 8, 9]

First, group the elements by themselves (which returns a hash mapping each value to an array of its occurrences), then remove the entries whose array has more than one element, and take the keys.
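The intermediate hash makes the idea easier to follow. Here is a minimal sketch of the same approach. The method name singles_by_grouping is hypothetical, and it uses the non-destructive reject instead of delete_if.

```ruby
# Hypothetical helper: keeps only the values that occur exactly once.
def singles_by_grouping(arr)
  # Each distinct value maps to the array of its occurrences.
  groups = arr.group_by { |e| e }
  # Drop the groups with more than one occurrence and keep the values.
  groups.reject { |_, v| v.size > 1 }.keys
end

[1, 2, 2].group_by { |e| e }                      #=> {1=>[1], 2=>[2, 2]}
singles_by_grouping([1,2,2,3,4,5,6,6,7,8,9])      #=> [1, 3, 4, 5, 7, 8, 9]
```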

Upvotes: 0

Nitin Srivastava

Reputation: 1424

We can achieve this with the Array methods select and count:

arr.select { |x| arr.count(x) == 1 } #=> [1, 3, 4, 5, 7, 8, 9]

Upvotes: 2

l.g.karolos

Reputation: 1142

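def reject_duplicated(elements)
    encountered = {}
    duplicated = {}

    # Examine all elements in the array.
    elements.each do |e|
        # If the element is in the hash, it is a duplicate.
        if encountered[e]
            # Mark the element for removal.
            duplicated[e] = true
        else
            # Record that the element was encountered.
            encountered[e] = true
        end
    end

    # Return only the elements that were never marked as duplicates.
    elements.reject { |e| duplicated[e] }
end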

Upvotes: 0

Rohit Kumar

Reputation: 71

As @Sergio Tulentsev mentioned, a combination of group_by and select will do the trick. Here you go:

arr.group_by{|i| i}.select{|k, v| v.count.eql?(1)}.keys

Upvotes: 2
