GhostRider

Reputation: 2170

Ruby: array of hashes - how to remove duplicates based on the hash key which is an array

I have an array of hashes in which the key of each hash is an array containing 2 integers. It looks like this:

  [{[6, 8]=>0.5932190854209105}, {[6, 13]=>0.7183325285691291}, {[6, 15]=>0.8253727388780498}, {[8, 6]=>0.5932190854209105}, {[8, 13]=>0.7255537819950661}, {[8, 15]=>0.5249232568337963}, {[13, 6]=>0.7183325285691291}, {[13, 8]=>0.7255537819950661}, {[13, 15]=>0.6348636166265346}, {[15, 6]=>0.8253727388780497}, {[15, 8]=>0.5249232568337963}, {[15, 13]=>0.6348636166265343}]

I need to remove duplicates. In this case a duplicate is defined as a hash whose key already exists, but in reverse order: for example, [6, 15] and [15, 6]. By this definition, half of these are duplicates.

Just to add to this:

This is formed from the following:

 @user_array.each do |u|
   @result << @user_array.map { |p| Hash[[u, p] => kappa(u, p, "ipf")] if p !=u  }
 end

@user_array is an array of integers (user ids). For example:

  [6, 8, 13, 15]

I need to run the kappa helper on each unordered pair. I can't seem to work out how to prevent it "doubling up". I figured that if I could save the pair somehow, I could make comparisons. The only way I knew how to do that was by using a hash. I am fairly new.

EDIT: I tried sort like this:

@user_array.each do |u|
   @result << @user_array.map { |p| Hash[[u, p].sort => kappa(u, p, "ipf")] if p !=u  }
end

But they are separate hashes, so it doesn't work:

 [{[6, 8]=>0.5932190854209105}, {[6, 13]=>0.7183325285691291}, {[6, 15]=>0.8253727388780498}, {[6, 8]=>0.5932190854209105}, {[8, 13]=>0.7255537819950661}, {[8, 15]=>0.5249232568337963}, {[6, 13]=>0.7183325285691291}, {[8, 13]=>0.7255537819950661}, {[13, 15]=>0.6348636166265346}, {[6, 15]=>0.8253727388780497}, {[8, 15]=>0.5249232568337963}, {[13, 15]=>0.6348636166265343}]

It's not that straightforward.

Upvotes: 2

Views: 917

Answers (3)

Cary Swoveland

Reputation: 110675

In a comment the OP has clarified that if an element (hash) of the array is to be retained, and the (only) key of that hash is [a,b], no subsequent hash with key [a,b] or [b,a] is to be retained.

Let arr denote your array of hashes, each having a single key/value pair.

You can use Array#uniq or Array#uniq!, depending on whether arr is to be modified in place.

arr.uniq { |h| h.first.first.sort }
  #=> [{[6, 8]=>0.5932190854209105}, {[6, 13]=>0.7183325285691291},
  #    {[6, 15]=>0.8253727388780498}, {[8, 13]=>0.7255537819950661},
  #    {[8, 15]=>0.5249232568337963}, {[13, 15]=>0.6348636166265346}] 

or, to modify arr in place,

arr.uniq! { |h| h.first.first.sort } || arr
  #=> [{[6, 8]=>0.5932190854209105}, {[6, 13]=>0.7183325285691291},
  #    {[6, 15]=>0.8253727388780498}, {[8, 13]=>0.7255537819950661},
  #    {[8, 15]=>0.5249232568337963}, {[13, 15]=>0.6348636166265346}] 
arr
  #=> [{[6, 8]=>0.5932190854209105}, {[6, 13]=>0.7183325285691291},
  #    {[6, 15]=>0.8253727388780498}, {[8, 13]=>0.7255537819950661},
  #    {[8, 15]=>0.5249232568337963}, {[13, 15]=>0.6348636166265346}] 

|| arr is needed in case arr contains no duplicates, in which case uniq! returns nil.

You could alternatively write

require 'set'
arr.uniq { |h| h.first.first.to_set }

(or uniq!).

To quote from uniq's doc, "self is traversed in order, and the first occurrence is kept."
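As a minimal, self-contained sketch of the above: the array here is a shortened, rounded version of the data in the question (an assumption made only to keep the example small), and the names deduped, copy and in_place are illustrative.

```ruby
# Shortened sample data; values are rounded for readability.
arr = [
  { [6, 8]  => 0.59 },
  { [6, 13] => 0.71 },
  { [8, 6]  => 0.59 },
  { [13, 6] => 0.71 },
  { [8, 13] => 0.72 }
]

# h.first is the hash's only [key, value] pair, so h.first.first is the key.
# Sorting the key makes [6, 8] and [8, 6] compare as equal.
deduped = arr.uniq { |h| h.first.first.sort }

# uniq! returns nil when nothing was removed, hence the `|| copy` fallback.
copy = deduped.dup
in_place = copy.uniq! { |h| h.first.first.sort } || copy

p deduped
```

Because the first occurrence is kept, the hashes keyed [8, 6] and [13, 6] are the ones dropped here.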

Upvotes: 0

tadman

Reputation: 211580

So long as your kappa function produces the same value for u,p as for p,u then you can do this:

@result = @user_array.each_with_object({ }) do |u, h|
  @user_array.each do |p|
    next if (u == p)

    h[[u, p].sort] ||= kappa(u, p, "ipf")
  end
end

That populates each value once and only once. If you want the last computed value to stick instead, change ||= to =.
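A possible variation, not part of the code above: Array#combination(2) yields each unordered pair only once, so the p == u check and the ||= guard are not needed. In this sketch the kappa method is a hypothetical stand-in for the helper in the question (assumed symmetric), and user_array is a local copy of the sample ids.

```ruby
# Hypothetical stand-in for the question's kappa helper; assumed symmetric.
def kappa(u, p, _mode)
  (u * p) / 1000.0
end

user_array = [6, 8, 13, 15]

# combination(2) yields each unordered pair once: [6, 8] but never [8, 6].
# Sorting the key keeps lookups predictable even if user_array is unsorted.
result = user_array.combination(2).each_with_object({}) do |(u, p), h|
  h[[u, p].sort] = kappa(u, p, "ipf")
end

p result.keys
```

With four ids this calls kappa six times rather than twelve, which may matter if the helper is expensive.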

Upvotes: 3

Makoto

Reputation: 106430

It seems like you could head this off at the pass if you sorted the arrays. Since you're stating that any permutation of a pair is equivalent, a sort before insertion would allow the hash to eliminate/overwrite any duplicate values.

@user_array.each do |u|
   @result << @user_array.map { |p| Hash[[u, p].sort => kappa(u, p, "ipf")] if p !=u  }
end

Upvotes: 2
