KyKy

Reputation: 53

Ruby - How to remove duplicates in array of hashes?

I have an array of hashes within an array of hashes. I'd like to remove duplicates based on the values of the inner arrays.

topics = [{"defense"=>
   [{:id=>30,
     :source=>"Hacker News",
     :title=>
      "China-based campaign breached satellite, defense companies: Symantec",
     :link=>
      "https://www.reuters.com/article/us-china-usa-cyber/china-based-campaign-breached-satellite-defense-companies-symantec-idUSKBN1JF2X0"}]},
 {"companies"=>
   [{:id=>30,
     :source=>"Hacker News",
     :title=>
      "China-based campaign breached satellite, defense companies: Symantec",
     :link=>
      "https://www.reuters.com/article/us-china-usa-cyber/china-based-campaign-breached-satellite-defense-companies-symantec-idUSKBN1JF2X0"}]},
 {"Symantec"=>
   [{:id=>30,
     :source=>"Hacker News",
     :title=>
      "China-based campaign breached satellite, defense companies: Symantec",
     :link=>
      "https://www.reuters.com/article/us-china-usa-cyber/china-based-campaign-breached-satellite-defense-companies-symantec-idUSKBN1JF2X0"}]}]

topics.uniq { |phrase, post| post }
puts topics

You can see above that the phrases defense, companies, and Symantec each contain identical arrays. How can I keep only the first hash that contains one of the identical arrays?

Expected output:

{"defense"=>
  [{:id=>30,
    :source=>"Hacker News",
    :title=>
     "China-based campaign breached satellite, defense companies: Symantec",
    :link=>
     "https://www.reuters.com/article/us-china-usa-cyber/china-based-campaign-breached-satellite-defense-companies-symantec-idUSKBN1JF2X0"}]}

Note: in the above example each inner array of "phrases" only contains one hash, but in the application it could contain several posts.

Upvotes: 2

Views: 1633

Answers (3)

Cary Swoveland

Reputation: 110755

topics = [
  { "defense"   => [{ id: 30, source: "Hacker", title: "China", link: "F2X0"}] },
  { "companies" => [{ id: 30, source: "Hacker", title: "China", link: "F2X0"}] },
  { "Symantec"  => [{ id: 30, source: "Hacker", title: "China", link: "F2X0"}] }
]

topics.uniq { |h| h.values }
  #=> [{"defense"=>[{:id=>30, :source=>"Hacker", :title=>"China", :link=>"F2X0"}]}]

See Array#uniq for the case when uniq employs a block. Note the sentence, "self is traversed in order, and the first occurrence is kept."
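A minimal sketch of the same call when an inner array holds several posts, which the question says can happen. The shortened data and the names `post_a`, `post_b`, and `deduped` are made up for illustration. Since each hash has a single key, `h.values` wraps the whole inner array, so two hashes count as duplicates only when their full post lists match.

```ruby
# Hypothetical, shortened posts for illustration only.
post_a = { id: 30, title: "China" }
post_b = { id: 31, title: "Other" }

topics = [
  { "defense"   => [post_a, post_b] },
  { "companies" => [post_a, post_b] },  # same post list as "defense"
  { "Symantec"  => [post_a] }           # different post list
]

# uniq returns a new array; topics itself is left unchanged.
deduped = topics.uniq { |h| h.values }

p deduped.map(&:keys).flatten  # => ["defense", "Symantec"]
p topics.size                  # => 3
```

Because `uniq` does not modify its receiver, the result has to be assigned (as above) or `uniq!` used with the same block to change `topics` in place.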

Upvotes: 0

iGian

Reputation: 11203

With this solution you get only the array of posts, without the phrase keys:

topics.map { |topic| topic.values }.uniq.flatten

It returns just:

# => [{:id=>30, :source=>"Hacker News", :title=>"China-based campaign breached satellite, defense companies: Symantec", :link=>"https://www.reuters.com/article/us-china-usa-cyber/china-based-campaign-breached-satellite-defense-companies-symantec-idUSKBN1JF2X0"}]
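A small sketch of how this behaves once phrases hold different post lists, using made-up data (`post_a`, `post_b`) and hypothetical variable names. Calling `uniq` before `flatten` compares whole post lists, so a post shared by two different lists still appears twice. Flattening first is one possible variation if unique posts are what is wanted.

```ruby
# Hypothetical, shortened posts for illustration only.
post_a = { id: 30, title: "China" }
post_b = { id: 31, title: "Other" }

topics = [
  { "defense"  => [post_a, post_b] },
  { "Symantec" => [post_a] }
]

# uniq before flatten: compares whole post lists, so post_a appears twice.
by_list = topics.map { |topic| topic.values }.uniq.flatten

# flatten before uniq: compares individual posts instead.
by_post = topics.flat_map(&:values).flatten.uniq

p by_list.map { |post| post[:id] }  # => [30, 31, 30]
p by_post.map { |post| post[:id] }  # => [30, 31]
```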

Upvotes: 0

matthewd

Reputation: 4435

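topics.invert.invert will reduce a hash to a single key for each unique value. This requires topics to be a single hash of phrase => posts rather than an array of hashes, since Array has no invert method. When several keys share a value, invert keeps the last one, so the surviving phrase is the last, not the first.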
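A minimal sketch of applying this to the question's data shape. `invert` is a Hash method, so the sketch first merges the array of single-key hashes into one hash. That merge step, the shortened data, and the names `merged` and `deduped` are assumptions added for illustration, and merging is only lossless if every phrase is unique.

```ruby
# Hypothetical, shortened post for illustration only.
post = { id: 30, title: "China" }

topics = [
  { "defense"   => [post] },
  { "companies" => [post] },
  { "Symantec"  => [post] }
]

# Assumption: phrases are unique, so merging loses nothing.
merged = topics.reduce({}, :merge)

# First invert: post list => phrase (later phrases overwrite earlier ones).
# Second invert: phrase => post list again, one phrase per unique list.
deduped = merged.invert.invert

p deduped.keys  # => ["Symantec"]
```

The result is a hash rather than an array of hashes, and the phrase that survives here is the last one, so this does not match the expected output in the question, which keeps "defense".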

Upvotes: 3
