Scoala

Reputation: 150

How can I merge duplicate elements of array while keeping the values combined?

I have something like this:

prods = [{"1050" => {"key" => "value", "key2" => "value2"}},
         {"1050" => {"key" => "value", "key2" => "value2"}},
         {"6650" => {"key" => "value", "key2" => "value2"}},
         {"6650" => {"key" => "value", "key2" => "value2"}}]

And I would like to merge the duplicates but keep the key-value pairs, like this:

prods = [{"1050" => [{"key" => "value", "key2" => "value2"},
                     {"key" => "value", "key2" => "value2"}]},
         {"6650" => [{"key" => "value", "key2" => "value2"},
                     {"key" => "value", "key2" => "value2"}]}
        ]

Would this be possible?

Upvotes: 1

Views: 731

Answers (3)

Cary Swoveland

Reputation: 110675

This is one way among many that you could do that.

Code

def combine(prods)
  prods.map(&:flatten)
       .each_with_object(Hash.new {|h,k| h[k]=[]}) { |(k,v),h| h[k] << v }
       .map { |k,v| { k=>v } }
end

Examples

For your value of prods:

combine(prods)   
  #=> [{"1050"=>[{"key"=>"value", "key2"=>"value2"},
  #              {"key"=>"value", "key2"=>"value2"}]},
  #    {"6650"=>[{"key"=>"value", "key2"=>"value2"},
  #              {"key"=>"value", "key2"=>"value2"}]}]

Now let's redefine prods:

prods = [{"1050" => {"keya" => "value1", "keyb" => "value1"}},
         {"1050" => {"keya" => "value2", "keyb" => "value2"}},
         {"6650" => {"keya" => "value3", "keyb" => "value3"}},
         {"6650" => {"keya" => "value4", "keyb" => "value4"}}]
combine(prods)   
  #=> [{"1050"=>[{"keya"=>"value1", "keyb"=>"value1"},
  #              {"keya"=>"value2", "keyb"=>"value2"}]},
  #    {"6650"=>[{"keya"=>"value3", "keyb"=>"value3"},
  #              {"keya"=>"value4", "keyb"=>"value4"}]}]

Explanation

These are the steps:

a = prods.map(&:flatten)
  #=> [["1050", {"key"=>"value", "key2"=>"value2"}],
  #    ["1050", {"key"=>"value", "key2"=>"value2"}],
  #    ["6650", {"key"=>"value", "key2"=>"value2"}],
  #    ["6650", {"key"=>"value", "key2"=>"value2"}]] 

h = a.each_with_object(Hash.new {|h,k| h[k]=[]}) { |(k,v),h| h[k] << v }
  #=> {"1050"=>[{"key"=>"value", "key2"=>"value2"},
  #             {"key"=>"value", "key2"=>"value2"}],
  #    "6650"=>[{"key"=>"value", "key2"=>"value2"},
  #             {"key"=>"value", "key2"=>"value2"}]} 

Lastly,

h.map { |k,v| { k=>v } }

produces the result shown above.

In computing h, Enumerable#each_with_object's object is the value of the block variable h. Initially, h is an empty hash created as follows:

Hash.new {|h,k| h[k]=[]}

The block supplies the hash's default value. It is called when h[k] is evaluated for a key k that h does not have; it then stores an empty array under k and returns that array. The first element of a passed to each_with_object's block is:

["1050", {"key"=>"value", "key2"=>"value2"}]

The block variables are therefore assigned as follows:

(k,v),h = [["1050", {"key"=>"value", "key2"=>"value2"}], {}]
  #=> [["1050", {"key"=>"value", "key2"=>"value2"}], {}] 
k #=> "1050" 
v #=> {"key"=>"value", "key2"=>"value2"} 
h #=> {} 

and the block calculation is:

h[k] << v

which is:

h["1050"] << {"key"=>"value", "key2"=>"value2"}

Since h does not have a key "1050", h["1050"] is first assigned its default value, an empty array, so we have:

(h["1050"] = []) << {"key"=>"value", "key2"=>"value2"}

The hash h is now:

h #=> { "1050"=>[{"key"=>"value", "key2"=>"value2"}] }

The next element of a is passed to the block, causing the block variables to be updated as follows:

(k,v),h = [["1050", {"key"=>"value", "key2"=>"value2"}],
           { "1050"=>[{"key"=>"value", "key2"=>"value2"}] }]
k #=> "1050" 
v #=> {"key"=>"value", "key2"=>"value2"} 
h #=> {"1050"=>[{"key"=>"value", "key2"=>"value2"}]} 

The block calculation is therefore:

h[k] << v
  # h["1050"] << {"key"=>"value", "key2"=>"value2"}

As h now has the key "1050" (whose value is an array), the default block is not called and the hash h becomes:

h #=> {"1050"=>[{"key"=>"value", "key2"=>"value2"},
  #             {"key"=>"value", "key2"=>"value2"}]} 

The remaining calculations are performed similarly.
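
To see the default block in isolation, here is a minimal sketch. The counter calls is something I added purely for illustration (it is not part of the answer's code); it records how often the default block actually runs.

```ruby
calls = 0  # illustrative counter, not part of the original answer

# The block runs only when a missing key is read; it stores [] under that key.
h = Hash.new { |hash, key| calls += 1; hash[key] = [] }

h["1050"] << { "key" => "value" }  # missing key: block runs, stores [], then << appends
h["1050"] << { "key" => "value" }  # key present: block is skipped
h["6650"] << { "key" => "value" }  # missing key again: block runs

p calls  # number of times the default block was invoked
p h
```

The block should fire once per distinct key, which is why the second append for "1050" lands in the array that already exists.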

Upvotes: 1

Jesus Castello

Reputation: 1113

Here is my proposed solution:

results =
prods.each_with_object(Hash.new([])) do |hash, results|
  key    = hash.keys.first
  values = hash.values

  results[key] += values
end

results = results.map { |k, v| Hash[k, v] }

In this solution I just use a hash with a default value to handle the duplicates, then convert to the desired output format.
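
One detail worth spelling out: with Hash.new([]) every missing key shares the same default array, so the code above relies on += (which builds a new array and assigns it) rather than <<. A small sketch of the difference, using made-up variable names shared and safe:

```ruby
# Hash.new([]) hands back the SAME array object for every missing key.
shared = Hash.new([])
shared["1050"] << "a"    # mutates the shared default; no key is stored
p shared.keys            # the hash is still empty
p shared["6650"]         # the default itself now contains "a"

# += creates a new array and assigns it, so each key gets its own entry.
safe = Hash.new([])
safe["1050"] += ["a"]
safe["6650"] += ["b"]
p safe
```

This is only meant to show why the assignment form matters here; swapping += for << in the solution would silently leave results empty.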


Alternative solution:

def find_hash(haystack, needle)
  haystack.index { |hay| hay.keys.first == needle }
end

results =
prods.each_with_object(Array.new) do |hash, results|
  key    = hash.keys.first
  values = hash.values

  idx = find_hash(results, key)

  if idx
    results[idx][key] += values
  else
    results << Hash[key, values]
  end
end

Here I check whether a hash with the given key already exists in the results array. If it does, I append to it; otherwise I create a new hash and add it to the array.

Upvotes: 0

Brian

Reputation: 967

h = Hash.new {[]}   # returns a new array (without storing it) when a key doesn't exist
prods.each do |prod|
  prod.each{ |key,val| h[key] = h[key] << val }
end
puts h
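
A short sketch of why the assignment h[key] = ... is needed above, plus one possible way to get from the resulting hash to the array-of-hashes layout asked for in the question. The sample prods is trimmed and the final map step is my addition, not part of this answer:

```ruby
# Trimmed sample data in the same shape as the question's prods.
prods = [{ "1050" => { "key" => "value" } },
         { "1050" => { "key" => "value" } },
         { "6650" => { "key" => "value" } }]

h = Hash.new { [] }      # the block returns a fresh array but never stores it
h["missing"] << 1        # appends to a throwaway array
p h.key?("missing")      # nothing was stored, hence the explicit assignment below

prods.each do |prod|
  prod.each { |key, val| h[key] = h[key] << val }
end

# Wrap each pair in its own hash to match the layout from the question.
result = h.map { |key, vals| { key => vals } }
p result
```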

Upvotes: 0
