user3253255

Reputation: 13

Ruby combining hashes in an array based on one hash value

I have an array of hashes that looks like:

[
  {"id"=>1, "name"=>"Batman", "net_worth"=>100, "vehicles"=>2},
  {"id"=>1, "name"=>"Batman", "net_worth"=>100, "vehicles"=>2},
  {"id"=>2, "name"=>"Superman", "net_worth"=>100, "vehicles"=>2},
  {"id"=>3, "name"=>"Wonderwoman", "net_worth"=>100, "vehicles"=>2}
]

I'd like to combine hashes based on the id value while preserving it, preserve the name, and sum the net_worth and vehicles values.

So the final array would look like:

[
  {"id"=>1, "name"=>"Batman", "net_worth"=>200, "vehicles"=>4},
  {"id"=>2, "name"=>"Superman", "net_worth"=>100, "vehicles"=>2},
  {"id"=>3, "name"=>"Wonderwoman", "net_worth"=>100, "vehicles"=>2}
]

Upvotes: 0

Views: 379

Answers (2)

Cary Swoveland

Reputation: 110645

Here are two ways of doing it that work with any number of key-value pairs, and do not depend on the names of keys (other than "id" and "name", of course, which are part of the specification).

Using update

This approach uses the form of Hash#update (aka merge!) that takes a block to determine the values of keys that are present in both hashes:

arr = [
  {"id"=>1, "name"=>"Batman",      "net_worth"=>100, "vehicles"=>2}, 
  {"id"=>1, "name"=>"Batman",      "net_worth"=>100, "vehicles"=>2}, 
  {"id"=>2, "name"=>"Superman",    "net_worth"=>100, "vehicles"=>2},
  {"id"=>3, "name"=>"Wonderwoman", "net_worth"=>100, "vehicles"=>2}
]   

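arr.each_with_object({}) { |g, h|
  # Key each row by its id; the outer block runs only when that id was already seen.
  h.update(g["id"] => g.dup) { |_, oh, nh|
    # Keep "id" and "name" from the stored hash; sum every other value.
    oh.update(nh) { |k, ov, nv| ['id', 'name'].include?(k) ? ov : ov + nv }
  }
}.values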
  #=> [{"id"=>1, "name"=>"Batman", "net_worth"=>200, "vehicles"=>4}, 
  #    {"id"=>2, "name"=>"Superman", "net_worth"=>100, "vehicles"=>2},
  #    {"id"=>3, "name"=>"Wonderwoman", "net_worth"=>100,"vehicles"=>2}]   
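To see what the block form of Hash#update does on its own, here is a minimal sketch that merges just two rows. The variable names old_row, new_row and merged are mine, not part of the answer, and the second row's numbers are changed only to make the sums easy to spot.

```ruby
# The block runs only for keys present in both hashes; its return value
# becomes the value stored under that key.
old_row = { "id" => 1, "name" => "Batman", "net_worth" => 100, "vehicles" => 2 }
new_row = { "id" => 1, "name" => "Batman", "net_worth" => 50,  "vehicles" => 1 }

# dup first so that old_row itself is left untouched (update mutates its receiver).
merged = old_row.dup.update(new_row) do |key, old_value, new_value|
  %w[id name].include?(key) ? old_value : old_value + new_value
end

p merged
```

Here merged has "net_worth" equal to 150 and "vehicles" equal to 3, while "id" and "name" keep their original values.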

Using group_by

This could also be done by using Enumerable#group_by, as @maxd has done, but the following is a more compact and general implementation:

arr.map(&:dup).
    group_by { |row| row['id'] }.
    map { |_, rows|
      # reduce without an initial value starts from the first row of each group
      rows.reduce { |h, g|
        (g.keys - ['id', 'name']).each { |k| h[k] += g[k] }
        h
      }
    }

  #=> [{"id"=>1, "name"=>"Batman", "net_worth"=>200, "vehicles"=>4}, 
  #    {"id"=>2, "name"=>"Superman", "net_worth"=>100, "vehicles"=>2},
  #    {"id"=>3, "name"=>"Wonderwoman", "net_worth"=>100,"vehicles"=>2}]   

arr.map(&:dup) avoids mutating the hashes in arr. I used reduce without an argument so that the first hash of each group serves as the accumulator, which avoids the need for copying the key-value pairs having keys "id" and "name".
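The two points above can be checked with a small sketch on a single group of rows. The names rows, copies, total, same_object and original_net_worth are illustrative only.

```ruby
rows = [
  { "id" => 1, "name" => "Batman", "net_worth" => 100, "vehicles" => 2 },
  { "id" => 1, "name" => "Batman", "net_worth" => 100, "vehicles" => 2 }
]

# Shallow copies are enough here because the values are integers and strings
# that are never modified in place.
copies = rows.map(&:dup)

# Without an initial value, reduce uses the first element as the accumulator,
# so "id" and "name" are already present and only the other keys are summed.
total = copies.reduce do |acc, row|
  (row.keys - %w[id name]).each { |k| acc[k] += row[k] }
  acc
end

same_object = total.equal?(copies.first)      # the accumulator is the first copy itself
original_net_worth = rows.first["net_worth"]  # the original hash is unchanged
```

After this runs, same_object is true and original_net_worth is still 100, whereas total["net_worth"] is 200. Dropping the map(&:dup) step would instead modify the first hash of rows in place.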

Upvotes: 1

Maxim

Reputation: 9961

Here is a solution to your problem. Group the rows by id and name, then sum the other values and build the result:

rows = [
    {"id"=>1, "name"=>"Batman", "net_worth"=>100, "vehicles"=>2},
    {"id"=>1, "name"=>"Batman", "net_worth"=>100, "vehicles"=>2},
    {"id"=>2, "name"=>"Superman", "net_worth"=>100, "vehicles"=>2},
    {"id"=>3, "name"=>"Wonderwoman", "net_worth"=>100, "vehicles"=>2}
]

groups = rows.group_by {|row| [row['id'], row['name']] }

result = groups.map do |key, values|
  id, name = *key

  total_net_worth = values.reduce(0) {|sum, value| sum + value['net_worth'] }
  total_vehicles = values.reduce(0) {|sum, value| sum + value['vehicles'] }

  { "id" => id, "name" => name, "net_worth" => total_net_worth, "vehicles" => total_vehicles }
end

p result
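If more numeric columns are added later, the same grouping idea can be written without one reduce line per column. This is only a sketch of a variation, not the code above: the constant SUMMED_KEYS is a hypothetical name, and the block form of Array#to_h assumes Ruby 2.6 or later (Enumerable#sum needs 2.4 or later).

```ruby
rows = [
  { "id" => 1, "name" => "Batman", "net_worth" => 100, "vehicles" => 2 },
  { "id" => 1, "name" => "Batman", "net_worth" => 100, "vehicles" => 2 },
  { "id" => 2, "name" => "Superman", "net_worth" => 100, "vehicles" => 2 },
  { "id" => 3, "name" => "Wonderwoman", "net_worth" => 100, "vehicles" => 2 }
]

# Hypothetical list of the columns to total; extend it as needed.
SUMMED_KEYS = %w[net_worth vehicles].freeze

result = rows.group_by { |row| row.values_at("id", "name") }.map do |(id, name), group|
  # Build { "net_worth" => total, "vehicles" => total } for this group.
  totals = SUMMED_KEYS.to_h { |key| [key, group.sum { |row| row[key] }] }
  { "id" => id, "name" => name }.merge(totals)
end

p result
```

The grouping key is still the [id, name] pair, so the output keeps one hash per hero with the listed columns summed.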

Upvotes: 2
