Reputation: 534

Merge two arrays of hashes based on comparison of multiple keys

I have two arrays of hashes:

a1 = [{ ID: 12496, name: "Robert", email: "[email protected]" }, ...]
a2 = [{ ID: 12496, name: "Robert", ORDER_NO: 5511426 }, ...]

I would like to find the hashes in a2 whose ID and name match the ID and name of an entry in a1 (regardless of email or any other keys that make their way into a2), and then merge the ORDER_NO value into the matching a1 hash, i.e. end up with:

[{ ID: 12496, name: "Robert", email: "[email protected]", ORDER_NO: 5511426 } ...]

I also want to ignore elements that are present in a2 but not in a1.

I'm doing the following:

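a1.each do |a1_hash|
  matching_hash = a2.find { |a2_hash| data_matches?(a1_hash, a2_hash) } if a2.present?
  next unless matching_hash.present?
  # a2_hash only exists inside the find block, so use matching_hash here;
  # the hashes use symbol keys, so the key is :ORDER_NO rather than "ORDER_NO".
  a1_hash[:ORDER_NO] = matching_hash[:ORDER_NO]
  a2.delete(matching_hash)
end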

but is there a faster way?

Upvotes: 1

Answers (3)

Ajedi32

Reputation: 48368

This can be done quite cleanly using a few of Ruby's built-in methods.

a1 = [{ ID: 12496, name: "Robert", email: "[email protected]" },
      { ID: 12497, name: "Lola",   email: "[email protected]" },
      { ID: 12498, name: "Hank",   email: "[email protected]" }]

a2 = [{ ID: 12497, name: "Lola",   ORDER_NO: 5511427 },
      { ID: 12496, name: "Robert", ORDER_NO: 5511426 }]

index = a2.group_by{|entry| [entry[:ID], entry[:name]] }
a1.map{|entry| (index[[entry[:ID], entry[:name]]] || []).reduce(entry, :merge) }

Result:

[{:ID=>12496, :name=>"Robert", :email=>"[email protected]", :ORDER_NO=>5511426},
 {:ID=>12497, :name=>"Lola",   :email=>"[email protected]",   :ORDER_NO=>5511427},
 {:ID=>12498, :name=>"Hank",   :email=>"[email protected]"}]

Breakdown:

First, we use group_by to build a table of the entries in a2 that could potentially be merged into entries in a1. We index this table on the :ID and :name keys, since those are the values we're using to determine which entries match:

index = a2.group_by{|entry| [entry[:ID], entry[:name]] }

This produces the result:

{[12497, "Lola"]=>[{:ID=>12497,   :name=>"Lola",   :ORDER_NO=>5511427}], 
 [12496, "Robert"]=>[{:ID=>12496, :name=>"Robert", :ORDER_NO=>5511426}]}

Next, we map each entry in a1 to its new form, with the order numbers in the index merged:

a1.map{|entry|
  # ...
}

To get the value we're mapping each entry to, we start by getting an array containing all the values in a2 which are suitable to merge with this entry from a1:

(index[[entry[:ID], entry[:name]]] || [])

This will return something like [{:ID=>12497, :name=>"Lola", :ORDER_NO=>5511427}] for Lola, and an empty array for Hank, who has no matching entry in a2.
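As a small illustration of that lookup, here is a sketch using the sample data. The helper name `candidates_for` is made up for this example, and `Hash#fetch` with a default is used as an equivalent alternative to `|| []`:

```ruby
# Hypothetical helper: returns the a2 entries sharing :ID and :name with entry.
def candidates_for(index, entry)
  # Array keys are compared by value, so a freshly built [id, name] pair
  # finds its group. fetch with a default of [] behaves like
  # `index[key] || []` when the key is missing.
  index.fetch([entry[:ID], entry[:name]], [])
end

a2 = [{ ID: 12497, name: "Lola",   ORDER_NO: 5511427 },
      { ID: 12496, name: "Robert", ORDER_NO: 5511426 }]

index = a2.group_by { |entry| [entry[:ID], entry[:name]] }

p candidates_for(index, { ID: 12497, name: "Lola" })  # one-element array with Lola's a2 entry
p candidates_for(index, { ID: 12498, name: "Hank" })  # empty array, Hank is not in a2
```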

Then, starting from the entry from a1, we reduce all the entries from the index to one hash using merge (i.e. reduce(entry, :merge)), which results in an entry like {:ID=>12496, :name=>"Robert", :email=>"[email protected]", :ORDER_NO=>5511426}.
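To see what the reduce step does in isolation, here is a minimal sketch. The helper name `merge_matches`, the placeholder email address, and the second order entry are all invented for the illustration; the second entry is there to show that several matches are folded in one after another, with later values winning on key collisions:

```ruby
# Hypothetical helper: folds every matching hash into entry.
# Hash#merge returns a new hash each time, so entry itself is not modified.
def merge_matches(entry, matches)
  matches.reduce(entry, :merge)
end

entry   = { ID: 12496, name: "Robert", email: "robert@example.com" } # placeholder address
matches = [{ ID: 12496, name: "Robert", ORDER_NO: 5511426 },
           { ID: 12496, name: "Robert", ORDER_NO: 5511430 }]         # invented second order

merged = merge_matches(entry, matches)
p merged[:ORDER_NO]        # 5511430, the last match wins
p entry.key?(:ORDER_NO)    # false, the original entry is untouched

# With no matches, reduce simply returns its initial value: the same entry object.
p merge_matches(entry, []).equal?(entry)  # true
```

One consequence worth knowing: unmatched entries in the mapped result are the very same hash objects as in a1, not copies.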

All this might seem a bit complicated if you're unfamiliar with the methods in Ruby's core library. But once you understand simple functional programming concepts like map and reduce, it's really not all that difficult to come up with simple and powerful solutions like this.

Upvotes: 2

Cary Swoveland

Reputation: 110675

Suppose:

a1 = [{ ID: 12496, name: "Robert", email: "[email protected]" },
      { ID: 12497, name: "Lola",   email: "[email protected]" },
      { ID: 12498, name: "Hank",   email: "[email protected]" }]

a2 = [{ ID: 12497, name: "Lola",   ORDER_NO: 5511427 },
      { ID: 12496, name: "Robert", ORDER_NO: 5511426 }]

I suggest you first construct the hash:

h2 = a2.each_with_object({}) { |g,h| h[[g[:ID], g[:name]]]=g[:ORDER_NO] }
  #=> { [12497, "Lola"]=>5511427, [12496, "Robert"]=>5511426 }

then simply step through the elements of a1, adding key-value pairs where appropriate:

a1.each do |g|
  k = [g[:ID],g[:name]]
  g[:ORDER_NO] = h2[k] if h2.key?(k)
end
a1
  #=> [{ID: 12496, name: "Robert", email: "[email protected]", ORDER_NO: 5511426},
  #    {ID: 12497, name: "Lola",   email: "[email protected]",   ORDER_NO: 5511427},
  #    {ID: 12498, name: "Hank",   email: "[email protected]"}]

I have assumed:

no two elements (hashes) in a1 have the same values for both :ID and :name;
no two elements (hashes) in a2 have the same values for both :ID and :name; and
a1 is to be mutated.
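If the last assumption does not hold and a1 should be left alone, a non-mutating variant could look like the sketch below. The method name `with_order_numbers` is hypothetical, and `Array#to_h` with a block assumes Ruby 2.6 or later. If a2 did repeat an [ID, name] pair, the last entry would win here.

```ruby
# Hypothetical helper: returns new hashes and leaves a1 and a2 untouched.
def with_order_numbers(a1, a2)
  # Build the lookup table: [ID, name] => ORDER_NO.
  order_by_key = a2.to_h { |g| [[g[:ID], g[:name]], g[:ORDER_NO]] }

  a1.map do |g|
    k = [g[:ID], g[:name]]
    # merge returns a new hash; dup copies unmatched entries so the
    # result never shares hash objects with a1.
    order_by_key.key?(k) ? g.merge(ORDER_NO: order_by_key[k]) : g.dup
  end
end

a1 = [{ ID: 12496, name: "Robert" }, { ID: 12498, name: "Hank" }]
a2 = [{ ID: 12496, name: "Robert", ORDER_NO: 5511426 }]

result = with_order_numbers(a1, a2)
p result.first[:ORDER_NO]      # 5511426
p a1.first.key?(:ORDER_NO)     # false, a1 is unchanged
```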

Upvotes: 1

davidrac

Reputation: 10738

You can do it faster by first indexing each array in a hash keyed by the attributes you are matching on, then merging the two hashes and taking the values (this assumes that each combination of those attributes is unique within its input array).

x1 = a1.reduce({}){|m, h| m[h.select{|k| [:ID, :name].include? k}] = h;m}
x2 = a2.reduce({}){|m, h| m[h.select{|k| [:ID, :name].include? k}] = h;m}
x1.merge(x2.select{|k,v| x1.key?(k)}){|k,o,n| o.merge(n)}.values

Running with your example data:

a1 = [{ ID: 12496, name: "Robert", email: "[email protected]" }]
=> [{:ID=>12496, :name=>"Robert", :email=>"[email protected]"}]

a2 = [{ ID: 12496, name: "Robert", ORDER_NO: 5511426 }]
=> [{:ID=>12496, :name=>"Robert", :ORDER_NO=>5511426}]

x1 = a1.reduce({}){|m, h| m[h.select{|k| [:ID, :name].include? k}] = h;m}
=> {{:ID=>12496, :name=>"Robert"}=>{:ID=>12496, :name=>"Robert", :email=>"[email protected]"}}

x2 = a2.reduce({}){|m, h| m[h.select{|k| [:ID, :name].include? k}] = h;m}
=> {{:ID=>12496, :name=>"Robert"}=>{:ID=>12496, :name=>"Robert", :ORDER_NO=>5511426}}

x1.merge(x2.select{|k,v| x1.key?(k)}){|k,o,n| o.merge(n)}.values
=> [{:ID=>12496, :name=>"Robert", :email=>"[email protected]", :ORDER_NO=>5511426}]
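To check the speed difference on your own data sizes, a rough timing sketch follows. The method names, the synthetic data, and the `elapsed` helper are all invented for this example, and it assumes Ruby 2.6+ (`Array#to_h` with a block, `Hash#slice`) and unique :ID/:name pairs. The scan version compares each a1 entry against up to every a2 entry, while the indexed version does one hash lookup per entry, so the gap should widen as the arrays grow.

```ruby
# Quadratic approach: scan a2 for every entry of a1.
def merge_by_scan(a1, a2)
  a1.map do |h|
    match = a2.find { |g| g[:ID] == h[:ID] && g[:name] == h[:name] }
    match ? h.merge(ORDER_NO: match[:ORDER_NO]) : h
  end
end

# Indexed approach: key a2 by its :ID/:name sub-hash, then look each entry up.
def merge_by_index(a1, a2)
  index = a2.to_h { |g| [g.slice(:ID, :name), g] }
  a1.map do |h|
    match = index[h.slice(:ID, :name)]
    match ? h.merge(ORDER_NO: match[:ORDER_NO]) : h
  end
end

# Returns the wall-clock seconds taken by the given block.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Synthetic data, made up purely for the measurement.
n  = 2_000
a1 = Array.new(n) { |i| { ID: i, name: "user#{i}", email: "user#{i}@example.com" } }
a2 = Array.new(n) { |i| { ID: i, name: "user#{i}", ORDER_NO: 5_000_000 + i } }.shuffle

puts format("scan:  %.4fs", elapsed { merge_by_scan(a1, a2) })
puts format("index: %.4fs", elapsed { merge_by_index(a1, a2) })
```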

Upvotes: 0
