Reputation: 55
arr1 = [
{entity_type: "Mac", entity_ids: [3], cascade_id: 2, location_id: 1},
{entity_type: "Mac", entity_ids: [2], cascade_id: 2, location_id: 1},
{entity_type: "Mac", entity_ids: [9], cascade_id: 4, location_id: 1},
{entity_type: "Mac", entity_ids: [10], cascade_id: 4, location_id: 1}
]
This is part of the data that I get after some processing. My desired output for this example is:
[{entity_type: "Mac", entity_ids: [3,2], cascade_id: 2, location_id: 1},
{entity_type: "Mac", entity_ids: [9,10], cascade_id: 4, location_id: 1}]
I want to know how to merge hashes whose other key-value pairs are the same, combining the values of the remaining key (entity_ids) into a single array.
Here is one more instance:
arr2 = [
{entity_type: "Sub", entity_ids: [7], mac_id: 5, cascade_id: 1, location_id: 1},
{entity_type: "Sub", entity_ids: [10], mac_id: 5, cascade_id: 1, location_id: 1},
{entity_type: "Sub", entity_ids: [4], mac_id: 2, cascade_id: 1, location_id: 1},
{entity_type: "Sub", entity_ids: [11], mac_id: 7, cascade_id: 2, location_id: 2}
]
The desired output for this instance is:
[{entity_type: "Sub", entity_ids: [7, 10], mac_id: 5, cascade_id: 1, location_id: 1},
{entity_type: "Sub", entity_ids: [4], mac_id: 2, cascade_id: 1, location_id: 1},
{entity_type: "Sub", entity_ids: [11], mac_id: 7, cascade_id: 2, location_id: 2}]
Upvotes: 1
Views: 2150
Reputation: 285
There are two separate challenges in your problem.
Problem 1:
To get custom behaviour while merging, you can pass a block to the merge method. The block takes the key and the left and right values, and it is called only for keys present in both hashes. In your scenario you want to combine the arrays when key == :entity_ids and keep the left value otherwise.
one_entity.merge(other_entity){ |key, left, right|
key== :entity_ids ? left + right : left
}
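As a minimal runnable sketch of the block form of merge, assuming one_entity and other_entity are the first two hashes of arr1 from the question:

```ruby
# Two hashes that differ only in :entity_ids (the first two elements of arr1).
one_entity   = { entity_type: "Mac", entity_ids: [3], cascade_id: 2, location_id: 1 }
other_entity = { entity_type: "Mac", entity_ids: [2], cascade_id: 2, location_id: 1 }

# The block is called once for each key that exists in both hashes.
merged = one_entity.merge(other_entity) do |key, left, right|
  key == :entity_ids ? left + right : left
end

p merged[:entity_ids]
```

Here every key is shared, so the block runs four times; only :entity_ids is concatenated, and the other keys keep the value from one_entity.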
Problem 2:
To merge entities based on whether their other attributes match, I am using group_by. This gives a hash whose values are arrays of entities that can be merged, which I can then map over and merge. Note that :cascade_id is part of the grouping key; without it, the cascade 2 and cascade 4 entities in your first example would end up in the same group.
actual.group_by {|x| [x[:entity_type], x[:mac_id], x[:cascade_id], x[:location_id]]}
Combining the two gives the whole solution. You can add more attributes in the group_by block if you want.
actual.group_by {|x| [x[:entity_type], x[:mac_id], x[:cascade_id], x[:location_id]]}
.map{|_, entities| entities.reduce({}) { |result, entity|
result.merge(entity){|key, left, right|
key == :entity_ids ? left + right : left
}
}
}
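A minimal end-to-end sketch of this approach, wrapped in a hypothetical method merge_entities. It assumes :cascade_id belongs in the grouping key alongside the other attributes, since the question's first example keeps cascade 2 and cascade 4 apart. GROUP_KEYS is a name introduced here for illustration only.

```ruby
# Attributes that must match for two entities to be merged (an assumption
# based on the question's desired output; adjust as needed).
GROUP_KEYS = %i[entity_type mac_id cascade_id location_id].freeze

def merge_entities(entities)
  entities
    .group_by { |entity| entity.values_at(*GROUP_KEYS) } # missing keys become nil
    .map do |_, group|
      # Fold each group into one hash, concatenating only :entity_ids.
      group.reduce do |result, entity|
        result.merge(entity) { |key, left, right| key == :entity_ids ? left + right : left }
      end
    end
end

arr1 = [
  { entity_type: "Mac", entity_ids: [3],  cascade_id: 2, location_id: 1 },
  { entity_type: "Mac", entity_ids: [2],  cascade_id: 2, location_id: 1 },
  { entity_type: "Mac", entity_ids: [9],  cascade_id: 4, location_id: 1 },
  { entity_type: "Mac", entity_ids: [10], cascade_id: 4, location_id: 1 }
]

result = merge_entities(arr1)
p result.map { |entity| entity[:entity_ids] }
```

Because arr1 has no :mac_id, values_at returns nil for that slot in every key, so the same method handles both of the question's arrays.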
Upvotes: 0
Reputation: 110645
You can compute the desired result as follows.
def doit(arr)
arr.each_with_object({}) do |g,h|
h.update(g.reject { |k,_| k==:entity_ids }=>g) do |_,o,n|
o.merge(entity_ids: o[:entity_ids] + n[:entity_ids])
end
end.values
end
doit(arr1)
#=> [{:entity_type=>"Mac", :entity_ids=>[3, 2], :cascade_id=>2, :location_id=>1},
# {:entity_type=>"Mac", :entity_ids=>[9, 10], :cascade_id=>4, :location_id=>1}]
doit(arr2)
#=> [{:entity_type=>"Sub", :entity_ids=>[7, 10], :mac_id=>5, :cascade_id=>1,
# :location_id=>1},
# {:entity_type=>"Sub", :entity_ids=>[4], :mac_id=>2, :cascade_id=>1,
# :location_id=>1},
# {:entity_type=>"Sub", :entity_ids=>[11], :mac_id=>7, :cascade_id=>2,
# :location_id=>2}]
This uses the form of Hash#update (aka merge!) that employs a block to determine the values of keys that are present in both hashes being merged. See the doc for an explanation of the three block variables: the common key (here _) and the old and new values (here o and n).
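A small sketch of when that block is and is not consulted, using made-up keys :a and :b rather than the question's data:

```ruby
h = { a: [1] }

# :b is not yet in h, so the block is skipped and the value is stored as-is.
h.update(b: [2]) { |_key, old_value, new_value| old_value + new_value }

# :a is already in h, so the block decides the merged value.
h.update(a: [3]) { |_key, old_value, new_value| old_value + new_value }

p h
```

After both calls h[:a] is [1, 3] and h[:b] is [2]. Unlike merge, update changes h in place.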
If doit's argument is arr1, the steps are as follows.
arr = arr1
e = arr.each_with_object({})
#=> #<Enumerator: [{:entity_type=>"Mac", :entity_ids=>[3], :cascade_id=>2,
# :location_id=>1},
# {:entity_type=>"Mac", :entity_ids=>[2], :cascade_id=>2,
# :location_id=>1},
# {:entity_type=>"Mac", :entity_ids=>[9], :cascade_id=>4,
# :location_id=>1},
# {:entity_type=>"Mac", :entity_ids=>[10], :cascade_id=>4,
# :location_id=>1}
# ]:each_with_object({})>
The first element of the enumerator is passed to the block and values are assigned to the block variables.
g, h = e.next
#=> [{:entity_type=>"Mac", :entity_ids=>[3], :cascade_id=>2, :location_id=>1}, {}]
g #=> {:entity_type=>"Mac", :entity_ids=>[3], :cascade_id=>2, :location_id=>1}
h #=> {}
Compute the (only) key for the hash to be merged with h.
a = g.reject { |k,_| k==:entity_ids }
#=> {:entity_type=>"Mac", :cascade_id=>2, :location_id=>1}
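The approach relies on a hash being usable as a hash key: two hashes with equal contents are eql? and produce the same hash value, so they address the same entry. A brief sketch, where g1, g2 and lookup are names introduced for illustration:

```ruby
g1 = { entity_type: "Mac", entity_ids: [3], cascade_id: 2, location_id: 1 }
g2 = { entity_type: "Mac", entity_ids: [2], cascade_id: 2, location_id: 1 }

# Strip :entity_ids from each element to get its grouping key.
k1 = g1.reject { |k, _| k == :entity_ids }
k2 = g2.reject { |k, _| k == :entity_ids }

same_key = k1.eql?(k2) && k1.hash == k2.hash

# A value stored under k1 can be looked up with k2.
lookup = { k1 => :found }
p same_key, lookup[k2]
```

This is why the second element of arr1 collides with the first inside h.update and triggers the block.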
Perform the update operation.
h.update(a=>g)
#=> {{:entity_type=>"Mac", :cascade_id=>2, :location_id=>1}=>
# {:entity_type=>"Mac", :entity_ids=>[3], :cascade_id=>2, :location_id=>1}}
This is the new value of h. As h (which was empty) did not have the key
{:entity_type=>"Mac", :cascade_id=>2, :location_id=>1}
the block was not used to determine the value of this key in the merged hash.
Now generate the next value of the enumerator e, pass it to the block, assign values to the block variables and perform the block calculation.
g, h = e.next
#=> [{:entity_type=>"Mac", :entity_ids=>[2], :cascade_id=>2, :location_id=>1},
# {{:entity_type=>"Mac", :cascade_id=>2, :location_id=>1}=>
# {:entity_type=>"Mac", :entity_ids=>[3], :cascade_id=>2, :location_id=>1}}]
g #=> {:entity_type=>"Mac", :entity_ids=>[2], :cascade_id=>2, :location_id=>1}
h #=> {{:entity_type=>"Mac", :cascade_id=>2, :location_id=>1}=>
# {:entity_type=>"Mac", :entity_ids=>[3], :cascade_id=>2, :location_id=>1}}
a = g.reject { |k,_| k==:entity_ids }
#=> {:entity_type=>"Mac", :cascade_id=>2, :location_id=>1}
h.update(a=>g) do |_,o,n|
puts "_=#{_}, o=#{o}, n=#{n}"
o.merge(entity_ids: o[:entity_ids] + n[:entity_ids])
end
#=> {{:entity_type=>"Mac", :cascade_id=>2, :location_id=>1}=>
# {:entity_type=>"Mac", :entity_ids=>[3, 2], :cascade_id=>2, :location_id=>1}}
This is the new value of h. As both h and the hash being merged, { a=>g }, have the key a, the block is consulted to obtain the value of that key in the merged hash (new h). The values of the block variables are printed.
_={:entity_type=>"Mac", :cascade_id=>2, :location_id=>1},
o={:entity_type=>"Mac", :entity_ids=>[3], :cascade_id=>2, :location_id=>1},
n={:entity_type=>"Mac", :entity_ids=>[2], :cascade_id=>2, :location_id=>1}
h[a][:entity_ids]
is therefore replaced with
o[:entity_ids] + n[:entity_ids]
#=> [3] + [2] => [3, 2]
The calculations for the two remaining elements of e are similar, at which time
h #=> {{ :entity_type=>"Mac", :cascade_id=>2, :location_id=>1 }=>
# { :entity_type=>"Mac", :entity_ids=>[3, 2], :cascade_id=>2, :location_id=>1 },
# { :entity_type=>"Mac", :cascade_id=>4, :location_id=>1 }=>
# { :entity_type=>"Mac", :entity_ids=>[9, 10], :cascade_id=>4, :location_id=>1 }}
The final step is to return the values of this hash.
h.values
#=> <as shown above>
Note that some of the block variables are underscores (_). Though they are valid local variables, they are commonly used to indicate that they are not used in the block calculation. An alternative convention is to have the unused block variable begin with an underscore, such as _key.
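To illustrate both conventions side by side, here is a short sketch using a made-up hash named totals:

```ruby
totals = { apples: 3, pears: 5 }

# A bare underscore marks the key as unused.
sum_a = totals.reduce(0) { |acc, (_, count)| acc + count }

# A named underscore variable still signals "unused" but documents what it is.
sum_b = totals.reduce(0) { |acc, (_name, count)| acc + count }

p sum_a, sum_b
```

Both blocks compute the same total; the second form simply tells the reader what is being ignored.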
Upvotes: 2
Reputation: 5213
This will work:
def combine(collection)
return [] if collection.empty?
grouping_key = collection.first.keys - [:entity_ids]
grouped_collection = collection.group_by do |element|
grouping_key.map { |key| [key, element[key]] }.to_h
end
grouped_collection.map do |key, elements|
key.merge(entity_ids: elements.map { |e| e[:entity_ids] }.flatten.uniq)
end
end
Here's what's going on:
First we determine a "grouping key" for the collection by sampling the keys of the first element and removing :entity_ids. All other keys combined make up the grouping key on which the combination depends.
The Enumerable#group_by method iterates over a collection and groups it by the grouping key we just constructed.
We then iterate over the grouped collection and merge in a new entity_ids attribute made up of the combined entity ids from each group.
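A usage sketch, repeating combine from above so the snippet runs on its own. It uses arr2 from the question plus a small made-up array, dups, to show the effect of uniq. Two behaviours are worth keeping in mind: the grouping key is sampled from the first element only, so every hash is assumed to have the same keys, and entity_ids ends up as the last key of each result.

```ruby
def combine(collection)
  return [] if collection.empty?
  grouping_key = collection.first.keys - [:entity_ids]
  grouped_collection = collection.group_by do |element|
    grouping_key.map { |key| [key, element[key]] }.to_h
  end
  grouped_collection.map do |key, elements|
    key.merge(entity_ids: elements.map { |e| e[:entity_ids] }.flatten.uniq)
  end
end

arr2 = [
  { entity_type: "Sub", entity_ids: [7],  mac_id: 5, cascade_id: 1, location_id: 1 },
  { entity_type: "Sub", entity_ids: [10], mac_id: 5, cascade_id: 1, location_id: 1 },
  { entity_type: "Sub", entity_ids: [4],  mac_id: 2, cascade_id: 1, location_id: 1 },
  { entity_type: "Sub", entity_ids: [11], mac_id: 7, cascade_id: 2, location_id: 2 }
]
combined = combine(arr2)

# Hypothetical data with a repeated id: uniq keeps a single 3.
dups = [
  { entity_type: "Mac", entity_ids: [3],    cascade_id: 2 },
  { entity_type: "Mac", entity_ids: [3, 5], cascade_id: 2 }
]
deduped = combine(dups)

p combined.map { |e| e[:entity_ids] }, deduped.first[:entity_ids]
```

If repeated ids should be preserved, dropping the uniq call is the only change needed.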
Upvotes: 2