mnort9

Reputation: 1820

Merge hashes with multiple keys/values

I have an array of hashes, and I merge the hashes that share the same value for a specific key.

combined_keywords = new_array_of_hashes.each_with_object(Hash.new(0)){|oh, newh|
        newh[oh[:keyword]] += oh[:total_value].to_f
    }

This creates a hash that looks like this:

 { :ACTUAL_KEYWORD => ACTUAL_TOTAL_VALUE }

I'm new to Ruby and I don't quite understand the magic behind this. I have an additional key and value to consolidate, and now I'm lost. The root of the issue is that I don't understand how the consolidation occurs in this line: newh[oh[:keyword]] += oh[:total_value].to_f

I tried this with no luck:

combined_keywords = new_array_of_hashes.each_with_object(Hash.new(0)){|oh, newh|
            newh[oh[:keyword]] += oh[:total_value].to_f
            newh[oh[:keyword]] += oh[:revenue_per_transaction].to_f
        }

I really just need an array of consolidated hashes, each similar to:

{ :keyword => "ACTUAL_KEYWORD", :total_value => ACTUAL_TOTAL_VALUE, :revenue_per_transaction => ACTUAL_REVENUE }

Edit:

Input

new_array_of_hashes = [
  { keyword: 'foo', total_value: 1, revenue_per_transaction: 5 },
  { keyword: 'bar', total_value: 2, revenue_per_transaction: 4 },
  { keyword: 'bar', total_value: 4, revenue_per_transaction: 4 },
  { keyword: 'foo', total_value: 3, revenue_per_transaction: 5 },
]

Desired Output

combined_keywords = [
  { keyword: 'foo', total_value: 4, revenue_per_transaction: 10 },
  { keyword: 'bar', total_value: 6, revenue_per_transaction: 8 },
]

Upvotes: 1

Views: 1392

Answers (2)

Aaron K

Reputation: 6961

Let's say you have:

new_array_of_hashes = [
  { keyword: 'foo', total_value: 1 },
  { keyword: 'bar', total_value: 2 },
  { keyword: 'bar', total_value: 4 },
  { keyword: 'foo', total_value: 3 },
]

Now we'll step through your code:

combined_keywords = new_array_of_hashes.each_with_object(Hash.new(0)){|oh, newh|
  newh[oh[:keyword]] += oh[:total_value].to_f
}

This will loop over each hash in the array. We also set up a new hash that returns 0 when we access a key that doesn't exist:

# Pass 1
oh = { keyword: 'foo', total_value: 1 }
newh = {}
newh[ oh[:keyword] ]  #=> newh['foo'] This key doesn't exist and returns 0
oh[:total_value].to_f #=> 1.to_f => 1.0
newh[oh[:keyword]] += oh[:total_value].to_f
#=> newh['foo'] = newh['foo'] + oh[:total_value].to_f
#=> newh['foo'] = 0 + 1.0

# Pass 2
oh = { keyword: 'bar', total_value: 2 }
newh = { 'foo' => 1.0 }
newh[ oh[:keyword] ]  #=> newh['bar'] This key doesn't exist and returns 0
oh[:total_value].to_f #=> 2.to_f => 2.0
newh[oh[:keyword]] += oh[:total_value].to_f
#=> newh['bar'] = newh['bar'] + oh[:total_value].to_f
#=> newh['bar'] = 0 + 2.0

Now since we have keys for the next two iterations we access things as normal:

# Pass 3
oh = { keyword: 'bar', total_value: 4 }
newh = { 'foo' => 1.0, 'bar' => 2.0 }
newh[ oh[:keyword] ]  #=> newh['bar'] This key now exists and returns 2.0
oh[:total_value].to_f #=> 4.to_f => 4.0
newh[oh[:keyword]] += oh[:total_value].to_f
#=> newh['bar'] = newh['bar'] + oh[:total_value].to_f
#=> newh['bar'] = 2.0 + 4.0

# Pass 4
oh = { keyword: 'foo', total_value: 3 }
newh = { 'foo' => 1.0, 'bar' => 6.0 }
newh[ oh[:keyword] ]  #=> newh['foo'] This key now exists and returns 1.0
oh[:total_value].to_f #=> 3.to_f => 3.0
newh[oh[:keyword]] += oh[:total_value].to_f
#=> newh['foo'] = newh['foo'] + oh[:total_value].to_f
#=> newh['foo'] = 1.0 + 3.0

When the iteration finishes, each_with_object returns newh; the return value of the block itself is ignored. This is how each_with_object works.

As you can see, what is returned is a hash of the form:

{ 'foo' => 4.0, 'bar' => 6.0 }

So this is a single combined hash, not an array: each key is a stored :keyword value, and each value is the summed total for that keyword.
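To check the walk-through, here is a minimal runnable sketch using the sample data from the top of this answer. The explicit nil at the end of the block is only there to illustrate that each_with_object hands back the memo object regardless of what the block returns.

```ruby
# Sample data from the walk-through above.
new_array_of_hashes = [
  { keyword: 'foo', total_value: 1 },
  { keyword: 'bar', total_value: 2 },
  { keyword: 'bar', total_value: 4 },
  { keyword: 'foo', total_value: 3 },
]

combined_keywords = new_array_of_hashes.each_with_object(Hash.new(0)) do |oh, newh|
  newh[oh[:keyword]] += oh[:total_value].to_f
  nil # the block's own return value is ignored by each_with_object
end

p combined_keywords
```

The result is one hash keyed by keyword string, with Float sums as values.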

Based on your new hash form

{
  keyword: "ACTUAL_KEYWORD",
  total_value: ACTUAL_TOTAL_VALUE,
  revenue_per_transaction: ACTUAL_REVENUE
}

This format doesn't fit what the loop builds, since the accumulator hash holds only one key => value pair per keyword. You may need a hash of sub-hashes, or to run through the loop twice: once for :total_value and once for :revenue_per_transaction. It really depends on what you want your final object(s) to be.
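As a rough sketch of the "hash of sub-hashes" idea, assuming the input shape shown in the question: the name totals_by_keyword is made up for illustration, and the block form of Hash.new is used so that each new keyword gets its own sub-hash instead of sharing one default object.

```ruby
# Sample input copied from the question.
new_array_of_hashes = [
  { keyword: 'foo', total_value: 1, revenue_per_transaction: 5 },
  { keyword: 'bar', total_value: 2, revenue_per_transaction: 4 },
  { keyword: 'bar', total_value: 4, revenue_per_transaction: 4 },
  { keyword: 'foo', total_value: 3, revenue_per_transaction: 5 },
]

# totals_by_keyword is a hypothetical name. The block builds and stores
# a fresh zeroed sub-hash the first time each keyword is looked up.
totals_by_keyword = Hash.new do |hash, keyword|
  hash[keyword] = { total_value: 0, revenue_per_transaction: 0 }
end

new_array_of_hashes.each do |oh|
  sums = totals_by_keyword[oh[:keyword]]
  sums[:total_value] += oh[:total_value]
  sums[:revenue_per_transaction] += oh[:revenue_per_transaction]
end

# Flatten the nested hash back into an array of hashes.
combined_keywords = totals_by_keyword.map do |keyword, sums|
  { keyword: keyword }.merge(sums)
end

p combined_keywords
```

The intermediate hash keeps lookups by keyword cheap, and the final map produces the array-of-hashes shape the question asks for.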

Edit:

Based on your new expected input and output, you could use:

sum_keys = [:total_value, :revenue_per_transaction]
new_array_of_hashes.group_by{ |h| h[:keyword] }
                   .map{ |keyword, related|
                     tmp = {keyword: keyword}
                     tmp.merge! Hash[sum_keys.zip Array.new(sum_keys.size, 0)]
                     related.reduce(tmp){ |summed, h|
                       sum_keys.each{ |key| summed[key] += h[key] }
                       summed
                     }
                   }

#=> [
#  { keyword: 'foo', total_value: 4, revenue_per_transaction: 10 },
#  { keyword: 'bar', total_value: 6, revenue_per_transaction: 8 },
#]

It's a bit messy. I'd probably refactor what the map call is doing into its own helper method. I'm providing a start value to reduce because otherwise the first hash of each group would be used as the accumulator, which would mutate the original hash from new_array_of_hashes.
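One possible shape for that refactor, as a sketch only: sum_group is a hypothetical helper name, and it swaps reduce for each_with_object so the accumulator does not need to be returned from the block by hand.

```ruby
# Hypothetical helper: sums the given keys across one group of hashes.
def sum_group(keyword, related, sum_keys)
  # Start from a fresh hash of zeros so no input hash is mutated.
  totals = { keyword: keyword }
  sum_keys.each { |key| totals[key] = 0 }

  related.each_with_object(totals) do |h, summed|
    sum_keys.each { |key| summed[key] += h[key] }
  end
end

# Sample input copied from the question.
new_array_of_hashes = [
  { keyword: 'foo', total_value: 1, revenue_per_transaction: 5 },
  { keyword: 'bar', total_value: 2, revenue_per_transaction: 4 },
  { keyword: 'bar', total_value: 4, revenue_per_transaction: 4 },
  { keyword: 'foo', total_value: 3, revenue_per_transaction: 5 },
]

sum_keys = [:total_value, :revenue_per_transaction]
grouped = new_array_of_hashes.group_by { |h| h[:keyword] }
combined_keywords = grouped.map do |keyword, related|
  sum_group(keyword, related, sum_keys)
end

p combined_keywords
```

Because the helper builds its own accumulator, the hashes in new_array_of_hashes are left untouched.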

Upvotes: 3

Tom L

Reputation: 3409

Given

foos = [ { :key => 'Foo', :value => 1, :revenue => 2 },
         { :key => 'Foo', :value => 4, :revenue => 8 } ]

You can do this

foos.each_with_object(Hash.new(0)) do |foo_hash, new_hash|
  new_hash[:keyword] = foo_hash[:key]
  new_hash[:total_value] += foo_hash[:value]
  new_hash[:total_revenue] += foo_hash[:revenue]
end

each_with_object lets you pass an extra object into the block on every iteration of an enumerable. In this case you are passing Hash.new(0). The 0 argument sets the hash's default value, so you don't have to explicitly initialize each key to zero in the loop and can get right to incrementing. The += is just shorthand: a += b is equivalent to a = a + b.
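A small sketch of the default-value behaviour on its own may help; the variable names counts, plain, and error_class are made up for illustration.

```ruby
counts = Hash.new(0)

default_read = counts[:missing]       # reading an unknown key gives the default
stored = counts.key?(:missing)        # the read alone does not store the key

counts[:a] += 1                       # same as counts[:a] = counts[:a] + 1
counts[:a] += 2

# With a plain {} the default is nil, so += fails on the first access.
plain = {}
error_class = begin
  plain[:a] += 1
  nil
rescue NoMethodError => e
  e.class
end

p default_read, stored, counts, error_class
```

This is why the loop can increment right away without checking whether the key already exists.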

The clunky thing about the loop is that it sets the new_hash[:keyword] value on every pass. You could tack on an if new_hash[:keyword] == 0 (because it defaults to zero), but that's just a band-aid. The problem lies in the original hashes' structure. If :key always equals 'Foo', then 'Foo' is superfluous. If it's not always 'Foo', then this loop isn't very useful.

The loop above yields

{ :keyword => 'Foo', :total_value => 5, :total_revenue => 10 }

Upvotes: 0
