Ruby group array of arrays to merge them into one single array based on unique key in arrays

Question

I tried the approaches suggested in other Stack Overflow questions but couldn't get the result I was looking for.

I have an array of arrays, as below:

Input:

[["ABC", "5A2", nil, "88474"],
 ["ABC", nil, "2", "88474"],
 ["ABC", nil, nil, "88474"],
 ["ABC", nil, nil, "88474"],
 ["Jack", "5A2", nil, "05195"],
 ["Jack", nil, "2", "05195"],
 ["Jack", nil, nil, "05195"],
 ["Jack", nil, nil, "05195"]]

The element at index 0 ("ABC" or "Jack") is the group_by key. I want the output below, where all the "ABC" arrays are merged into one: at each index position, a nil is replaced by the value if any array in the group holds a value there:

Output:

[["ABC", "5A2", "2", "88474"],
["Jack", "5A2", "2", "05195"]]

The input won't always follow this format, where the first array holds the second value and the second array holds the third. The positions can vary, but for a given index 0 value, the second value won't be set in multiple arrays with different values. The same applies to the third value, and to any fourth or fifth elements I might add.

I have worked with arrays of hashes but not arrays of arrays, so I'm not sure how to do this.

Cary Swoveland · Accepted Answer

I'm not certain that I understand the question, but I expect you may be looking for the following.

arr = [
  ["ABC", "5A2", nil, "88474"],
  ["ABC", nil, "2", "88474"],
  ["ABC", nil, nil, "88474"],
  ["ABC", nil, nil, "88474"],
  ["Jack", "5A2", nil, "05195"],
  ["Jack", nil, "2", "05195"],
  ["Jack", nil, nil, "05195"],
  ["Jack", nil, nil, "05195"]
]

arr.each_with_object({}) do |a, h|
  h.update(a.first=>a) { |_k, oa, na| oa.zip(na).map { |ov, nv| ov.nil? ? nv : ov } }
end.values 
  #=> [["ABC", "5A2", "2", "88474"], ["Jack", "5A2", "2", "05195"]]

This uses the form of Hash#update (a.k.a. merge!) that employs the block

{ |_k, oa, na| oa.zip(na).map { |ov, nv| ov.nil? ? nv : ov } }

to determine the values of keys that are present in both the hash being built (h) and the hash being merged ({ a.first=>a }). See the doc for a description of the three block variables, _k, oa and na.¹
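As a small illustration of that block form, here is a sketch on a toy hash. The keys and numbers are made up for the example and are not part of the question's data; the point is only that the block runs for keys present in both hashes, while other keys are copied over untouched.

```ruby
# Hypothetical data, for illustration only.
totals = { "a" => 1, "b" => 2 }

# "b" exists in both hashes, so the block decides its value (2 + 10).
# "c" exists only in the argument, so it is added without calling the block.
totals.update("b" => 10, "c" => 20) do |_key, old_value, new_value|
  old_value + new_value
end

puts totals == { "a" => 1, "b" => 12, "c" => 20 }
```

The same rule explains why, in the code above, the block is skipped the first time each name is seen and only runs for later rows with the same first element.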

I can best explain how the calculations proceed by salting the code with puts statements and running it with an abbreviated array arr.

arr = [
  ["ABC", "5A2", nil, "88474"],
  ["ABC", nil, "2", "88474"],
  ["Jack", "5A2", nil, "05195"],
  ["Jack", nil, "2", "05195"],
]

arr.each_with_object({}) do |a, h|
  puts "\na = #{a}"
  puts "h = #{h}"
  puts "a.first=>a = #{a.first}=>#{a}"
  h.update(a.first=>a) do |_k, oa, na|
    puts "_k = #{_k}"
    puts "oa = #{oa}"
    puts "na = #{na}"
    pairs = oa.zip(na)
    puts "oa.zip(na) = #{pairs}"
    pairs.map do |ov, nv|
      puts "  ov = #{ov}, nv = #{nv}"
      puts "  ov.nil? ? nv : ov = #{ov.nil? ? nv : ov}"
      ov.nil? ? nv : ov
    end
  end
end.tap { |h| puts "h = #{h}" }.values 
  #=> [["ABC", "5A2", "2", "88474"], ["Jack", "5A2", "2", "05195"]]

The following is displayed.

a = ["ABC", "5A2", nil, "88474"]
h = {}
a.first=>a = ABC=>["ABC", "5A2", nil, "88474"]
(The block is not called here because h does not have a key "ABC")

a = ["ABC", nil, "2", "88474"]
h = {"ABC"=>["ABC", "5A2", nil, "88474"]}
a.first=>a = ABC=>["ABC", nil, "2", "88474"]
_k = ABC
oa = ["ABC", "5A2", nil, "88474"]
na = ["ABC", nil, "2", "88474"]
oa.zip(na) = [["ABC", "ABC"], ["5A2", nil], [nil, "2"], ["88474", "88474"]]
  ov = ABC, nv = ABC
  ov.nil? ? nv : ov = ABC
  ov = 5A2, nv = 
  ov.nil? ? nv : ov = 5A2
  ov = , nv = 2
  ov.nil? ? nv : ov = 2
  ov = 88474, nv = 88474
  ov.nil? ? nv : ov = 88474

a = ["Jack", "5A2", nil, "05195"]
h = {"ABC"=>["ABC", "5A2", "2", "88474"]}
a.first=>a = Jack=>["Jack", "5A2", nil, "05195"]
(The block is not called here because h does not have a key "Jack")

a = ["Jack", nil, "2", "05195"]
h = {"ABC"=>["ABC", "5A2", "2", "88474"], "Jack"=>["Jack", "5A2", nil, "05195"]}
a.first=>a = Jack=>["Jack", nil, "2", "05195"]
_k = Jack
oa = ["Jack", "5A2", nil, "05195"]
na = ["Jack", nil, "2", "05195"]
oa.zip(na) = [["Jack", "Jack"], ["5A2", nil], [nil, "2"], ["05195", "05195"]]
  ov = Jack, nv = Jack
  ov.nil? ? nv : ov = Jack
  ov = 5A2, nv = 
  ov.nil? ? nv : ov = 5A2
  ov = , nv = 2
  ov.nil? ? nv : ov = 2
  ov = 05195, nv = 05195
  ov.nil? ? nv : ov = 05195

h = {"ABC"=>["ABC", "5A2", "2", "88474"], "Jack"=>["Jack", "5A2", "2", "05195"]}

¹ As is common practice, I began the name of the common key, _k, with an underscore to signal to the reader that it is not used in the block calculation. Often you will see such a block variable represented by an underscore alone.
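Since the question mentions group_by, here is an alternative sketch along those lines. It is not the approach used in the answer above, and the method name merge_rows_by_first is hypothetical. It assumes every row has the same length (Array#transpose raises an IndexError otherwise) and, as stated in the question, that a given position holds at most one distinct non-nil value per group.

```ruby
# Hypothetical helper: merge rows that share the same first element.
def merge_rows_by_first(rows)
  rows.group_by(&:first).values.map do |group|
    # transpose turns the group's rows into columns; for each column,
    # keep the first non-nil value (nil if the whole column is nil).
    group.transpose.map { |column| column.compact.first }
  end
end

arr = [
  ["ABC", "5A2", nil, "88474"],
  ["ABC", nil, "2", "88474"],
  ["ABC", nil, nil, "88474"],
  ["Jack", "5A2", nil, "05195"],
  ["Jack", nil, "2", "05195"],
  ["Jack", nil, nil, "05195"]
]

merged = merge_rows_by_first(arr)
p merged
```

If two rows in a group did hold different non-nil values at the same position, this sketch would silently keep the first one, which matches the behaviour of the ov.nil? ? nv : ov test in the answer.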
