Reputation: 599
I am working on a program that will compare two .csv files. After extracting the relevant data from one of the csv files into an array of arrays, I need to combine related entries. For example, I would want to turn this array:
[["11/13/15", ["4001", "1392"], "INBOUND"],
["11/13/15", ["4090", "540"], "INBOUND"],
["11/13/15", ["1139", "162"], "INBOUND"],
["11/13/15", ["1158", "64"], "INBOUND"],
["11/13/15", ["4055", "352"], "OUTBOUND"],
["11/13/15", ["4055", "448"], "OUTBOUND"],
["11/13/15", ["4055", "352"], "OUTBOUND"],
["11/13/15", ["1139", "162"], "OUTBOUND"],
["11/13/15", ["1158", "64"], "OUTBOUND"],
["11/13/15", ["4091", "520"], "OUTBOUND"]]
into this:
[["11/13/15", ["4001", "1392"], "INBOUND"],
["11/13/15", ["4090", "540"], "INBOUND"],
["11/13/15", ["1139", "162"], "INBOUND"],
["11/13/15", ["1158", "64"], "INBOUND"],
["11/13/15", ["4055", "1152"], "OUTBOUND"],
["11/13/15", ["1139", "162"], "OUTBOUND"],
["11/13/15", ["1158", "64"], "OUTBOUND"],
["11/13/15", ["4091", "520"], "OUTBOUND"]]
For some element of the array, if its items at [0], [1][0], and [2] match those of another one, then create a new item (array) with its item at [1][1] being the sum of all the items at [1][1] and delete the old arrays. If it would be easier, I can change the way the relevant data is extracted so that the item at [1] is not an array and each row has 4 items instead of 3.
Upvotes: 0
Views: 143
Reputation: 2820
Just as an example, here is my one-liner (works with both Ruby 1.8 and 1.9):
table = [["11/13/15", ["4001", "1392"], "INBOUND"],
["11/13/15", ["4090", "540"], "INBOUND"],
["11/13/15", ["1139", "162"], "INBOUND"],
["11/13/15", ["1158", "64"], "INBOUND"],
["11/13/15", ["4055", "352"], "OUTBOUND"],
["11/13/15", ["4055", "448"], "OUTBOUND"],
["11/13/15", ["4055", "352"], "OUTBOUND"],
["11/13/15", ["1139", "162"], "OUTBOUND"],
["11/13/15", ["1158", "64"], "OUTBOUND"],
["11/13/15", ["4091", "520"], "OUTBOUND"]]
result = table.group_by {|a, (b, c), d| [a, [b], d]}.map {|k, v| k[1] << v.map {|a| a[1][1].to_i}.inject(:+).to_s; k}
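The same idea can be spelled out over several lines, which may be easier to follow. This is only a sketch on a shortened copy of the table; the names `groups`, `rows` and `total` are mine and are not part of the one-liner above.

```ruby
# Shortened sample in the question's format: [date, [id, count], direction].
table = [
  ["11/13/15", ["4055", "352"], "OUTBOUND"],
  ["11/13/15", ["4055", "448"], "OUTBOUND"],
  ["11/13/15", ["4055", "352"], "OUTBOUND"],
  ["11/13/15", ["1139", "162"], "OUTBOUND"]
]

# Group rows that share the same date, id and direction.
groups = table.group_by { |date, (id, _count), direction| [date, id, direction] }

# Build one new row per group, summing the counts as integers
# and converting the total back to a string.
result = groups.map do |(date, id, direction), rows|
  total = rows.map { |row| row[1][1].to_i }.inject(:+)
  [date, [id, total.to_s], direction]
end

p result
```

Because each output row is built from scratch, the original `table` is left untouched.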
Upvotes: 2
Reputation: 168101
h = Hash.new(0)
[["11/13/15", ["4001", "1392"], "INBOUND"],
["11/13/15", ["4090", "540"], "INBOUND"],
["11/13/15", ["1139", "162"], "INBOUND"],
["11/13/15", ["1158", "64"], "INBOUND"],
["11/13/15", ["4055", "352"], "OUTBOUND"],
["11/13/15", ["4055", "448"], "OUTBOUND"],
["11/13/15", ["4055", "352"], "OUTBOUND"],
["11/13/15", ["1139", "162"], "OUTBOUND"],
["11/13/15", ["1158", "64"], "OUTBOUND"],
["11/13/15", ["4091", "520"], "OUTBOUND"]]
.each{|a, (b, c), d| h[[a, b, d]] += c.to_i}
p h.map{|(a, b, d), c| [a, [b, c], d]}
will give (the sums are integers here, not strings):
[["11/13/15", ["4001", 1392], "INBOUND"],
["11/13/15", ["4090", 540], "INBOUND"],
["11/13/15", ["1139", 162], "INBOUND"],
["11/13/15", ["1158", 64], "INBOUND"],
["11/13/15", ["4055", 1152], "OUTBOUND"],
["11/13/15", ["1139", 162], "OUTBOUND"],
["11/13/15", ["1158", 64], "OUTBOUND"],
["11/13/15", ["4091", 520], "OUTBOUND"]]
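To get the counts back as strings, as in the question's expected output, one option is to call `to_s` on each total when rebuilding the rows. A minimal sketch on a shortened sample, with `rows`, `totals` and `string_rows` as illustrative names of my own:

```ruby
# Shortened sample in the question's format: [date, [id, count], direction].
rows = [
  ["11/13/15", ["4055", "352"], "OUTBOUND"],
  ["11/13/15", ["4055", "448"], "OUTBOUND"],
  ["11/13/15", ["1158", "64"], "OUTBOUND"]
]

# Accumulate integer totals per [date, id, direction] key.
totals = Hash.new(0)
rows.each { |date, (id, count), direction| totals[[date, id, direction]] += count.to_i }

# Rebuild the rows, converting each total back to a string.
string_rows = totals.map { |(date, id, direction), total| [date, [id, total.to_s], direction] }

p string_rows
```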
Upvotes: 0
Reputation: 1094
This should do it:
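# Return the index of the row in list with the same date, id and direction, or nil.
def lookup(list, date, id, direction)
  list.index { |e| e[0] == date && e[1][0] == id && e[2] == direction }
end

# a is the input array from the question.
b = []
a.each do |e|
  date, (id, count), direction = e
  i = lookup(b, date, id, direction)
  if i.nil?
    # Store a copy so that the rows of a are not modified later.
    b << [date, [id, count], direction]
  else
    # Add this row's count to the one already stored.
    b[i][1][1] = (b[i][1][1].to_i + count.to_i).to_s
  end
end
b.each { |e| p e }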
Output:
["11/13/15", ["4001", "1392"], "INBOUND"]
["11/13/15", ["4090", "540"], "INBOUND"]
["11/13/15", ["1139", "162"], "INBOUND"]
["11/13/15", ["1158", "64"], "INBOUND"]
["11/13/15", ["4055", "1152"], "OUTBOUND"]
["11/13/15", ["1139", "162"], "OUTBOUND"]
["11/13/15", ["1158", "64"], "OUTBOUND"]
["11/13/15", ["4091", "520"], "OUTBOUND"]
Upvotes: 0
Reputation: 67860
I assume that the elements to group are consecutive, so we can use Enumerable#chunk. Functional approach:
grouped_xs = xs.chunk { |date, (id1, id2), direction| [date, id1, direction] }
grouped_xs.map do |(date, id1, direction), ary|
  id2_sum = ary.map { |_, (_, id2), _| id2.to_i }.inject(:+)
  [date, id1, id2_sum.to_s, direction]
end
Output (you wanted 4 elements in the output array, right?):
[["11/13/15", "4001", "1392", "INBOUND"],
["11/13/15", "4090", "540", "INBOUND"],
["11/13/15", "1139", "162", "INBOUND"],
["11/13/15", "1158", "64", "INBOUND"],
["11/13/15", "4055", "1152", "OUTBOUND"],
["11/13/15", "1139", "162", "OUTBOUND"],
["11/13/15", "1158", "64", "OUTBOUND"],
["11/13/15", "4091", "520", "OUTBOUND"]]
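If the rows to merge are not guaranteed to be consecutive, `chunk` will split them into separate groups. A rough sketch of the difference, using `group_by` as one possible alternative; the helper `group_key` and the three-row sample are made up for illustration:

```ruby
# Sample where the two "4055" rows are NOT next to each other.
xs = [
  ["11/13/15", ["4055", "352"], "OUTBOUND"],
  ["11/13/15", ["1139", "162"], "OUTBOUND"],
  ["11/13/15", ["4055", "448"], "OUTBOUND"]
]

# Hypothetical helper: the fields that must match for rows to be merged.
def group_key(row)
  [row[0], row[1][0], row[2]]
end

# chunk only groups adjacent rows, so "4055" ends up in two groups.
chunked = xs.chunk { |row| group_key(row) }.to_a

# group_by collects matching rows regardless of their position.
grouped = xs.group_by { |row| group_key(row) }

merged = grouped.map do |(date, id, direction), rows|
  [date, id, rows.map { |row| row[1][1].to_i }.inject(:+).to_s, direction]
end

puts chunked.length
puts grouped.length
p merged
```

With this sample, `chunk` yields three groups while `group_by` yields two, so only the `group_by` version merges both "4055" rows.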
Upvotes: 2