Reputation: 5721
Right now I have this:
array1 = [obj1, obj2, obj3, obj4]
array2 = [obj1, obj2, obj5, obj6]
array1.each do |item1|
array2.each do |item2|
if (item1[0] == item2[0]) && (item1[1] == item2[1])
p "do stuff"
end
end
end
I need to match 2 pieces of data from each array but both arrays are very large and I'm wondering if there is a faster way to do this.
My current setup requires looking at each element in the second array for each element in the first array which seems terribly inefficient.
Upvotes: 3
Views: 161
Reputation: 6076
Combining map and intersection:
(array1.map { |a| a.first 2 } & array2.map { |a| a.first 2 }).each do
p "do_stuff"
end
Performance should be good. Memory intensive though.
Upvotes: 2
Reputation: 1705
This is very slow because - as DigitalRoss already stated - it is O(n^2). Assuming that eql? is just as fine for you as ==, you can build an index and iterate over that instead, that'll be O(n+m) instead:
array1 = [obj1, obj2, obj3, obj4]
array2 = [obj1, obj2, obj5, obj6]
index = {}
found = []
array1.each do |item1| index[item1.first(2)] = item1 end
array2.each do |item2|
item1 = index[item2.first(2)]
found << [item1,item2] if item1 then
end
found.each do |item1, item2| puts "do something" end
This assumes that the combination of the first 2 elements of all elements in array1 are unique within array1. If that's not the case, the code will be slightly more complex:
array1 = [obj1, obj2, obj3, obj4]
array2 = [obj1, obj2, obj5, obj6]
index = {}
found = []
array1.each do |item1|
key = item1.first(2)
index[key] ||= []
index[key] << item1
end
array2.each do |item2|
items_from_1 = index[item2.first(2)]
if items_from_1 then
found.concat(items_from_1.map { |item1| [item1,item2] })
end
end
found.each do |item1, item2| puts "do something" end
Since you didn't provide any sample data, I didn't test the code.
I hope that helps.
Upvotes: 1
Reputation: 44080
If the two arrays is all you have, you can't avoid the O(n^2) complexity that DigitalRoss mentioned. However, you can index the data so that the next time you don't have to do it all again. In the simplest case, you can use hashes to allow direct access to your data based on the criteria used in the test:
index1 = array1.each_with_object({}){|e, acc|
acc[[e[0], e[1]]] ||= []
acc[[e[0], e[1]]] << e
}
and the same thing for the other array. Then, your loop would look like:
index1.each do |key1, vals1|
if vals2 = index2[key1]
vals1.product(vals2).each do |e1, e2|
p do_stuff
end
end
end
which is, I believe, O(n).
Upvotes: 1