Reputation: 795
I have two arrays that have thousands of elements. I need to find what elements are missing in one array by comparing it to another. Is there a way to get the missing elements without iterating through the entire array? Or is there something faster than what I am doing?
Here is what I am using now:
def find_missing(array1, array2)
missing_elements = []
array1.each { |e|
unless array2.include? e
missing_elements << e
end
}
return missing_elements
end
array1 = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
array2 = [1, 2, 4, 5, 6, 7, 9]
puts find_missing(array1, array2)
Upvotes: 2
Views: 2990
Reputation: 639
To add to the original accepted answer, it is worth noting that the below solution would ensure finding the difference between two arrays, irrelevant of the order of comparison.
array1 - array2 | array2 - array1
# => [10, 8, 3]
If for some reason you were to compare the two arrays the other way round, it would return empty array and give the illusion nothing is missing. This may be obvious for small arrays but not for much larger ones.
array2 - array1
# => []
However a caveat to this is if you have differences in both arrays, this would capture the collective difference between them.
array1 = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
array2 = [1, 2, 4, 5, 6, 7, 9, 11]
array1 - array2 | array2 - array1
# => [11, 10, 8, 3]
Upvotes: 0
Reputation: 110665
I doubt that you can beat array1 - array2
, Array#-
being coded in C, but you might want to benchmark it against:
require 'set'
def use_sets(array1, array2)
a1 = array1.uniq
s2 = array2.to_set
a1.reject { |e| s2.include?(e) }
end
added..
require 'fruity'
def compare_em(array1, array2)
compare do
_minus { array1 - array2 }
_sets { use_sets(array1, array2) }
end
end
n = 100_000
array1 = (1..n).to_a.shuffle
array2 = (1..n-2).to_a.shuffle
compare_em(array1, array2)
#_minus is faster than _sets by 2x ± 1.0
n = 1_000_000
array1 = (1..n).to_a.shuffle
array2 = (1..n-2).to_a.shuffle
compare_em(array1, array2)
#_minus is faster than _sets by 2x ± 1.0
n = 100_000
array1 = (([1]*n).concat [1,2]).shuffle
array2 = [1]*n
compare_em(array1, array2)
#_minus is faster than _sets by 5x ± 1.0
Upvotes: 1
Reputation: 122373
You want a copy of the first array, but removing any elements that appears in the second array? That's what Array#-
(array difference) does:
array1 - array2
# => [10, 8, 3]
Upvotes: 8
Reputation: 5107
You can just use the difference operator
@irb(main):001:0> array1 = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
=> [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
@irb(main):002:0> array2 = [1, 2, 4, 5, 6, 7, 9]
=> [1, 2, 4, 5, 6, 7, 9]
@irb(main):003:0> array1 - array2
=> [10, 8, 3]
Upvotes: 2