Reputation: 23
I am quite new and got an issue with an exercise. I solved it, works, but it's too slow when it comes to check more numbers-maximum length aprox. = 1.000.000 . How could it been written for a quicker solving?
def find_dups_miss(arr)
((arr.sort.first..arr.sort.last).to_a - arr.sort) + [arr.select{|item| arr.count(item) > 1}.uniq.sort]
end
testing:
arr1 = [10,9,8,9,6,1,2,4,3,2,5,5,3]
Test.assert_equals(find_dups_miss(arr1),[7, [2, 3, 5, 9]])
It needs to find the missing number and the duplicates.
the error message: Why did my code time out? Our servers are configured to only allow a certain amount of time for your code to execute. In rare cases the server may be taking on too much work and simply wasn't able to run your code efficiently enough. Most of the time though this issue is caused by inefficient algorithms. If you see this error multiple times you should try to optimize your code further.
Upvotes: 1
Views: 86
Reputation: 110725
We are given an array of integers, arr
with the property that it contains every integer between min_val
and max_val
but one, where min_val, max_val = arr.minmax
. We wish to determine the missing integer and also the duplicate values in arr
.
require 'set'
def missing_and_dups(arr)
smallest, largest = arr.minmax
dups = Set.new
all = arr.each_with_object(Set.new) { |n,all| dups << n if all.add?(n).nil? }
[(smallest+largest)*(largest-smallest+1)/2 - all.sum, dups.to_a]
end
missing_and_dups [10,9,8,9,6,1,2,4,3,2,5,5,3]
#=> [7, [9, 2, 5, 3]]
Note that Set#add? returns nil
if the element being added is already in the set. Rather than finding the missing element n
with
((smallest..largest).to_a - arr).first
I've made use of the fact that
all.sum + n = (smallest+largest)*(smallest+largest-1)/2
Upvotes: 3
Reputation: 29478
This is about as fast as I can get this problem for now
def find_dups_miss(arr)
groups = arr.group_by(&:itself)
arr.minmax.reduce(:upto).to_a - groups.keys << groups.select {|_,v| v.size > 1}.keys.sort
end
Explanation based on posted Array
First we group the Array
elements by themselves
{10=>[10], 9=>[9, 9], 8=>[8], 6=>[6], 1=>[1], 2=>[2, 2], 4=>[4], 3=>[3, 3], 5=>[5, 5]}
Then we create an Enumerator
from the minimum and maximum (arr.minmax.reduce(:upto)
) values from the Array
, covert it to an Array
(to_a
) and subtract all the keys
from the previous group grouping:
arr.minmax.reduce(:upto).to_a - groups.keys
#=> [7]
Then we collect of all the numbers that occurred more than once in the original Array
: (I sorted them because the desired result was sorted)
groups.select {|_,v| v.size > 1}.keys.sort
#=> [2, 3, 5, 9]
and use Array#<<
to push this Array
back into the one we created in the previous step resulting in
#=> [7,[2, 3, 5, 9]]
If there is only ever 1 number missing then the following is slightly faster since it does not create an additional Array
and short circuits on the first missing number:
def find_dups_miss(arr)
groups = arr.group_by(&:itself)
[groups.select {|_,v| v.size > 1}.keys.sort].unshift(arr.minmax.reduce(:upto).find {|n| groups[n].nil?} )
end
Additionally for a very large Array
:
groups.collect {|k,v| k if v.size > 1 }.compact.sort
appears to be slightly more efficient than
groups.select {|_,v| v.size > 1}.keys.sort
Upvotes: 1
Reputation: 15967
I agree this is better put on code review, but to answer your question there are better data structures for solving this problem, consider using as hash:
def find_dups_missing(arr)
min, max = arr.min, arr.max
hash = {}
min.upto(max) { |i| hash[i] = :sentinel }
arr.each { |el| hash[el] == :sentinel ? hash[el] = 1 : hash[el] += 1 }
hash.select { |_, v| v == :sentinel }.keys << hash.select { |_, v| v != :sentinel && v > 1 }.keys
end
We loop through, build a hash where each key is a number from min
to max
and the value points to an object that is just a placeholder (I called it sentinel
).
Next we loop through our array, we ask if the hash value is still in its placeholder position, if it is, set the value to 1
but if it's not, just increment. This way, we keep track of when we see a value for the first time vs. subsequent times (ie dupes).
Then, after it's all said in done, we have a hash that looks like this:
{1=>1, 2=>2, 3=>2, 4=>1, 5=>2, 6=>1, 7=>:sentinel, 8=>1, 9=>2, 10=>1}
We know where the value is > 1
, we have duplicates and we also know where the value still says sentinel
we never saw it in our array, there's our gap(s).
All together, this method runs in O(n)
time (on average) with O(n)
space.
Upvotes: 0