Reputation: 23

make it simpler so it will be quicker

I am quite new and got an issue with an exercise. I solved it, works, but it's too slow when it comes to check more numbers-maximum length aprox. = 1.000.000 . How could it been written for a quicker solving?

def find_dups_miss(arr)
    ((arr.sort.first..arr.sort.last).to_a - arr.sort)  + [arr.select{|item| arr.count(item) > 1}.uniq.sort]
end

testing:

arr1 = [10,9,8,9,6,1,2,4,3,2,5,5,3]
    Test.assert_equals(find_dups_miss(arr1),[7, [2, 3, 5, 9]])

It needs to find the missing number and the duplicates.

the error message: Why did my code time out? Our servers are configured to only allow a certain amount of time for your code to execute. In rare cases the server may be taking on too much work and simply wasn't able to run your code efficiently enough. Most of the time though this issue is caused by inefficient algorithms. If you see this error multiple times you should try to optimize your code further.

Upvotes: 1

Answers (3)

Cary Swoveland

Reputation: 110725

We are given an array of integers, arr with the property that it contains every integer between min_val and max_val but one, where min_val, max_val = arr.minmax. We wish to determine the missing integer and also the duplicate values in arr.

require 'set'

def missing_and_dups(arr)
  smallest, largest = arr.minmax
  dups = Set.new
  all = arr.each_with_object(Set.new) { |n,all| dups << n if all.add?(n).nil? }
  [(smallest+largest)*(largest-smallest+1)/2 - all.sum, dups.to_a]
end

missing_and_dups [10,9,8,9,6,1,2,4,3,2,5,5,3]
  #=> [7, [9, 2, 5, 3]]

Note that Set#add? returns nil if the element being added is already in the set. Rather than finding the missing element n with

((smallest..largest).to_a - arr).first

I've made use of the fact that

all.sum + n = (smallest+largest)*(smallest+largest-1)/2

Upvotes: 3

engineersmnky

Reputation: 29478

This is about as fast as I can get this problem for now

def find_dups_miss(arr)
  groups = arr.group_by(&:itself) 
  arr.minmax.reduce(:upto).to_a - groups.keys << groups.select {|_,v| v.size > 1}.keys.sort
end

Explanation based on posted Array

First we group the Array elements by themselves

{10=>[10], 9=>[9, 9], 8=>[8], 6=>[6], 1=>[1], 2=>[2, 2], 4=>[4], 3=>[3, 3], 5=>[5, 5]}

Then we create an Enumerator from the minimum and maximum (arr.minmax.reduce(:upto)) values from the Array, covert it to an Array (to_a) and subtract all the keys from the previous group grouping:

arr.minmax.reduce(:upto).to_a - groups.keys
#=> [7]

Then we collect of all the numbers that occurred more than once in the original Array: (I sorted them because the desired result was sorted)

groups.select {|_,v| v.size > 1}.keys.sort
#=> [2, 3, 5, 9]

and use Array#<< to push this Array back into the one we created in the previous step resulting in

#=> [7,[2, 3, 5, 9]]

If there is only ever 1 number missing then the following is slightly faster since it does not create an additional Array and short circuits on the first missing number:

def find_dups_miss(arr)
  groups = arr.group_by(&:itself) 
  [groups.select {|_,v| v.size > 1}.keys.sort].unshift(arr.minmax.reduce(:upto).find {|n| groups[n].nil?} )
end

Additionally for a very large Array:

groups.collect {|k,v| k if v.size > 1 }.compact.sort

appears to be slightly more efficient than

groups.select {|_,v| v.size > 1}.keys.sort

Upvotes: 1

Anthony

Reputation: 15967

I agree this is better put on code review, but to answer your question there are better data structures for solving this problem, consider using as hash:

def find_dups_missing(arr)
  min, max = arr.min, arr.max
  hash = {}
  min.upto(max) { |i| hash[i] = :sentinel }
  arr.each { |el| hash[el] == :sentinel ? hash[el] = 1 : hash[el] += 1 }

  hash.select { |_, v| v == :sentinel }.keys << hash.select { |_, v| v != :sentinel && v > 1 }.keys
end

We loop through, build a hash where each key is a number from min to max and the value points to an object that is just a placeholder (I called it sentinel).

Next we loop through our array, we ask if the hash value is still in its placeholder position, if it is, set the value to 1 but if it's not, just increment. This way, we keep track of when we see a value for the first time vs. subsequent times (ie dupes).

Then, after it's all said in done, we have a hash that looks like this:

{1=>1, 2=>2, 3=>2, 4=>1, 5=>2, 6=>1, 7=>:sentinel, 8=>1, 9=>2, 10=>1}

We know where the value is > 1, we have duplicates and we also know where the value still says sentinel we never saw it in our array, there's our gap(s).

All together, this method runs in O(n) time (on average) with O(n) space.

Upvotes: 0

make it simpler so it will be quicker

Answers (3)

Related Questions