segue_segway
segue_segway

Reputation: 1518

Using heap in ruby algorithm

Working on the following algorithm:

Given a non-empty array of integers, return the k most frequent elements.

For example, Given [1,1,1,2,2,3] and k = 2, return [1,2].

Note: You may assume k is always valid, 1 ≤ k ≤ number of unique elements. Your algorithm's time complexity must be better than O(n log n), where n is the array's size.

My initial impulse is to use a hash table for number as the key and the value as the number of occurrences. Then, I can insert each key-value pair as a node into a maxHeap and simply delete max until k == 0.

Is building a node and inputting into a maxHeap the right way to solve with that approach? Note I'm not curious about a more optimal solution-- curious about whether that's the way I'd implement the idea of using maxHeap to repeatedly find the number with the maximum number of occurrences. The node part seems like overkill but I'm not sure how else to do it.

Upvotes: 5

Views: 5635

Answers (2)

OneNeptune
OneNeptune

Reputation: 913

Answer:

code:

input = [1,1,1,2,2,3]
k = 2

def most_frequent(arr, num = 1)
  arr.group_by(&:itself).sort_by {|_,s| -s.length}.first(num).map(&:first)
end

most_frequent(input, k)

output:

=> [1, 2]

MaxHeap Answer

require "rubygems"
require "algorithms"

include Containers

input = [1,1,1,2,2,3] 
k = 2

def heap_most_frequent(arr, num = 1)
  max_heap = MaxHeap.new(arr.group_by(&:itself).map{|n,ns| [ns.count,n]})
  (1..num).each_with_object([]) { |i, result| result << max_heap.pop[1]}
end

Benchmark:

            user     system      total        real
orig:   0.050000   0.000000   0.050000   (0.057158)
heap:   0.110000   0.000000   0.110000   (0.112387)

Summary

Most of the work goes in to making the hash, the Heap in this instance just complicates things when dealing with key, value pairs.

Upvotes: 5

tadman
tadman

Reputation: 211560

You can always do this with a few O(n) transforms with one hash-table intermediate:

def max_in_list(list, n = 1)
  list.group_by { |v| v }.sort_by do |_, s|
    -s.length
  end.first(n).map(&:first)
end

numbers = [ 1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 6 ]

max_in_list(numbers, 2)
# => [1, 2, 4]

Unfortunately max_by doesn't respect ordering when requesting more than one entry. It just gives the top N without any concern for rank.

Upvotes: 2

Related Questions