Reputation: 5
I'm currently using Ruby Version 1.8.7 and I've been searching around but unable to find a solution for this. I am currently working on creating a unique vendor identifier. But I'll simplify the issue here.
I have a 2D Array for vendors and products:
A = [["C"], ["A","D"], ["A","B"], ["B","C","E","F"], ["B","G","K"], [], ["H","I"], [], [], ["I"], ["J"]]
What I need to do here is to retrieve the top 5 users (Array index number) with the most number of unique product. In this case.
The top 5 vendors would be:
1 - A,D
3 - B,C,E,F
4 - B,G,K
6 - H,I
10 - J
Vendor 3 has Products ["B","C","E","F"]
but Vendor 4 has products ["B","G","K"].
Since vendor 4 and 3 both has ["B"]
Vendor 3 has 3 unique products ["C","E","F"]
Vendor 4 has 2 unique products ["G","K"]
What I need to return is an Array of Vendors (Based on their index in the 2D Array) of the Top 5 Vendors.
Here are my codes so far:
def test
vendors = [[C], [A,D], [A,B], [B,C,E,F], [B,G,K], [], [H,I], [], [], [I], [J]]
useridArr = Array(0..vendors.length-1)
vendors = inplace_quicksort(vendors, 0, vendors.length-1,useridArr)
getUnique(vendors,useridArr, vendors.length-1)
end
def partition_array(array, left, right, pivot_index, arr)
pivot_value = array[pivot_index].length
arr[pivot_index], arr[right] = arr[right], arr[pivot_index]
array[pivot_index], array[right] = array[right], array[pivot_index]
store_index = left
(left..right-1).each do |i|
if array[i].length < pivot_value
arr[i], arr[store_index] = arr[store_index], arr[i]
array[i], array[store_index] = array[store_index], array[i]
store_index = store_index + 1
end
end
arr[store_index], arr[right] = arr[right], arr[store_index]
array[store_index], array[right] = array[right], array[store_index]
return store_index
end
def inplace_quicksort(array, left, right, indexArr)
if left < right
pivot_index = (left + ((right - left) / 2)).to_i
new_pivot_index = partition_array(array, left, right, pivot_index,indexArr)
inplace_quicksort(array, left, new_pivot_index - 1,indexArr)
inplace_quicksort(array, new_pivot_index + 1, right,indexArr)
end
return array
end
def getUnique(vendors,useridArr, searchFor)
while searchFor != -1
p vendors.map {|a| a & vendors[searchFor] }
searchFor = searchFor - 1
end
end
Upvotes: 0
Views: 527
Reputation: 110675
Each of the 11
elements of A
corresponds to a vendor and there are (coincidentally) 11
products1:
A = [[:C], [:A, :D], [:A, :B], [:B, :C, :E, :F], [:B, :G, :K], [],
[:H, :I], [], [], [:I], [:J]]
products = A.flatten
#=> [:C, :A, :D, :A, :B, :B, :C, :E, :F, :B, :G, :K, :H, :I, :I, :J]
products.uniq
#=> [:C, :A, :D, :B, :E, :F, :G, :K, :H, :I, :J]
products.uniq.size
#=> 11
We begin by computing the number of instances of each product:
g = Hash.new(0)
counts = products.each_with_object(g) { |p,h| h[p] += 1 }
#=> {:C=>2, :A=>2, :D=>1, :B=>3, :E=>1, :F=>1, :G=>1, :K=>1,
# :H=>1, :I=>2, :J=>1}
g = Hash.new(0)
creates an empty hash with a default value of zero. That means that if g
does not have a key k
, g[k]
will return zero. Notice the expression h[p] += 1
. That is called an abbreviated assignment. It simply means the expression is expanded to:
h[p] = h[p] + 1
before it is evaluated. If h
does not have a key p
, h[p]
on the right side returns zero, so h[p]
is set equal to 0+1 #=> 1
.
Everything above would normally be written more compactly as follows:
counts = A.flatten.each_with_object(Hash.new(0)) { |p,h| h[p] += 1 }
The products offered by only one vendor are given by:
unique_products = counts.select { |_,count| count == 1 }.keys
#=> [:D, :E, :F, :G, :K, :H, :J]
The vendor at offset 3
in A
has two unique products, :E
and :F
:
[:B,:C,:E,:F] & unique_products
#=> [:E, :F]
That is:
([:B,:C,:E,:F] & unique_products).size
#=> 2
If we want the five vendors with the greatest numbers of unique products, ordered by decreasing numbers of unique products, we can do this:
A.sort_by { |a| -(a & unique_products).size }.first(5)
#=> [[:B, :G, :K], [:B, :C, :E, :F], [:H, :I], [:A, :D], [:J]]
In Ruby 2.2+ we can do this more directly, using Enumerable#max_by:
A.max_by(5) { |a| (a & unique_products).size }
#=> [[:B, :G, :K], [:B, :C, :E, :F], [:J], [:A, :D], [:H, :I]]
The ordering is slightly different, but that's because the last three vendors of the top five all have one unique product.
Wrapping this up, we might write a method as follows:
def max_unique_products(products_by_vendor, n)
counts = products_by_vendor.flatten.
each_with_object(Hash.new(0)) { |p,h| h[p] += 1 }
unique_products = counts.select { |_,count| count == 1 }.keys
products_by_vendor.max_by(n) { |a| (a & unique_products).size }
end
max_unique_products(A, 5)
#=> [[:B, :G, :K], [:B, :C, :E, :F], [:J], [:A, :D], [:H, :I]]
Edit 1: I forgot you wanted the indices of the top vendors. Just change the last line of the method above to:
products_by_vendor.each_with_index.
max_by(n) { |a,_| (a & unique_products).size }.map(&:last)
or:
products_by_vendor.each_with_index.
sort_by { |a,_| -(a & unique_products).size }.first(5).map(&:last)
and you will get:
max_unique_products(A, 5)
#=> [4, 3, 10, 1, 6]
Edit 2: To make this work with Ruby v1.8.7, try this:
def max_unique_products(products_by_vendor, n)
counts = products_by_vendor.flatten.
reduce(Hash.new(0)) { |h,p| h[p] += 1; h }
unique_products = counts.select { |_,count| count == 1 }.map(&:first)
products_by_vendor.each_with_index.
sort_by { |a,_| -(a & unique_products).size }.first(5).map(&:last)
end
It works with v2.2 and I believe that all the methods existed in v1.8.7.
1. The OP originally defined A as [[C], [A, D]...]. I changed that to [[:C], [:A, :D]...] in my answer. lostcoder then changed it to [["C"], ["A", "D"]...].
Upvotes: 2