Cosi
Cosi

Reputation: 75

Intersection of two dimensional array

Is there a simple way to find the intersection of a two dimensional array? For example:

arr1 = [1,2,3,4,5]
arr2 = [5,6,7,8]
arr3 = [5]
bigarr = [arr1,arr1,arr3]

I know that it's possible to do:

intersection = arr1 & arr2 & arr3 # => 5
intersection = big_arr[0] & big_arr[1] & big_arr[2] # => 5

but the number of elements in big_arr will vary. I was wondering if there was a simple way to intersect all the elements in big_arr regardless of the number of elements.

Upvotes: 3

Views: 994

Answers (2)

Cary Swoveland
Cary Swoveland

Reputation: 110675

What do you want: a method with a pretty face or one that is first to finish line? My friend @Arup has supplied one; I'll offer the another.

Code

def heavy_lifter(a)
  wee_one = a.min_by(&:size)
  return [] if wee_one.empty?
  wee_loc = a.index(wee_one)
  counts = wee_one.each_with_object({}) { |e,h| h.update(e=>1) }
  nbr_reqd = 1
  a.each_with_index do |b,i|
    next if i == wee_loc
    b.each do |e|
      cnt = counts[e]
      case
      when cnt.nil?
        next
      when cnt == nbr_reqd
        counts[e] = cnt + 1
      when cnt < nbr_reqd
        counts.delete(e)
        return [] if counts.empty?
      end
    end
    nbr_reqd += 1
  end
  counts.keys.each { |k| counts.delete(k) if counts[k] < nbr_reqd }
  counts.keys
end

Example

a1 = [1,2,3,4,5]
a2 = [5,6,7,8]
a3 = [5]
a = [a1,a2,a3]

heavy_lifter(a)
  #=> [5]

Explanation

Here's how the method works:

  • select the smallest array (wee_one). To simplify the explanation, assume it is the first element of a.
  • convert wee_one to a counting hash, counts, where counts[e] = 1 for each element of wee_one.
  • iterate through the remaining arrays.
  • keys of counts will be removed as arrays are processed.
  • after all calculations are complete, counts.keys equals the intersection of all arrays.
  • after nbr_reqd arrays have been processed (including wee_one), counts[k] equals the number of those arrays that have been found to contain k. Obviously, if counts[k] < nbr_reqd, key k can be removed from counts (but we will not remove such keys until our attention is drawn to them, or at the end).
  • suppose we are now to process the array b at offset nbr_reqd, meaning nbr_reqd arrays have been processed (including wee_one at offset zero). For each element e of b, we obtain cnt = counts[e]. There are four possibilities:
    • cnt == nil, in which case there is nothing to be done;
    • cnt < nbr_reqd, in which case key e is removed from counts;
    • cnt == nbr_reqd, meaning e has been present in all previous arrays processed, in which case we execute counts[k] = cnt + 1; and
    • cnt == nbr_read+1, meaning e has been present in all previous arrays processed and is a duplicate of another e in b that has already been processed, in which case nothing is to be done.
  • nbr_reqd is incremented by one and the process is repeated for the next array.
  • after all arrays have been processed, all that remains is to remove each key k in counts for which counts[k] < nbr_reqd.

Cutie method

def cutie(a)
  a.reduce(:&)
end

Test data

def test(mx, *sizes)
  sizes.map { |sz| Array.new(sz) { rand(mx) } }
end

For example:

test(10,5,6,7)
  #=> [[9, 1, 5, 1, 1], [0, 8, 7, 8, 5, 0], [5, 1, 7, 6, 7, 9, 5]]

Benchmark code

require 'benchmark'

def bench(tst)
  Benchmark.bm(12) do |bm|
    bm.report 'cutie' do
      cutie(tst)
    end
    bm.report 'heavy_lifter' do
      heavy_lifter(tst)
    end
  end
end

Benchmark results

tst = test(1_000_000, 400_000, 600_000, 800_000) 

cutie(tst).size
  #=> 81929
cutie(tst).sort == heavy_lifter(tst).size
  #=> true

bench(tst)
                   user     system      total        real
cutie          1.610000   0.030000   1.640000 (  1.639736)
heavy_lifter   1.800000   0.020000   1.820000 (  1.824281)

sizes = (700_000..890_000).step(10_000).to_a
  #=> [700000, 710000, 720000, 730000, 740000,
  #    750000, 760000, 770000, 780000, 790000,
  #    800000, 810000, 820000, 830000, 840000,
  #    850000, 860000, 870000, 880000, 890000]

tst = test(1_000_000, *sizes) 

bench(tst)
                   user     system      total        real
cutie         14.090000   0.440000  14.530000 ( 14.679101)
heavy_lifter   5.830000   0.030000   5.860000 (  5.935438)

Upvotes: 0

Arup Rakshit
Arup Rakshit

Reputation: 118271

Use #reduce like

arr1 = [1,2,3,4,5]
arr2 = [5,6,7,8]
arr3 = [5]
bigarr = [arr1,arr2,arr3]
bigarr.reduce(:&) # => [5]

Upvotes: 7

Related Questions